Customized ThreadPoolExecutor in Java

I am writing a customized ThreadPoolExecutor with these extra features:
If the number of threads is more than the core pool size but less than the max pool size, the queue is not full, and there are no idle threads, then create a new thread for the task.
If there are idle threads, hand an incoming task to an idle thread rather than adding it to the queue.
If all threads (up to the max pool size) are busy, then add new tasks to the queue via the reject method of the RejectedExecutionHandler.
I have overridden the execute method of the Java 1.5 version of ThreadPoolExecutor.
The new code is as follows:
public void execute(Runnable command) {
    System.out.println(" Active Count: " + getActiveCount() + " PoolSize: " + getPoolSize()
            + " Idle Count: " + (getPoolSize() - getActiveCount()) + " Queue Size: " + getQueue().size());
    if (command == null)
        throw new NullPointerException();
    for (;;) {
        if (runState != RUNNING) {
            reject(command);
            return;
        }
        if (poolSize < corePoolSize && addIfUnderCorePoolSize(command)) {
            return;
        }
        if (runState == RUNNING && (getPoolSize() - getActiveCount() != 0) && workQueue.offer(command)) {
            return;
        }
        int status = addIfUnderMaximumPoolSize(command);
        if (status > 0) // created new thread
            return;
        if (status == 0) { // failed to create thread
            reject(command);
            return;
        }
        if (workQueue.offer(command))
            return;
        // Retry if created a new thread but it is busy with another task
    }
}
The legacy code is as follows:
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    for (;;) {
        if (runState != RUNNING) {
            reject(command);
            return;
        }
        if (poolSize < corePoolSize && addIfUnderCorePoolSize(command))
            return;
        if (workQueue.offer(command))
            return;
        int status = addIfUnderMaximumPoolSize(command);
        if (status > 0) // created new thread
            return;
        if (status == 0) { // failed to create thread
            reject(command);
            return;
        }
        // Retry if created a new thread but it is busy with another task
    }
}
The problem now is that the pool correctly avoids creating new threads when threads are idle, but it also does not hand tasks to those idle threads; instead it adds them to the queue, which is not desired. We don't want a task to wait; it should be processed as soon as possible, even if that requires creating a new thread.
Please help me resolve this issue. Thanks.

If I understand the question, I believe I've found a solution to the default behavior of ThreadPoolExecutor, which I show in my answer here:
How to get the ThreadPoolExecutor to increase threads to max before queueing?
Basically, you subclass LinkedBlockingQueue so that queue.offer(...) always returns false, which makes the executor add additional threads to the pool, if necessary. If the pool is already at max threads and they are all busy, the RejectedExecutionHandler will be called. It is the handler which then does the put(...) into the queue.
See my code there.
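The gist of it, as a rough sketch (the pool bounds here are placeholders; the full version is in the linked answer):
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class GrowBeforeQueuePool {
    public static ThreadPoolExecutor create() {
        // queue that always refuses offer(), so the executor keeps adding
        // threads (up to maximumPoolSize) instead of queueing
        LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>() {
            @Override
            public boolean offer(Runnable r) {
                return false;
            }
        };
        RejectedExecutionHandler handler = new RejectedExecutionHandler() {
            public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
                try {
                    // all max-pool threads are busy: NOW actually queue the task
                    executor.getQueue().put(r);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        return new ThreadPoolExecutor(1, 50, 60, TimeUnit.SECONDS, queue, handler);
    }
}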

As far as I understand the three features you described, I think an ExecutorService would do much more than what you are currently trying to do: an Executor that provides methods to manage termination and methods that can produce a Future for tracking the progress of one or more asynchronous tasks, especially with:
1. Cached thread pool: creates as many threads as it needs to execute tasks in parallel, and reuses previously created threads once they become available.
2. Fixed thread pool: provides a pool with a fixed number of threads. If no thread is available for a task, the task waits in a queue until another task finishes.
Check out this article for a detailed explanation and a nice example.
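As a minimal illustration of the two factory methods (the pool size and the task are arbitrary):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolKinds {
    public static void main(String[] args) {
        // cached pool: creates threads on demand, reuses idle ones,
        // and discards threads that stay idle for 60 seconds
        ExecutorService cached = Executors.newCachedThreadPool();

        // fixed pool: exactly 10 threads; extra tasks wait in an unbounded queue
        ExecutorService fixed = Executors.newFixedThreadPool(10);

        Runnable task = new Runnable() {
            public void run() {
                System.out.println(Thread.currentThread().getName() + " working");
            }
        };
        cached.submit(task);
        fixed.submit(task);

        cached.shutdown();
        fixed.shutdown();
    }
}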


ThreadPool getActiveCount() vs getPoolSize()

Although this topic has been discussed broadly in other posts, I want to present my use case and clarify it, so apologies if I am wasting anyone's time. I have the following Runnable implementation: basically an infinitely running thread, unless java.lang.Error gets thrown by the business logic.
public void run() {
    while (true) {
        try {
            // business logic
        } catch (Exception ex) {
        }
    }
}
I have about 30 of the above threads started from an ExecutorService:
private final ExecutorService normalPriorityExecutorService = Executors.newFixedThreadPool(30);

for (int i = 0; i < 30; i++) {
    normalPriorityExecutorService.submit(runnable); // the Runnable above
}
I want to check, and kill the JVM process, if the active thread count on this ExecutorService becomes zero:
if (normalPriorityExecutorService instanceof ThreadPoolExecutor
        && ((ThreadPoolExecutor) normalPriorityExecutorService).getActiveCount() == 0) {
    log.error("No Normal Priority response listeners available. Shutting down App!");
    System.exit(1);
}
From my reading, since these Runnables run infinitely under normal circumstances, I will have 30 active threads unless they get killed by runtime Errors.
The question is: is using getActiveCount() the right approach for my use case? By the way, when I tried using getPoolSize() instead of getActiveCount(), I did not get the right behavior while testing (I forcefully threw an Error to kill a specific thread) and the pool size still remained thirty.
Since you never use the thread pool as a pool, using a thread pool is overkill. Just create a thread group and start your threads.
private final ThreadGroup normalPriorityThreadGroup = new ThreadGroup("NormalPriority");

for (int i = 0; i < 30; i++) {
    new Thread(this.normalPriorityThreadGroup, runnable, "NormalPriority-" + i).start();
}

if (this.normalPriorityThreadGroup.activeCount() == 0) {
    log.error("No Normal Priority response listeners available. Shutting down App!");
    System.exit(1);
}

Inserting millions of rows into a DB using multithreading

I am trying to insert millions of rows into a database. I am using a ThreadPoolExecutor for this purpose: I create a batch for every 9000 records and send each batch to a thread. I have fixed the thread pool size to 20; when the size increases beyond that, it starts failing. How can I check how many threads are available in the ThreadPoolExecutor, and how can I wait until the thread pool has free threads?
Here is my code; please tell me if I am doing something wrong.
int threadCount = 10;
ThreadPoolExecutor threadPool = (ThreadPoolExecutor) Executors.newFixedThreadPool(threadCount);
int i = 0;
StringBuffer sb = new StringBuffer();
sb.append("BEGIN BATCH");
sb.append(System.lineSeparator());
int cnt = metaData.getColumnCount();
while (rs.next()) {
    String query = "INSERT INTO " + table + " (" + columnslist.get(1) + ")VALUES(" + i;
    for (int j = 1; j <= cnt; j++) {
        if (metaData.getColumnTypeName(j).contains("int") || metaData.getColumnTypeName(j).contains("number")) {
            query += "," + rs.getInt(j);
        } else if (metaData.getColumnTypeName(j).contains("varchar") || metaData.getColumnTypeName(j).contains("date") || metaData.getColumnTypeName(j).contains("getTimestamp")) {
            query += ",'" + parseColumnData(rs.getString(j)) + "'";
        } else {
            query += ",'" + parseColumnData(rs.getString(j)) + "'";
        }
    }
    query += ");";
    sb.append(query);
    sb.append(System.lineSeparator());
    if (i % 9000 == 0) {
        sb.append("APPLY BATCH");
        System.out.println(threadPool.getActiveCount());
        Thread t = new Thread(new ExcecuteTask(sb.toString(), session));
        threadPool.execute(t);
        sb.setLength(0);
        sb.append("BEGIN BATCH");
        sb.append(System.lineSeparator());
    }
    i++;
}
sb.append("APPLY BATCH");
Thread t = new Thread(new ExcecuteTask(sb.toString(), session));
threadPool.execute(t);
sb.setLength(0);
threadPool.shutdown();
while (threadPool.getTaskCount() != threadPool.getCompletedTaskCount()) {
}
System.out.println(table + " Loaded successfully");
public class ExcecuteTask implements Runnable {
    private String sb;
    private Session session;

    public ExcecuteTask(String s, Session session) {
        sb = s;
        this.session = session;
    }

    public void run() {
        session.executeAsync(sb.toString());
    }
}
You can find the approximate number of active threads in the ThreadPoolExecutor by calling the getActiveCount method on it. However you shouldn't need to.
From the Java documentation for Executors.newFixedThreadPool
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available. If any thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks. The threads in the pool will exist until it is explicitly shutdown.
So you should be able to keep submitting tasks to the thread pool and they will be picked up and run as threads become available.
I also note that you are wrapping your tasks in Thread objects before submitting them to the thread pool which is not necessary.
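For example, a sketch of how the relevant lines could look instead; shutdown() followed by awaitTermination() also replaces the spin on getTaskCount() (the one-hour timeout is an arbitrary placeholder):
// submit the Runnable directly; the pool supplies and reuses its own threads
threadPool.execute(new ExcecuteTask(sb.toString(), session));

// ... after the loop has submitted everything:
threadPool.shutdown(); // no new tasks accepted; queued tasks still run
try {
    if (!threadPool.awaitTermination(1, TimeUnit.HOURS)) {
        System.err.println("Batches did not finish in time");
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
System.out.println(table + " Loaded successfully");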

Semaphore-implemented Producer-Consumer oriented thread pool

I am currently working on an educational assignment in which I have to implement a thread-safe thread pool using only semaphores.
During the assignment I must not use synchronized, wait, notify, sleep, or any thread-safe APIs.
Firstly, without getting too much into the code: I have implemented a thread-safe queue (no two threads can enqueue/dequeue at the same time); I have also tested the problem with ConcurrentLinkedQueue and the problem persists.
The design itself:
Shared:
Tasks semaphore = 0
Available semaphore = 0
Tasks_Queue queue
Available_Queue queue
Worker Threads:
Blocked semaphore = 0
General Info:
Only the Manager (a single thread) can dequeue from Tasks_Queue and Available_Queue.
Only App-Main (a single thread) can enqueue tasks in Tasks_Queue.
Each worker thread can enqueue itself in Available_Queue.
So we have a mix of a single producer, a single manager, and several consumers.
When the app first starts, each worker thread immediately enqueues itself in Available_Queue, releases the Available semaphore, and blocks acquiring its personal Blocked semaphore.
Whenever App-Main queues a new task, it releases the Tasks semaphore.
Whenever the Manager wishes to execute a new task, it must first acquire both the Tasks and Available semaphores.
My question:
During the app's runtime, the function dequeue_worker() returns a null worker, even though a semaphore is supposed to guard access to the queue so that it is never polled when there are no available worker threads.
I have "solved" the problem by calling dequeue_worker() recursively if it draws a null worker, BUT doing so should mean that an acquired semaphore permit is lost forever. Yet when I limit the number of workers to 1, that worker does not get blocked forever.
1) What is the breaking point of my original design?
2) Why doesn't my "solution" break the design even further?
Code snippets:
// only gets called by Worker threads: enqueue_worker(this);
private void enqueue_worker(Worker worker) {
    available_queue.add(worker);
    available.release();
}

// only gets called by App-Main (a single thread)
public void enqueue_task(Query query) {
    tasks_queue.add(query);
    tasks.release();
}

// only gets called by Manager (a single thread)
private Worker dequeue_worker() {
    Worker worker = null;
    try {
        available.acquire();
        worker = available_queue.poll();
    } catch (InterruptedException e) {
        // shouldn't happen
    }
    // **** the "solution": ****
    if (worker == null) worker = dequeue_worker(); // TODO: find out why
    return worker;
}

// only gets called by Manager (a single thread)
private Query dequeue_task() {
    Query query = null;
    try {
        tasks.acquire();
        query = tasks_queue.poll();
    } catch (InterruptedException e) {
        // shouldn't happen
    }
    return query;
}

// gets called by Manager (a single thread)
private void execute() { // check if a task is available and execute it
    Worker worker = dequeue_worker(); // available.down()
    Query query = dequeue_task();     // tasks.down()
    worker.setData(query);
    worker.blocked.release();
}
And finally the Worker's run() method:
while (true) { // main infinite loop
    enqueue_worker(this);
    acquire(); // blocked.acquire();
    // <C.S> (critical section)
    available.release();
}
You are calling available.release() twice: once in enqueue_worker, and a second time in the main loop.
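A sketch of the corrected loop, assuming enqueue_worker() remains the single place that releases available (re-entering the loop is what re-announces availability):
while (true) {            // main infinite loop
    enqueue_worker(this); // adds this worker to the queue AND releases available
    acquire();            // blocked.acquire(): wait until the manager assigns a task
    // critical section: execute the assigned task
    // no available.release() here; looping back to enqueue_worker()
    // is what makes this worker available again
}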

how to deal with multiple worker threads that may create new work items

I have a queue that contains work items and I want to have multiple threads work in parallel on those items. When a work item is processed it may result in new work items. The problem I have is that I can't find a solution on how to determine if I'm done. The worker looks like that:
public class Worker implements Runnable {
    public void run() {
        while (true) {
            WorkItem item = queue.nextItem();
            if (item != null) {
                processItem(item);
            } else {
                // the queue is empty, but there may still be other workers
                // processing items which may result in new work items
                // how to determine if the work is completely done?
            }
        }
    }
}
This seems like a pretty simple problem actually but I'm at a loss. What would be the best way to implement that?
thanks
clarification:
The worker threads have to terminate once none of them is processing an item, but as long as at least one of them is still working they have to wait because it may result in new work items.
What about using an ExecutorService which will allow you to wait for all tasks to finish: ExecutorService, how to wait for all tasks to finish
I'd suggest wait/notify calls. In the else case, your worker threads would wait on an object until notified by the queue that there is more work to do. When a worker creates a new item, it adds it to the queue, and the queue calls notify on the object the workers are waiting on. One of them will wake up to consume the new item.
The methods wait, notify, and notifyAll of class Object support an efficient transfer of control from one thread to another. Rather than simply "spinning" (repeatedly locking and unlocking an object to see whether some internal state has changed), which consumes computational effort, a thread can suspend itself using wait until such time as another thread awakens it using notify. This is especially appropriate in situations where threads have a producer-consumer relationship (actively cooperating on a common goal) rather than a mutual exclusion relationship (trying to avoid conflicts while sharing a common resource).
Source: Threads and Locks
I'd look at something higher level than wait/notify. It's very difficult to get right and avoid deadlocks. Have you looked at java.util.concurrent.CompletionService<V>? You could have a simpler manager thread that polls the service and take()s the results, which may or may not contain a new work item.
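A rough sketch of that manager pattern, assuming each work item is a Callable that returns any new work items it produced (all names here are illustrative, not from the question):
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class WorkManager {
    // a task processes one item and returns any new items it produced
    public interface WorkItem extends Callable<List<WorkItem>> {}

    public static void process(List<WorkItem> seed)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletionService<List<WorkItem>> cs =
                new ExecutorCompletionService<List<WorkItem>>(pool);

        int outstanding = 0;
        for (WorkItem item : seed) {
            cs.submit(item);
            outstanding++;
        }
        // only this manager thread submits and counts, so no extra locking is needed
        while (outstanding > 0) {
            List<WorkItem> produced = cs.take().get(); // blocks for the next finished task
            outstanding--;
            for (WorkItem item : produced) {
                cs.submit(item);
                outstanding++;
            }
        }
        pool.shutdown(); // no work left anywhere: we are done
    }
}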
Use a BlockingQueue containing the items to process, along with a synchronized set that keeps track of all elements currently being processed:
BlockingQueue<WorkItem> bQueue;
Set<WorkItem> beingProcessed = Collections.synchronizedSet(new HashSet<WorkItem>());

bQueue.put(workItem);
...

// the following runs over many threads in parallel
while (!(bQueue.isEmpty() && beingProcessed.isEmpty())) {
    WorkItem currentItem = bQueue.poll(50L, TimeUnit.MILLISECONDS); // null for empty queue
    if (currentItem != null) {
        beingProcessed.add(currentItem);
        processItem(currentItem); // possibly bQueue.add(newItem) is called from processItem
        beingProcessed.remove(currentItem);
    }
}
EDIT: as @Hovercraft Full Of Eels suggested, an ExecutorService is probably what you should really use. You can add new tasks as you go along. You can semi-busy wait for termination of all tasks at regular intervals with executorService.awaitTermination(time, timeUnits) and kill all your threads after that.
Here are the beginnings of a queue to solve your problem. Basically, you need to track new work and in-process work:
import java.util.LinkedList;
import java.util.List;

public class WorkQueue<T> {
    private final List<T> _newWork = new LinkedList<T>();
    private int _inProcessWork;

    public synchronized void addWork(T work) {
        _newWork.add(work);
        notifyAll();
    }

    public synchronized T startWork() throws InterruptedException {
        // wait while there is no new work but other workers are still busy,
        // since they may produce more work
        while (_newWork.isEmpty() && _inProcessWork > 0) {
            wait();
        }
        if (!_newWork.isEmpty()) {
            _inProcessWork++;
            return _newWork.remove(0);
        }
        // everything is done
        return null;
    }

    public synchronized void finishWork() {
        _inProcessWork--;
        if (_inProcessWork == 0 && _newWork.isEmpty()) {
            notifyAll();
        }
    }
}
your workers will look roughly like:
public class Worker<T> implements Runnable {
    private final WorkQueue<T> _queue;

    public Worker(WorkQueue<T> queue) {
        _queue = queue;
    }

    public void run() {
        try {
            T work;
            while ((work = _queue.startWork()) != null) {
                try {
                    // do work here...
                } finally {
                    _queue.finishWork();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
The one trick is that you need to add the first work item before you start any workers (otherwise they will all immediately exit).
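Putting it together, startup would look roughly like this (Job and firstJob are placeholders, and the Worker constructor is the one from the sketch above):
WorkQueue<Job> queue = new WorkQueue<Job>();
queue.addWork(firstJob); // seed the queue BEFORE starting any workers
for (int i = 0; i < 4; i++) {
    new Thread(new Worker<Job>(queue)).start();
}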

Is adding tasks to BlockingQueue of ThreadPoolExecutor advisable?

The JavaDoc for ThreadPoolExecutor is unclear on whether it is acceptable to add tasks directly to the BlockingQueue backing the executor. The docs say calling executor.getQueue() is "intended primarily for debugging and monitoring".
I'm constructing a ThreadPoolExecutor with my own BlockingQueue. I retain a reference to the queue so I can add tasks to it directly. The same queue is returned by getQueue() so I assume the admonition in getQueue() applies to a reference to the backing queue acquired through my means.
Example
General pattern of the code is:
int n = ...; // number of threads
queue = new ArrayBlockingQueue<Runnable>(queueSize);
executor = new ThreadPoolExecutor(n, n, 1, TimeUnit.HOURS, queue);
executor.prestartAllCoreThreads();
// ...
while (...) {
    Runnable job = ...;
    queue.offer(job, 1, TimeUnit.HOURS);
}
while (jobsOutstanding.get() != 0) {
    try {
        Thread.sleep(...);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
executor.shutdownNow();
queue.offer() vs executor.execute()
As I understand it, the typical use is to add tasks via executor.execute(). The approach in my example above has the benefit of blocking on the queue whereas execute() fails immediately if the queue is full and rejects my task. I also like that submitting jobs interacts with a blocking queue; this feels more "pure" producer-consumer to me.
An implication of adding tasks to the queue directly: I must call prestartAllCoreThreads() otherwise no worker threads are running. Assuming no other interactions with the executor, nothing will be monitoring the queue (examination of ThreadPoolExecutor source confirms this). This also implies for direct enqueuing that the ThreadPoolExecutor must additionally be configured for > 0 core threads and mustn't be configured to allow core threads to timeout.
tl;dr
Given a ThreadPoolExecutor configured as follows:
core threads > 0
core threads aren't allowed to timeout
core threads are prestarted
hold a reference to the BlockingQueue backing the executor
Is it acceptable to add tasks directly to the queue instead of calling executor.execute()?
Related
This question ( producer/consumer work queues ) is similar, but doesn't specifically cover adding to the queue directly.
One trick is to implement a custom subclass of ArrayBlockingQueue and override the offer() method to call your blocking version; then you can still use the normal code path.
queue = new ArrayBlockingQueue<Runnable>(queueSize) {
    @Override
    public boolean offer(Runnable runnable) {
        try {
            return offer(runnable, 1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            // return interrupt status to caller
            Thread.currentThread().interrupt();
        }
        return false;
    }
};
(As you can probably guess, I think calling offer() directly on the queue as your normal code path is probably a bad idea.)
If it were me, I would prefer using Executor#execute() over Queue#offer(), simply because I'm using everything else from java.util.concurrent already.
Your question is a good one, and it piqued my interest, so I took a look at the source for ThreadPoolExecutor#execute():
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    if (poolSize >= corePoolSize || !addIfUnderCorePoolSize(command)) {
        if (runState == RUNNING && workQueue.offer(command)) {
            if (runState != RUNNING || poolSize == 0)
                ensureQueuedTaskHandled(command);
        } else if (!addIfUnderMaximumPoolSize(command))
            reject(command); // is shutdown or saturated
    }
}
We can see that execute itself calls offer() on the work queue, but not before doing some nice, tasty pool manipulations if necessary. For that reason, I'd think that it'd be advisable to use execute(); not using it may (although I don't know for certain) cause the pool to operate in a non-optimal way. However, I don't think that using offer() will break the executor - it looks like tasks are pulled off the queue using the following (also from ThreadPoolExecutor):
Runnable getTask() {
    for (;;) {
        try {
            int state = runState;
            if (state > SHUTDOWN)
                return null;
            Runnable r;
            if (state == SHUTDOWN) // Help drain queue
                r = workQueue.poll();
            else if (poolSize > corePoolSize || allowCoreThreadTimeOut)
                r = workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS);
            else
                r = workQueue.take();
            if (r != null)
                return r;
            if (workerCanExit()) {
                if (runState >= SHUTDOWN) // Wake up others
                    interruptIdleWorkers();
                return null;
            }
            // Else retry
        } catch (InterruptedException ie) {
            // On interruption, re-check runState
        }
    }
}
This getTask() method is called from within a loop, so if the executor is not shutting down, it blocks until a new task is given to the queue (regardless of where it came from).
Note: Even though I've posted code snippets from source here, we can't rely on them for a definitive answer - we should only be coding to the API. We don't know how the implementation of execute() will change over time.
One can actually configure behavior of the pool when the queue is full, by specifying a RejectedExecutionHandler at instantiation. ThreadPoolExecutor defines four policies as inner classes, including AbortPolicy, DiscardOldestPolicy, DiscardPolicy, as well as my personal favorite, CallerRunsPolicy, which runs the new job in the controlling thread.
For example:
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
        nproc,                                        // core size
        nproc,                                        // max size
        60,                                           // idle timeout
        TimeUnit.SECONDS,
        new ArrayBlockingQueue<Runnable>(4096, true), // fairness = true guarantees FIFO
        new ThreadPoolExecutor.CallerRunsPolicy());   // if we have to reject a task, run it in the calling thread
The behavior desired in the question can be obtained using something like:
public class BlockingPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        try {
            executor.getQueue().put(r); // self-contained, no queue reference needed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
At some point the queue must be accessed. The best place to do so is in a self-contained RejectedExecutionHandler, which saves any code duplication or potential bugs arising from direct manipulation of the queue at the scope of the pool object. Note that the handlers included in ThreadPoolExecutor themselves use getQueue().
It's a very important question if the queue you're using is a completely different implementation from the standard in-memory LinkedBlockingQueue or ArrayBlockingQueue.
For instance, if you're implementing the producer-consumer pattern using several producers on different machines, and using a queuing mechanism based on a separate persistence subsystem (like Redis), then the question becomes relevant on its own, even if you don't want a blocking offer() like the OP does.
So the given answer, that prestartAllCoreThreads() has to be called (or prestartCoreThread() called enough times) for the worker threads to be available and running, is important enough to be stressed.
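To underline it, a minimal sketch mirroring the question's setup (n and queue as defined there); prestartAllCoreThreads() conveniently reports how many threads it started:
ThreadPoolExecutor executor = new ThreadPoolExecutor(n, n, 1, TimeUnit.HOURS, queue);
int started = executor.prestartAllCoreThreads(); // returns the number of threads started
// if started == 0 and nothing else touches the executor, nothing is
// watching the queue, and directly enqueued tasks will never run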
If required, we can also use a parking lot which separates main processing from rejected tasks:
final CountDownLatch taskCounter = new CountDownLatch(TASKCOUNT);
final List<Runnable> taskParking = new LinkedList<Runnable>();
BlockingQueue<Runnable> taskPool = new ArrayBlockingQueue<Runnable>(1);
RejectedExecutionHandler rejectionHandler = new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        System.err.println(Thread.currentThread().getName() + " -->rejection reported - adding to parking lot " + r);
        taskCounter.countDown();
        taskParking.add(r);
    }
};
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(5, 10, 1000, TimeUnit.SECONDS, taskPool, rejectionHandler);

for (int i = 0; i < TASKCOUNT; i++) {
    // main
    threadPoolExecutor.submit(getRandomTask());
}
taskCounter.await(TASKCOUNT * 5, TimeUnit.SECONDS);

System.out.println("Checking the parking lot..." + taskParking);
while (taskParking.size() > 0) {
    Runnable r = taskParking.remove(0);
    System.out.println("Running from parking lot..." + r);
    if (taskParking.size() > LIMIT) {
        waitForSometime(...);
    }
    threadPoolExecutor.submit(r);
}
threadPoolExecutor.shutdown();
