In the system, I have an object - let's call it TaskProcessor. It holds a queue of tasks, which are executed by a pool of threads (ExecutorService + PriorityBlockingQueue).
The result of each task is saved in the database under some unique identifier.
A user who knows this unique identifier may check the result of the task. The result could already be in the database, but the task could also still be waiting in the queue for execution. In that case, UserThread should wait until the task is finished.
Additionally, the following assumptions are valid:
Someone else could enqueue the task to TaskProcessor, and any UserThread can access the result if it knows the unique identifier.
UserThread and TaskProcessor are in the same app. TaskProcessor contains a pool of threads, and UserThread is simply a servlet thread.
UserThread should be blocked when it asks for a result that is not completed yet. It should be unblocked immediately after TaskProcessor completes the task (or tasks) grouped by a unique identifier.
My first attempt (the naive one), was to check the result in the loop and sleep for some time:
// UserThread
while(!checkResultIsInDatabase(uniqueIdentifier))
sleep(someTime)
But I don't like it. First of all, I am wasting database connections. Moreover, if the task finishes right after the thread goes to sleep, the user keeps waiting even though the result has just appeared.
Next attempt was based on wait/notify:
//UserThread
while (!checkResultIsInDatabase())
taskProcessor.wait()
//TaskProcessor
... some complicated calculations
this.notifyAll()
But I don't like it either. If more UserThreads use TaskProcessor, they will all be woken up unnecessarily every time some task completes, and moreover they will make unnecessary database calls.
The last attempt was based on something which I called waitingRoom:
//UserThread
Object mutex = new Object();
taskProcessor.addToWaitingRoom(uniqueIdentifier, mutex)
while (!checkResultIsInDatabase())
mutex.wait()
//TaskProcessor
... Some complicated calculations
if (uniqueIdentifierExistInWaitingRoom(taskUniqueIdentifier))
getMutexFromWaitingRoom(taskUniqueIdentifier).notify()
But it does not seem to be safe. Between the database check and wait(), the task could be completed (notify() would have no effect because UserThread has not invoked wait() yet), which may leave UserThread waiting forever.
It seems that I should synchronize it somewhere, but I am afraid that this will not be efficient.
Is there a way to correct any of my attempts to make them safe and efficient? Or maybe there is some other, better way to do this?
You seem to be looking for some sort of future / promise abstraction. Take a look at CompletableFuture, available since Java 8.
CompletableFuture<Void> future = CompletableFuture.runAsync(db::yourExpensiveOperation, executor);
// best approach: attach some callback to run when the future is complete, and handle any errors
future.thenRun(this::onSuccess)
.exceptionally(ex -> { logger.error("err", ex); return null; }); // exceptionally() must return a fallback value
// if you really need the current thread to block, waiting for the async result:
future.join(); // blocking! returns the result when complete or throws a CompletionException on error
You can also return a (meaningful) value from your async operation and pass the result to the callback. To make use of this, take a look at supplyAsync(), thenAccept(), thenApply(), whenComplete() and the like.
You can also combine multiple futures into one and a lot more.
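As a minimal sketch of returning a value from the async operation: expensiveOperation() below is a hypothetical stand-in for your own work (it is not from the question), and the pool size is arbitrary. supplyAsync() produces the value, thenApply() transforms it, and join() blocks only because the caller asks for it.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FutureValueExample {
    // Hypothetical stand-in for the expensive operation; it returns a value instead of Void.
    static int expensiveOperation() {
        return 21;
    }

    static int computeDoubled() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<Integer> future = CompletableFuture
                    .supplyAsync(FutureValueExample::expensiveOperation, executor)
                    .thenApply(value -> value * 2); // runs once the value is available
            return future.join(); // blocks the caller until the result is ready
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling FutureValueExample.computeDoubled() returns the transformed value; in a servlet you would normally attach thenAccept() instead of calling join() wherever the container allows async responses.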
I believe that replacing the mutex with a CountDownLatch in the waitingRoom approach prevents the deadlock, because a countDown() that happens before await() is not lost.
CountDownLatch latch = new CountDownLatch(1)
taskProcessor.addToWaitingRoom(uniqueIdentifier, latch)
while (!checkResultIsInDatabase())
// consider timed version
latch.await()
//TaskProcessor
... Some complicated calculations
if (uniqueIdentifierExistInWaitingRoom(taskUniqueIdentifier))
getLatchFromWaitingRoom(taskUniqueIdentifier).countDown()
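The pseudocode above could be fleshed out roughly as follows. This is only a sketch: the WaitingRoom class and its method names are made up for illustration, and it assumes one latch is shared by every thread waiting for the same identifier. The caller should register first, then check the database, then await, so that a completion between the check and the await is still observed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class WaitingRoom {
    private final ConcurrentMap<String, CountDownLatch> latches = new ConcurrentHashMap<>();

    // UserThread: call this BEFORE checking the database.
    CountDownLatch register(String uniqueIdentifier) {
        return latches.computeIfAbsent(uniqueIdentifier, k -> new CountDownLatch(1));
    }

    // TaskProcessor: call this AFTER the result has been saved.
    void taskCompleted(String uniqueIdentifier) {
        CountDownLatch latch = latches.remove(uniqueIdentifier);
        if (latch != null) {
            latch.countDown(); // releases every thread waiting on this identifier
        }
    }

    // Timed wait; returns false on timeout or interruption.
    static boolean await(CountDownLatch latch, long timeout, TimeUnit unit) {
        try {
            return latch.await(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because a latch keeps its state, a thread that calls await() after taskCompleted() returns immediately instead of hanging.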
With CompletableFuture and a ConcurrentHashMap you can achieve it:
/* Server class, i.e. your TaskProcessor */
// Map of queued tasks (either pending or ongoing)
private static final ConcurrentHashMap<String, CompletableFuture<YourTaskResult>> tasks = new ConcurrentHashMap<>();
// Launch method. By default, CompletableFuture uses ForkJoinPool which implicitly enqueues tasks.
private CompletableFuture<YourTaskResult> launchTask(final String taskId) {
return tasks.computeIfAbsent(taskId, v -> CompletableFuture // return ongoing task if any, or launch a new one
.supplyAsync(() ->
doYourThing(taskId)) // get from DB or calculate or whatever
.whenCompleteAsync((integer, throwable) -> {
if (throwable != null) {
log.error("Failed task: {}", taskId, throwable);
}
tasks.remove(taskId);
})
);
}
/* Client class, i.e. your UserThread */
// Usage
YourTaskResult taskResult = taskProcessor.launchTask(taskId).get(); // block until we get a result
Any time a user asks for the result of a taskId, they will either:
enqueue a new task if they are the first to ask for this taskId; or
get the result of the ongoing task with id taskId, if someone else enqueued it first.
This is production code currently used by hundreds of users concurrently.
In our app, users ask for any given file, via a REST endpoint (every user on its own thread). Our taskIds are filenames, and our doYourThing(taskId) retrieves the file from the local filesystem or downloads it from an S3 bucket if it doesn't exist.
Obviously we don't want to download the same file more than once. With this solution I implemented, any number of users can ask for the same file at the same or different times, and the file will be downloaded exactly once. All users that asked for it while it was downloading will get it at the same time the moment it finishes downloading; all users that ask for it later, will get it instantly from the local filesystem.
Works like a charm.
What I understood from the question details is this.
When UserThread requests a result, there are 3 possibilities:
The task has already completed, so the user thread is not blocked and gets the result directly from the DB.
The task is in the queue or executing but not yet completed, so the user thread is blocked (up to this point there should not be any DB queries). Just after the task completes (the task result must be saved in the DB at this point), the user thread is unblocked and can query the DB for the result.
No task was ever submitted for the uniqueIdentifier the user requested; in this case the DB returns an empty result.
For points 1 and 3 it's straightforward: UserThread is not blocked, it just queries the result from the DB.
For point 2, I have written a simple implementation of TaskProcessor. I use a ConcurrentHashMap to keep the current tasks which are not yet completed. This map contains the mapping between the unique identifier and the corresponding task. I use the computeIfPresent() method of ConcurrentHashMap (introduced in Java 8), which guarantees that invocations for the same key are atomic. Below is what the Javadoc says:
If the value for the specified key is present, attempts to compute a
new mapping given the key and its current mapped value. The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this map.
So with this method, whenever a user thread requests task T1 while T1 is in the queue or executing but not completed yet, the user thread waits on that task.
When task T1 completes, all the user request threads that were waiting on T1 are notified, and then T1 is removed from the map above.
The other classes referenced in the code below are available at this link.
TaskProcessor.java:
import java.util.Map;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiFunction;
public class TaskProcessor implements ITaskProcessor {
//This map will contain all the tasks which are in queue and not yet completed
//If there is scenario where there may be multiple tasks corresponding to same uniqueIdentifier, in that case below map can be modified accordingly to have the list of corresponding tasks which are not completed yet
private final Map<String, Task> taskInProgresssByUniqueIdentifierMap = new ConcurrentHashMap<>();
private final int QUEUE_SIZE = 100;
private final BlockingQueue<Task> taskQueue = new ArrayBlockingQueue<Task>(QUEUE_SIZE);
private final TaskRunner taskRunner = new TaskRunner();
private Executor executor;
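private final AtomicBoolean isStarted = new AtomicBoolean(false);
private final DBManager dbManager = new DBManager();
@Override
public void start() {
isStarted.set(true); // without this the dispatch loop below never runs
executor = Executors.newCachedThreadPool();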
while(isStarted.get()) {
try {
Task task = taskQueue.take();
executeTaskInSeperateThread(task);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
private void executeTaskInSeperateThread(Task task) {
executor.execute(() -> {
taskRunner.execute(task, new ITaskProgressListener() {
@Override
public void onTaskCompletion(TaskResult taskResult) {
task.setCompleted(true);
//TODO: we can also propagate the taskResult to waiting users, Implement it if it is required.
notifyAllWaitingUsers(task);
}
@Override
public void onTaskFailure(Exception e) {
notifyAllWaitingUsers(task);
}
});
});
}
private void notifyAllWaitingUsers(Task task) {
taskInProgresssByUniqueIdentifierMap.computeIfPresent(task.getUniqueIdentifier(), new BiFunction<String, Task, Task>() {
@Override
public Task apply(String s, Task task) {
synchronized (task) {
task.notifyAll();
}
return null;
}
});
}
//User thread
@Override
public ITaskResult getTaskResult(String uniqueIdentifier) {
TaskResult result = null;
Task task = taskInProgresssByUniqueIdentifierMap.computeIfPresent(uniqueIdentifier, new BiFunction<String, Task, Task>() {
@Override
public Task apply(String s, Task task) {
synchronized (task) {
try {
//
task.wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
return task;
}
});
//If task is null, the task was not in the queue, so we directly query the DB for the task result
if(task != null && !task.isCompleted()) {
return null; // Handle this condition gracefully, If task is not completed, it means there was some exception
}
ITaskResult taskResult = getResultFromDB(uniqueIdentifier); // At this point the result must be already saved in DB if the corresponding task has been processed ever.
return taskResult;
}
private ITaskResult getResultFromDB(String uniqueIdentifier) {
return dbManager.getTaskResult(uniqueIdentifier);
}
//Other thread
@Override
public void enqueueTask(Task task) {
if(isStarted.get()) {
taskInProgresssByUniqueIdentifierMap.putIfAbsent(task.getUniqueIdentifier(), task);
taskQueue.offer(task);
}
}
@Override
public void stop() {
isStarted.compareAndSet(true, false);
}
}
Let me know in comments if you have any queries.
Thanks.
Related
I have a BlockingQueue of Runnable. I can simply execute all tasks using one of the TaskExecutor implementations, and all will be run in parallel.
However, some Runnables depend on others, which means they need to wait until another Runnable finishes before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot be run simultaneously, but if the codes differ they should be run in parallel.
In other words, all running Runnables need to have different codes; all "duplicates" should wait.
The problem is that there is no event or method that fires when a thread ends.
I can build such a notification into every Runnable, but I don't like this approach, because it will be done just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has the method afterExecute, but it needs to be implemented - Spring uses only the default implementation, and this method is ignored.
Even if I do that, it gets complicated, because I need to track two additional collections: one with the Runnables already executing (no implementation gives access to this information) and one with those postponed because they have a duplicated code.
I like the BlockingQueue approach because there is no polling; a thread simply activates when something new is in the queue. But maybe there is a better approach to manage such dependencies between Runnables, so that I should give up on BlockingQueue and use a different strategy?
If the number of different codes is not that large, the approach with a separate single thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then, instead of a single thread executor, we can use an actor (from Akka or another similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with workerActorRef.tell(job).
If you don't want to have a dependency on the actor library, you can program WorkerActor from scratch:
public class WorkerActor implements Runnable, Executor {
Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
boolean running = false;
public synchronized void execute(Runnable job) {
queue.add(job); // the queue is unbounded, so add() cannot fail; put() would throw a checked InterruptedException
if (!running) {
running = true;
executor.execute(this); // execute this worker, not job!
}
}
public void run() {
for (;;) {
Runnable work=null;
synchronized (this) {
work = queue.poll();
if (work==null) {
running = false;
return;
}
}
work.run();
}
}
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternate strategy which springs to mind is to have a separate single thread executor for each possible code. Then, when you want to submit a new Runnable you simply lookup the correct executor to use for its code and submit the job.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
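The map-based variant mentioned above might look roughly like this. It is a sketch under assumptions, not a drop-in replacement: KeyedSubmissionService is a made-up name, jobs are plain Runnables submitted together with their code, and a key is present in the map exactly while a job with that code is running.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

class KeyedSubmissionService {
    private final Executor executor;
    // code -> jobs parked behind the job currently running for that code
    private final Map<String, Queue<Runnable>> waiting = new HashMap<>();

    KeyedSubmissionService(Executor executor) {
        this.executor = executor;
    }

    synchronized void submit(String code, Runnable job) {
        Queue<Runnable> queue = waiting.get(code);
        if (queue != null) {
            queue.add(job); // a job with this code is running: park this one
        } else {
            waiting.put(code, new ArrayDeque<>());
            start(code, job);
        }
    }

    private void start(String code, Runnable job) {
        executor.execute(() -> {
            try {
                job.run();
            } finally {
                onFinished(code);
            }
        });
    }

    // Only the queue for the finished code is inspected, not the whole backlog.
    private synchronized void onFinished(String code) {
        Runnable next = waiting.get(code).poll();
        if (next != null) {
            start(code, next);
        } else {
            waiting.remove(code);
        }
    }
}
```

Jobs with different codes go straight to the executor and run in parallel; jobs with the same code run one after another in submission order.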
I want to create a semaphore that prevents a certain method to be executed more than 1x at a time.
If any other thread requests access, it should wait until the semaphore is released:
private Map<String, Semaphore> map;
public void test() {
String hash; //prevent to run the long running method with the same hash concurrently
if (map.containsKey(hash)) {
map.get(hash).acquire(); //wait for release of the lock
callLongRunningMethod();
} else {
Semaphore s = new Semaphore(1);
map.put(hash, s);
callLongRunningMethod();
s.release(); //any number of registered threads should continue
map.remove(hash);
}
}
Question: how can I lock the semaphore with just one thread, but release it so that any number of threads can continue as soon as released?
Some clarifications:
Imagine the long running method is a transactional method. It looks into the database. If no entry is found, a heavy XML request is sent and the result is persisted to the DB. Further async processing might also be triggered, as this is supposed to be the "initial fetch" of the data. Then the object is returned from the DB (within that method). If the DB entry had existed, the method would directly return the entity.
Now if multiple threads access the long running method at the same time, all of them would fetch the heavy XML (traffic, performance), and all of them would try to persist the same object into the DB (because the long running method is transactional), causing e.g. non-unique exceptions. Plus, all of them would trigger the optional async threads.
When all but one thread are blocked, only the first is responsible for persisting the object. Then, when it has finished, all other threads will detect that the entry already exists in the DB and just serve that object.
As far as I understand, you don't need to use Semaphore here. Instead, you should use ReentrantReadWriteLock. Additionally, the test method is not thread safe.
The sample below is the implementation of your logic using RWL
private ConcurrentMap<String, ReadWriteLock> map = null;
void test() {
String hash = null;
ReadWriteLock rwl = new ReentrantReadWriteLock(false);
ReadWriteLock lock = map.putIfAbsent(hash, rwl);
if (lock == null) {
lock = rwl;
}
if (lock.writeLock().tryLock()) {
try {
compute();
map.remove(hash);
} finally {
lock.writeLock().unlock();
}
} else {
lock.readLock().lock();
try {
compute();
} finally {
lock.readLock().unlock();
}
}
}
In this code, the first successful thread would acquire WriteLock while other Threads would wait for release of write lock. After release of a WriteLock all Threads waiting for release would proceed concurrently.
As far as I understand your need, you want to ensure that the task is executed by one single thread the first time, and then allow several threads to execute it. If so, you can rely on a CountDownLatch.
Here is how it could be implemented with CountDownLatch:
private final ConcurrentMap<String, CountDownLatch> map = new ConcurrentHashMap<>();
public void test(String hash) {
final CountDownLatch latch = new CountDownLatch(1);
final CountDownLatch previous = map.putIfAbsent(hash, latch);
if (previous == null) {
try {
callLongRunningMethod();
} finally {
map.remove(hash, latch);
latch.countDown();
}
} else {
try {
previous.await();
callLongRunningMethod();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
I think you could do that by using a very high permit number (higher than the number of threads, e.g. 2000000).
Then in the function that should run exclusively you acquire the complete number of permits (acquire(2000000)) and in the other threads you acquire only a single permit.
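A sketch of that idea, with assumptions stated up front: SharedExclusiveGate and MAX_PERMITS are illustrative names, the permit count only has to exceed the number of threads that can ever run concurrently, and the semaphore is created as fair so that a thread asking for all permits is not starved by a stream of single-permit acquirers.

```java
import java.util.concurrent.Semaphore;

class SharedExclusiveGate {
    // Assumed upper bound; must be larger than the number of concurrent threads.
    private static final int MAX_PERMITS = 1_000_000;
    private final Semaphore semaphore = new Semaphore(MAX_PERMITS, true);

    // Takes every permit, so nothing else can run at the same time.
    void runExclusive(Runnable action) {
        semaphore.acquireUninterruptibly(MAX_PERMITS);
        try {
            action.run();
        } finally {
            semaphore.release(MAX_PERMITS);
        }
    }

    // Takes one permit, so any number of shared callers can run together.
    void runShared(Runnable action) {
        semaphore.acquireUninterruptibly();
        try {
            action.run();
        } finally {
            semaphore.release();
        }
    }

    int availablePermits() {
        return semaphore.availablePermits();
    }
}
```

This behaves much like a read/write lock: the exclusive caller waits for all shared callers to leave, and shared callers wait while the exclusive one is inside.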
I think that the easiest way to do this would be using an ExecutorService and Future:
class ContainingClass {
private final ConcurrentHashMap<String, Future<?>> pending =
new ConcurrentHashMap<>();
private final ExecutorService executor;
ContainingClass(ExecutorService executor) {
this.executor = executor;
}
void test(String hash) {
Future<?> future = pending.computeIfAbsent(
hash,
k -> executor.submit(() -> longRunningMethod())); // the mapping function receives the key
// Exception handling omitted for clarity.
try {
future.get(); // Block until LRM has finished.
} finally {
// Always remove: in case of exception, this allows
// the value to be computed again.
pending.values().remove(future);
}
}
}
Removing the future from the values is thread safe because computeIfAbsent and remove are atomic: either the computeIfAbsent is run before the remove, in which case the existing future is returned, and is immediately complete; or it is run after, and a new future is added, resulting in a new call to longRunningMethod.
Note that it removes the future from pending.values(), not from the pending directly: consider the following example:
Thread 1 and Thread 2 are run concurrently
Thread 1 completes, and removes the value.
Thread 3 is run, adding a new future to the map
Thread 2 completes, and tries to remove the value.
If the future were removed from the map by key, Thread 2 would remove Thread 3's future, which is a different instance from Thread 2's future.
This simplifies the longRunningMethod too, since it is no longer required to do the "check if I need to do anything" for the blocked threads: that the Future.get() has completed successfully in the blocking thread is sufficient to indicate that no additional work is needed.
I ended as follows using CountDownLatch:
private final ConcurrentMap<String, CountDownLatch> map = new ConcurrentHashMap<>();
public void run() {
boolean active = false;
CountDownLatch count = null;
try {
if (map.containsKey(hash)) {
count = map.get(hash);
count.await(60, TimeUnit.SECONDS); //wait for release or timeout
} else {
count = new CountDownLatch(1);
map.put(hash, count); //block any threads with same hash
active = true;
}
return runLongRunningTask();
} finally {
if (active) {
count.countDown(); //release
map.remove(hash, count);
}
}
}
I have a method named process in two of my classes, let's say CLASS-A and CLASS-B. In the loop below, I call the process method of both classes sequentially, one by one. It works fine, but that is not what I am looking for.
for (ModuleRegistration.ModulesHolderEntry entry : ModuleRegistration.getInstance()) {
final Map<String, String> response = entry.getPlugin().process(outputs);
// write to database
System.out.println(response);
}
Is there any way, I can call the process method of both of my classes in a multithreaded way. Meaning one thread will call process method of CLASS-A and second thread will call process method of CLASS-B.
And then after that I was thinking to write the data that is being returned by the process method into the database. So I can have one more thread for writing into database.
Below is the code that I came up with in a multithreaded way but somehow it is not running at all.
public void writeEvents(final Map<String, Object> data) {
// Three threads: one thread for the database writer, two threads for the plugin processors
final ExecutorService executor = Executors.newFixedThreadPool(3);
final BlockingQueue<Map<String, String>> queue = new LinkedBlockingQueue<Map<String, String>>();
@SuppressWarnings("unchecked")
final Map<String, String> outputs = (Map<String, String>)data.get(ModelConstants.EVENT_HOLDER);
for (final ModuleRegistration.ModulesHolderEntry entry : ModuleRegistration.getInstance()) {
executor.submit(new Runnable () {
public void run() {
final Map<String, String> response = entry.getPlugin().process(outputs);
// put the response map in the queue for the database to read
queue.offer(response);
}
});
}
Future<?> future = executor.submit(new Runnable () {
public void run() {
Map<String, String> map;
try {
while(true) {
// blocks until a map is available in the queue, or until interrupted
map = queue.take();
// write map to database
System.out.println(map);
}
} catch (InterruptedException ex) {
// IF we're catching InterruptedException then this means that future.cancel(true)
// was called, which means that the plugin processors are finished;
// process the rest of the queue and then exit
while((map = queue.poll()) != null) {
// write map to database
System.out.println(map);
}
}
}
});
// this interrupts the database thread, which sends it into its catch block
// where it processes the rest of the queue and exits
future.cancel(true); // interrupt database thread
// wait for the threads to finish
try {
executor.awaitTermination(5, TimeUnit.MINUTES);
} catch (InterruptedException e) {
//log error here
}
}
But if I remove the last line, executor.awaitTermination(5, TimeUnit.MINUTES);, then it starts running fine, and after some time I always get an error like this:
JVMDUMP006I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" - please wait.
JVMDUMP032I JVM requested Heap dump using 'S:\GitViews\Stream\goldseye\heapdump.20130827.142415.16456.0001.phd' in response to an event
JVMDUMP010I Heap dump written to S:\GitViews\Stream\goldseye\heapdump.20130827.142415.16456.0001.phd
JVMDUMP006I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" - please wait.
Can anybody help me figure out what the problem is and what I am doing wrong in the code above? If I run sequentially, I don't get any errors and it works fine.
Also, is there a better way of doing this than the way I am doing it? In future I may have more than two plugin processors.
What I am trying to do is call the process method of both of my classes in a multithreaded way and then write to the database, because my process method returns a Map.
Any help will be appreciated. I am looking for a workable example if possible. Thanks for the help.
The code snippet you pasted has a few issues; if you fix them, it should work.
1. You are using an infinite loop to fetch elements from the blocking queue and trying to break out of it using the future. This is definitely not a good approach. The problem is that your database thread might never run, because it could be cancelled by future.cancel(true) in the caller thread even before it starts. This is error-prone.
- You should run the while loop a fixed number of times (you already know how many producers there are, and therefore how many responses you are going to get).
2. Also, tasks submitted to an executor service should be independent. Here your database task depends on the execution of the other tasks. This can also lead to deadlock if your execution policy changes: for example, with a single-thread pool executor, a database thread that is scheduled first would just block waiting for producers to add data to the queue.
- A good way is to create a task that retrieves the data and updates the database in the same thread.
- Or retrieve all the responses first and then execute the database operations in parallel.
public void writeEvents(final Map data) {
final ExecutorService executor = Executors.newFixedThreadPool(3);
@SuppressWarnings("unchecked")
final Map<String, String> outputs = (Map<String, String>)data.get(ModelConstants.EVENT_HOLDER);
for (final ModuleRegistration.ModulesHolderEntry entry : ModuleRegistration.getInstance()) {
executor.submit(new Runnable () {
public void run() {
try {
final Map<String, String> response = entry.getPlugin().process(outputs);
//process the response and update database.
System.out.println(response);
} catch (Throwable e) {
//handle exception
} finally {
//clean up resources
}
}
});
}
// Orderly shutdown: already submitted tasks still run to completion, but shutdown() itself does not block.
executor.shutdown();
}
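The other option from the list, collecting all responses first, could be sketched with invokeAll(). The Plugin interface here is a hypothetical stand-in for entry.getPlugin(), and the database write is left to the caller; invokeAll() blocks until every task is done and returns the futures in the same order as the submitted tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelPlugins {
    // Hypothetical stand-in for the plugin type returned by entry.getPlugin().
    interface Plugin {
        Map<String, String> process(Map<String, String> outputs);
    }

    static List<Map<String, String>> processAll(List<Plugin> plugins, Map<String, String> outputs) {
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, plugins.size()));
        try {
            List<Callable<Map<String, String>>> calls = new ArrayList<>();
            for (Plugin plugin : plugins) {
                calls.add(() -> plugin.process(outputs));
            }
            List<Map<String, String>> responses = new ArrayList<>();
            // invokeAll waits for all plugins; no queue and no cancellation needed
            for (Future<Map<String, String>> future : executor.invokeAll(calls)) {
                responses.add(future.get());
            }
            return responses; // the caller can now write these to the database
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for plugins", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("plugin failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

Creating the pool once and reusing it across calls, rather than per call as here, also avoids piling up idle threads, which is worth checking if writeEvents is invoked frequently.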
OK, here's some code for the comments I suggested above. Disclaimer: I'm not sure whether it works or even compiles, or whether it solves the problem. But the idea is to take control of the cancellation process instead of relying on future.cancel which I suspect could cause problems.
class CheckQueue implements Runnable {
private volatile boolean cancelled = false;
public void cancel() { cancelled = true; }
public void run() {
Map<String, String> map;
try {
while(!cancelled) {
// blocks until a map is available in the queue, or until interrupted
map = queue.take();
if (cancelled) break;
// write map to database
System.out.println(map);
}
} catch (InterruptedException e) {
// interrupted: fall through and drain whatever is left in the queue
}
while((map = queue.poll()) != null) {
// write map to database
System.out.println(map);
}
}
}
CheckQueue queueChecker = new CheckQueue ();
Future<?> future = executor.submit(queueChecker);
// this asks the database thread to stop; once it sees the flag
// it processes the rest of the queue and exits
queueChecker.cancel();
I have a BlockingQueue<Runnable> (taken from ScheduledThreadPoolExecutor) in a producer-consumer environment. There is one thread adding tasks to the queue, and a thread pool executing them.
I need notifications on two events:
First item added to empty queue
Last item removed from queue
Notification = writing a message to database.
Is there any sensible way to implement that?
A simple and naïve approach would be to decorate your BlockingQueue with an implementation that simply checks the underlying queue and then posts a task to do the notification.
class NotifyingQueue<T> extends ForwardingBlockingQueue<T> implements BlockingQueue<T> {
private final Notifier notifier; // injected not null
…
@Override public void put(T element) {
if (getDelegate().isEmpty()) {
notifier.notEmptyAnymore();
}
super.put(element);
}
@Override public T poll() {
final T result = super.poll();
if ((result != null) && getDelegate().isEmpty())
notifier.nowEmpty();
return result;
}
… etc
}
This approach though has a couple of problems. While the empty -> notEmpty is pretty straightforward – particularly for a single producer case, it would be easy for two consumers to run concurrently and both see the queue go from non-empty -> empty.
If though, all you want is to be notified that the queue became empty at some time, then this will be enough as long as your notifier is your state machine, tracking emptiness and non-emptiness and notifying when it changes from one to the other:
class AtomicStateNotifier implements Notifier {
private final AtomicBoolean empty = new AtomicBoolean(true); // assume it starts empty
private final Notifier delegate; // injected not null
public void notEmptyAnymore() {
if (empty.get() && empty.compareAndSet(true, false))
delegate.notEmptyAnymore();
}
public void nowEmpty() {
if (!empty.get() && empty.compareAndSet(false, true))
delegate.nowEmpty();
}
}
This is now a thread-safe guard around an actual Notifier implementation that perhaps posts tasks to an Executor to asynchronously write the events to the database.
The design is most likely flawed, but you can do it relatively simply:
You have a single thread adding, so you can check before adding, i.e. pool.getQueue().isEmpty(). With one producer, this is safe.
Last item removed cannot be guaranteed, but you can override beforeExecute and check the queue again, possibly with a small timeout after isEmpty() returns true. The code below will probably be better off executed in afterExecute instead.
protected void beforeExecute(Thread t, Runnable r) {
if (getQueue().isEmpty()){
try{
// renamed from r: a local variable cannot redeclare the method parameter r
Runnable next = getQueue().poll(200, TimeUnit.MILLISECONDS);
if (next!=null){
execute(next);
} else{
//last message - or on after execute by Setting a threadLocal and check it there
//alternatively you may need to do so ONLY in after execute, depending on your needs
}
}catch(InterruptedException _ie){
Thread.currentThread().interrupt();
}
}
}
Something like that.
I can explain why doing notifications w/ the queue itself won't work well: imagine you add a task to be executed by the pool, the task is scheduled immediately, the queue is empty again and you will need notification.
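Since the queue alone cannot tell you that the pool has gone idle, one alternative is to count tasks that are queued or running instead of inspecting the queue. This is a different technique from the beforeExecute code above, not a corrected version of it. IdleAwareExecutor, the onIdle callback and IdleDemo are hypothetical names, and the sketch assumes the default rejection policy (which throws).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical pool that fires a callback when nothing is queued or running.
class IdleAwareExecutor extends ThreadPoolExecutor {
    private final AtomicInteger pending = new AtomicInteger(); // queued + running tasks
    private final Runnable onIdle;                             // assumed callback, runs on a worker thread

    IdleAwareExecutor(int threads, Runnable onIdle) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        this.onIdle = onIdle;
    }

    @Override public void execute(Runnable command) {
        pending.incrementAndGet(); // count before the task can possibly finish
        try {
            super.execute(command);
        } catch (RuntimeException e) {
            pending.decrementAndGet(); // rejected tasks never reach afterExecute
            throw e;
        }
    }

    @Override protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // the task that brings the count to zero was the last one outstanding
        if (pending.decrementAndGet() == 0) onIdle.run();
    }
}

class IdleDemo {
    static int countIdleNotifications() {
        AtomicInteger idleEvents = new AtomicInteger();
        IdleAwareExecutor pool = new IdleAwareExecutor(2, idleEvents::incrementAndGet);
        CountDownLatch gate = new CountDownLatch(1);
        for (int i = 0; i < 5; i++) {
            // tasks wait on the gate so that all five are submitted before any completes
            pool.execute(() -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        gate.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return idleEvents.get();
    }
}
```

With all five tasks submitted before any of them finishes, the callback fires once. If a producer submits slowly, the pool can legitimately become idle several times and the callback fires each time, which is exactly the situation described above.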
I like the ExecutorService series of classes/interfaces. I don't have to worry about threads; I take in an ExecutorService instance and use it to schedule tasks, and if I want to use an 8-thread or 16-thread pool, well, great, I don't have to worry about that at all, it just happens depending on how the ExecutorService is set up. Hurray!
But what do I do if some of my tasks need to be executed in serial order? Ideally I would ask the ExecutorService to let me schedule these tasks on a single thread, but there doesn't seem to be any means of doing so.
edit: The tasks are not known ahead of time, they are an unlimited series of tasks that are erratically generated by events of various kinds (think random / unknown arrival process: e.g. clicks of a Geiger counter, or keystroke events).
You could write an implementation of Runnable that takes some tasks and executes them serially.
Something like:
public class SerialRunner implements Runnable {
private List<Runnable> tasks;
public SerialRunner(List<Runnable> tasks) {
this.tasks = tasks;
}
public void run() {
for (Runnable task: tasks) {
task.run();
}
}
}
I'm using a separate executor created with Executors.newSingleThreadExecutor() for tasks that I want to queue up and only run one at a time.
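A minimal sketch of that setup, assuming one general-purpose pool plus one dedicated single-thread executor. SerialLaneDemo and the variable names are hypothetical. Tasks handed to the single-thread executor run one at a time in submission order, whichever thread pool handles everything else.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SerialLaneDemo {
    static List<Integer> run() {
        ExecutorService general = Executors.newFixedThreadPool(4);    // order does not matter here
        ExecutorService serial = Executors.newSingleThreadExecutor(); // one task at a time, FIFO
        List<Integer> serialOrder = new CopyOnWriteArrayList<>();

        for (int i = 0; i < 10; i++) {
            final int n = i;
            general.execute(() -> { /* independent work, may run in any order */ });
            serial.execute(() -> serialOrder.add(n)); // must happen in sequence
        }

        general.shutdown();
        serial.shutdown();
        try {
            general.awaitTermination(5, TimeUnit.SECONDS);
            serial.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return serialOrder;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```

This also works when tasks arrive unpredictably over time, since the executor's internal queue keeps them in arrival order.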
Another approach is to just compose several tasks and submit them as one:
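// Callable rather than Runnable, because call() may throw a checked Exception
executor.submit(new Callable<Void>() {
public Void call() throws Exception {
myTask1.call();
myTask2.call();
myTask3.call();
return null;
}});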
Though you might need to be more elaborate if you still want myTask2 to run even if myTask1 throws an Exception.
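One way to be "more elaborate" is to catch each step's exception separately, so that a failure does not stop the remaining steps. This is only a sketch under that assumption. CompositeTask, its failures list and CompositeDemo are hypothetical names, not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical wrapper that runs its steps in order and keeps going after a failure.
class CompositeTask implements Runnable {
    private final List<Callable<?>> steps;
    final List<Exception> failures = new ArrayList<>(); // written only by the thread executing run()

    CompositeTask(List<Callable<?>> steps) {
        this.steps = steps;
    }

    @Override public void run() {
        for (Callable<?> step : steps) {
            try {
                step.call();
            } catch (Exception e) {
                failures.add(e); // remember the failure, then continue with the next step
            }
        }
    }
}

class CompositeDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Callable<?>> steps = List.<Callable<?>>of(
                () -> log.add("task1"),
                () -> { throw new IllegalStateException("task2 failed"); },
                () -> log.add("task3"));
        CompositeTask task = new CompositeTask(steps);
        task.run(); // in real use this would be executor.submit(task)
        log.add("failures=" + task.failures.size());
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [task1, task3, failures=1]
    }
}
```

Whether to collect, log or rethrow the failures afterwards depends on what the caller needs. Collecting them is just the simplest choice for a demo.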
The way I do this is via some homegrown code that streams work onto different threads according to what the task says its key is (this can be completely arbitrary or a meaningful value). Instead of offering to a Queue and having some other thread(s) taking work off it (or lodging work with the ExecutorService in your case and having the service maintain a threadpool that takes off the internal work queues), you offer a Pipelineable (aka a task) to the PipelineManager which locates the right queue for the key of that task and sticks the task onto that queue. There is assorted other code that manages the threads taking off the queues to ensure you always have 1 and only 1 thread taking off that queue in order to guarantee that all work offered to it for the same key will be executed serially.
Using this approach you could easily set aside certain keys for n sets of serial work while round robining over the remaining keys for the work that can go in any old order or alternatively you can keep certain pipes (threads) hot by judicious key selection.
This approach is not feasible for the JDK ExecutorService implementations because they're backed by a single BlockingQueue (at least a ThreadPoolExecutor is) and hence there's no way to say "do this work in any old order but this work must be serialised". I am assuming you want that of course in order to maintain throughput, otherwise just stick everything onto a singleThreadExecutor as per danben's comment.
(edit)
What you could do instead, to maintain the same abstraction, is create your own implementation of ExecutorService that delegates to as many instances of ThreadPoolExecutor (or similar) as you need: one backed by n threads and one or more single-threaded instances. Something like the following (which is in no way working code, but hopefully you get the idea!)
public class PipeliningExecutorService<T extends Pipelineable> implements ExecutorService {
private Map<Key, ExecutorService> executors;
private ExecutorService generalPurposeExecutor;
// ExecutorService methods here, for example
@Override
public <T> Future<T> submit(Callable<T> task) {
Pipelineable pipelineableTask = convertTaskToPipelineable(task);
Key taskKey = pipelineableTask.getKey();
ExecutorService delegatedService = executors.get(taskKey);
if (delegatedService == null) delegatedService = generalPurposeExecutor;
return delegatedService.submit(task);
}
}
public interface Pipelineable<K,V> {
K getKey();
V getValue();
}
It's pretty ugly, for this purpose, that the ExecutorService methods are generic as opposed to the service itself which means you need some standard way to marshal whatever gets passed in into a Pipelineable and a fallback if you can't (e.g. throw it onto the general purpose pool).
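A simplified, runnable variant of the same idea avoids implementing the whole ExecutorService interface by passing the key explicitly alongside the task. It is a sketch, not the PipelineManager described above. KeyedExecutor, reserveSerialLane, shutdownAndWait and KeyedDemo are hypothetical names, and String keys are an assumption made to keep it short.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical router: reserved keys get a single-thread lane, everything else shares a pool.
class KeyedExecutor {
    private final Map<String, ExecutorService> lanes = new ConcurrentHashMap<>();
    private final ExecutorService general = Executors.newFixedThreadPool(4);

    // Mark a key as "must run serially" by giving it its own single-thread executor.
    void reserveSerialLane(String key) {
        lanes.computeIfAbsent(key, k -> Executors.newSingleThreadExecutor());
    }

    // Route by key; unknown or null keys fall back to the general-purpose pool.
    void execute(String key, Runnable task) {
        ExecutorService target = (key == null) ? general : lanes.getOrDefault(key, general);
        target.execute(task);
    }

    void shutdownAndWait() {
        general.shutdown();
        lanes.values().forEach(ExecutorService::shutdown);
        try {
            general.awaitTermination(5, TimeUnit.SECONDS);
            for (ExecutorService lane : lanes.values()) {
                lane.awaitTermination(5, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class KeyedDemo {
    static List<Integer> run() {
        KeyedExecutor executor = new KeyedExecutor();
        executor.reserveSerialLane("account-42");
        List<Integer> ordered = new CopyOnWriteArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int n = i;
            executor.execute("account-42", () -> ordered.add(n)); // serial lane, keeps order
            executor.execute("anything-else", () -> { });         // general pool, any order
        }
        executor.shutdownAndWait();
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [0, 1, 2, 3, 4]
    }
}
```

Passing the key as a separate argument sidesteps the marshalling problem mentioned above, at the cost of no longer being a drop-in ExecutorService.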
hmm, I thought of something, not quite sure if this will work, but maybe it will (untested code). This skips over subtleties (exception handling, cancellation, fairness to other tasks of the underlying Executor, etc.) but is maybe useful.
class SequentialExecutorWrapper implements Runnable
{
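final private ExecutorService executor;
// constructor added so that the final field is initialized
SequentialExecutorWrapper(ExecutorService executor)
{
this.executor = executor;
}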
// queue of tasks to execute in sequence
final private Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<Runnable>();
// semaphore for pop() access to the task list
final private AtomicBoolean taskInProcess = new AtomicBoolean(false);
public void submit(Runnable task)
{
// add task to the queue, try to run it now
taskQueue.offer(task);
if (!tryToRunNow())
{
// this object is running tasks on another thread
// do we need to try again or will the currently-running thread
// handle it? (depends on ordering between taskQueue.offer()
// and the tryToRunNow(), not sure if there is a problem)
}
}
public void run()
{
tryToRunNow();
}
private boolean tryToRunNow()
{
if (taskInProcess.compareAndSet(false, true))
{
// yay! I own the task queue!
try {
Runnable task = taskQueue.poll();
while (task != null)
{
task.run();
task = taskQueue.poll();
}
}
finally
{
taskInProcess.set(false);
}
return true;
}
else
{
return false;
}
}