I want to create a semaphore that prevents a certain method from being executed more than once at a time.
If any other thread requests access, it should wait until the semaphore is released:
private Map<String, Semaphore> map;
public void test() {
String hash; // prevent the long-running method from running concurrently with the same hash
if (map.containsKey(hash)) {
map.get(hash).acquire(); // wait for release of the lock
callLongRunningMethod();
} else {
Semaphore s = new Semaphore(1);
map.put(hash, s);
callLongRunningMethod();
s.release(); //any number of registered threads should continue
map.remove(hash);
}
}
Question: how can I lock the semaphore with just one thread, but release it so that any number of threads can continue as soon as released?
Some clarifications:
Imagine the long-running method is a transactional method. It looks into the database. If no entry is found, a heavy XML request is sent and the result is persisted to the DB. Further async processes might also be triggered, as this is supposed to be the "initial fetch" of the data. The method then returns the object from the DB. If the DB entry had already existed, it would return the entity directly.
Now if multiple threads access the long-running method at the same time, all of them would fetch the heavy XML (traffic, performance), and all of them would try to persist the same object into the DB (because the long-running method is transactional), causing e.g. non-unique exceptions. Plus, all of them would trigger the optional async threads.
When all but one thread are blocked, only the first is responsible for persisting the object. Then, when it has finished, all other threads will detect that the entry already exists in the DB and just serve that object.
As far as I understand, you don't need to use Semaphore here. Instead, you should use ReentrantReadWriteLock. Additionally, the test method is not thread safe.
The sample below implements your logic using a ReentrantReadWriteLock:
private ConcurrentMap<String, ReadWriteLock> map = null;
void test() {
String hash = null;
ReadWriteLock rwl = new ReentrantReadWriteLock(false);
ReadWriteLock lock = map.putIfAbsent(hash, rwl);
if (lock == null) {
lock = rwl;
}
if (lock.writeLock().tryLock()) {
try {
compute();
map.remove(hash);
} finally {
lock.writeLock().unlock();
}
} else {
lock.readLock().lock();
try {
compute();
} finally {
lock.readLock().unlock();
}
}
}
In this code, the first successful thread would acquire WriteLock while other Threads would wait for release of write lock. After release of a WriteLock all Threads waiting for release would proceed concurrently.
As far as I understand, you want to ensure that the task is executed by a single thread the first time, and then allow several threads to execute it. If so, you can rely on a CountDownLatch.
Here is how it could be implemented with CountDownLatch:
private final ConcurrentMap<String, CountDownLatch> map = new ConcurrentHashMap<>();
public void test(String hash) {
final CountDownLatch latch = new CountDownLatch(1);
final CountDownLatch previous = map.putIfAbsent(hash, latch);
if (previous == null) {
try {
callLongRunningMethod();
} finally {
map.remove(hash, latch);
latch.countDown();
}
} else {
try {
previous.await();
callLongRunningMethod();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
I think you could do that by using a very high permit number (higher than the number of threads, e.g. 2000000).
Then in the function that should run exclusively you acquire the complete number of permits (acquire(2000000)) and in the other threads you acquire only a single permit.
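A minimal sketch of that idea, assuming a single shared semaphore (not one per hash). The class name, the method names and the MAX_PERMITS constant are illustrative assumptions. The semaphore is created as fair so that the thread asking for all permits is not starved by a steady stream of single-permit acquirers.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the "many permits" idea; names are illustrative.
final class ExclusiveThenShared {
    private static final int MAX_PERMITS = 2_000_000;
    private final Semaphore semaphore = new Semaphore(MAX_PERMITS, true);

    /** Runs the task alone: it waits until every permit is free. */
    void runExclusive(Runnable task) {
        semaphore.acquireUninterruptibly(MAX_PERMITS);
        try {
            task.run();
        } finally {
            semaphore.release(MAX_PERMITS);
        }
    }

    /** Runs the task alongside any number of other shared tasks. */
    void runShared(Runnable task) {
        semaphore.acquireUninterruptibly(); // takes a single permit
        try {
            task.run();
        } finally {
            semaphore.release();
        }
    }

    int availablePermits() {
        return semaphore.availablePermits();
    }
}
```

Used this way the semaphore behaves much like a read/write lock: runExclusive plays the writer, runShared the readers.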
I think that the easiest way to do this would be using an ExecutorService and Future:
class ContainingClass {
private final ConcurrentHashMap<String, Future<?>> pending =
new ConcurrentHashMap<>();
private final ExecutorService executor;
ContainingClass(ExecutorService executor) {
this.executor = executor;
}
void test(String hash) {
Future<?> future = pending.computeIfAbsent(
hash,
k -> executor.submit(() -> longRunningMethod())); // the mapping function receives the key
// Exception handling omitted for clarity.
try {
future.get(); // Block until LRM has finished.
} finally {
// Always remove: in case of exception, this allows
// the value to be computed again.
pending.values().remove(future);
}
}
}
Removing the future from the values is thread safe because computeIfAbsent and remove are atomic: either the computeIfAbsent is run before the remove, in which case the existing future is returned, and is immediately complete; or it is run after, and a new future is added, resulting in a new call to longRunningMethod.
Note that it removes the future from pending.values(), not from pending by key. Consider the following sequence:
Thread 1 and Thread 2 are run concurrently
Thread 1 completes, and removes the value.
Thread 3 is run, adding a new future to the map
Thread 2 completes, and tries to remove the value.
If the future were removed from the map by key, Thread 2 would remove Thread 3's future, which is a different instance from Thread 2's future.
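The sequence above can be replayed on a single thread to see the difference. This is only an illustrative sketch: the class name is hypothetical and plain CompletableFuture instances stand in for the submitted futures. It also shows ConcurrentHashMap.remove(key, value), which removes the entry only if the key is still mapped to that value and so protects Thread 3's future in the same way.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

// Hypothetical single-threaded replay of the Thread 1 / 2 / 3 sequence.
final class RemoveByValueDemo {
    /** Returns true if Thread 3's future survives Thread 2's cleanup. */
    static boolean survives(boolean removeByKey) {
        ConcurrentHashMap<String, Future<?>> pending = new ConcurrentHashMap<>();
        Future<?> first = new CompletableFuture<Void>();  // shared by Thread 1 and Thread 2
        pending.put("hash", first);
        pending.values().remove(first);                   // Thread 1 completes and cleans up
        Future<?> second = new CompletableFuture<Void>(); // Thread 3 registers a new future
        pending.put("hash", second);
        if (removeByKey) {
            pending.remove("hash");                       // Thread 2 cleans up by key only
        } else {
            pending.remove("hash", first);                // Thread 2 cleans up by key and value
        }
        return pending.get("hash") == second;
    }

    public static void main(String[] args) {
        System.out.println(survives(true));  // prints false
        System.out.println(survives(false)); // prints true
    }
}
```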
This simplifies the longRunningMethod too, since it is no longer required to do the "check if I need to do anything" for the blocked threads: that the Future.get() has completed successfully in the blocking thread is sufficient to indicate that no additional work is needed.
I ended up with the following, using CountDownLatch:
private final ConcurrentMap<String, CountDownLatch> map = new ConcurrentHashMap<>();
public void run() {
boolean active = false;
CountDownLatch count = null;
try {
if (map.containsKey(hash)) {
count = map.get(hash);
count.await(60, TimeUnit.SECONDS); //wait for release or timeout
} else {
count = new CountDownLatch(1);
map.put(hash, count); //block any threads with same hash
active = true;
}
return runLongRunningTask();
} finally {
if (active) {
count.countDown(); //release
map.remove(hash, count);
}
}
}
I have to manage scheduled file replications in a system. The file replications are scheduled by users and I need to restrict the amount of system resources used during replication. The amount of time that each replication may take is not defined (i.e. a replication may be scheduled to run every 15 minutes and the previous run may still be running when the next run is due) and a replication should not be queued if it's already queued or running.
I have a scheduler that periodically checks for due file replications and, for each one, (1) adds it to a blocking queue if it is neither queued nor running or (2) drops it otherwise.
private final Object scheduledReplicationsLock = new Object();
private final BlockingQueue<Replication> replicationQueue = new LinkedBlockingQueue<>();
private final Set<Long> queuedReplicationIds = new HashSet<>();
private final Set<Long> runningReplicationIds = new HashSet<>();
public boolean add(Replication replication) {
synchronized (scheduledReplicationsLock) {
// If the replication job is either still executing or is already queued, do not add it.
if (queuedReplicationIds.contains(replication.id) || runningReplicationIds.contains(replication.id)) {
return false;
}
replicationQueue.add(replication);
queuedReplicationIds.add(replication.id);
return true;
}
}
I also have a pool of threads that waits until there is a replication in the queue and executes it. Below is the main method of each thread in the thread pool:
public void run() {
while (true) {
Replication replication = null;
synchronized (scheduledReplicationsLock) {
// This will block until a replication job is ready to be run or the current thread is interrupted.
replication = replicationQueue.take();
// Move the ID value out of the queued set and into the active set
Long replicationId = replication.getId();
queuedReplicationIds.remove(replicationId);
runningReplicationIds.add(replicationId);
}
executeReplication(replication);
}
}
This code gets into a deadlock because the first thread in the thread pool will acquire scheduledReplicationsLock and prevent the scheduler from adding replications to the queue. Moving replicationQueue.take() out of the synchronized block will eliminate the deadlock, but then it's possible that an element is removed from the queue and the hash sets are not atomically updated with it, which could cause a replication to be incorrectly dropped.
Should I use BlockingQueue.poll() and release the lock + sleep if the queue is empty instead of using BlockingQueue.take() ?
Fixes to the current solution or other solutions that meet the requirements are welcome.
wait / notify
Keeping your same control flow, instead of blocking on the BlockingQueue instance while holding the mutex lock, you can wait on notifications for the scheduledReplicationsLock forcing the worker thread to release the lock and return to the waiting pool.
Here is a reduced sample of your producer:
private final List<Replication> replicationQueue = new LinkedList<>();
private final Set<Long> runningReplicationIds = new HashSet<>();
public boolean add(Replication replication) {
synchronized (replicationQueue) {
// If the replication job is either still executing or is already queued, do not add it.
if (replicationQueue.contains(replication) || runningReplicationIds.contains(replication.id)) {
return false;
} else {
replicationQueue.add(replication);
replicationQueue.notifyAll();
return true;
}
}
}
The worker Runnable would then be updated as follows:
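public void run() {
while (true) {
Replication replication;
synchronized (replicationQueue) {
// wait() releases the monitor; loop to guard against spurious wakeups
while (replicationQueue.isEmpty()) {
try {
replicationQueue.wait();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
return;
}
}
replication = replicationQueue.remove(0);
runningReplicationIds.add(replication.getId());
}
try {
// run outside the lock so the producer and the other workers are not blocked
executeReplication(replication);
} finally {
synchronized (replicationQueue) {
runningReplicationIds.remove(replication.getId());
}
}
}
}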
BlockingQueue
Generally you are better off using the BlockingQueue to coordinate your producer and replicating worker pool.
The BlockingQueue is, as the name implies, blocking by nature and will cause the calling thread to block only if items cannot be pulled / pushed from / to the queue.
Note, however, that you will have to rework your running / enqueued state management, since you will be synchronizing only on the BlockingQueue and dropping the other constraints. Whether that is acceptable depends on the context.
This way, you would drop all the other mutexes and use the BlockingQueue as your synchronization state:
private final BlockingQueue<Replication> replicationQueue = new LinkedBlockingQueue<>();
public boolean add(Replication replication) throws InterruptedException {
// not sure this is the proper invariant to check: once a worker has taken the replication it is no longer queued, although it may still be running
if (replicationQueue.contains(replication)) {
return false;
}
// use `put` instead of `add` as this will block waiting for free space
replicationQueue.put(replication);
return true;
}
The workers will then take indefinitely from the BlockingQueue:
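public void run() {
try {
while (true) {
Replication replication = replicationQueue.take(); // blocks until an item is available
executeReplication(replication);
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt(); // restore the flag and let the worker exit
}
}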
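If the "neither queued nor running" rule from the question has to be kept, one option is to track in-flight IDs in a concurrent set next to the queue. The sketch below is an assumption-laden illustration rather than a drop-in replacement: DedupQueue and its methods are hypothetical names, and plain long IDs stand in for Replication objects. Set.add on a ConcurrentHashMap key set is atomic, so the check and the registration cannot be separated by another thread.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.LongConsumer;

// Hypothetical sketch: IDs stand in for Replication objects.
final class DedupQueue {
    private final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
    // IDs that are currently queued or running.
    private final Set<Long> inFlight = ConcurrentHashMap.newKeySet();

    /** Returns false if the ID is already queued or running. */
    boolean add(long id) {
        if (!inFlight.add(id)) {
            return false;
        }
        queue.add(id); // unbounded queue, so this does not block
        return true;
    }

    /** Worker side: waits for the next ID, runs the job, then frees the ID. */
    boolean processNext(LongConsumer job) {
        long id;
        try {
            id = queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            job.accept(id);
        } finally {
            inFlight.remove(id); // only now may the same ID be queued again
        }
        return true;
    }
}
```

No lock is held while a worker blocks in take(), so the scheduler can always add.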
You do not need any additional synchronized block if you are using a BlockingQueue.
Quote from docs (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html)
BlockingQueue implementations are thread-safe. All queuing methods achieve their effects atomically using internal locks or other forms of concurrency control.
Just use something like this:
public void run() {
try {
while (true) {
Replication replication = replicationQueue.take(); // the thread waits here for the next element in the queue
Long replicationId = replication.getId();
queuedReplicationIds.remove(replicationId);
runningReplicationIds.add(replicationId);
executeReplication(replication);
}
} catch (InterruptedException ex) {
// interrupted while waiting for the next element
Thread.currentThread().interrupt();
}
}
See the Javadoc: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/LinkedBlockingQueue.html#take()
Or you can use BlockingQueue.poll() with a timeout.
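A small sketch of the timed variant, with hypothetical names and String items standing in for replications: poll(timeout, unit) returns null when nothing arrives in time, so the worker can stop or do housekeeping instead of blocking forever.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a worker loop built on the timed poll().
final class PollWithTimeout {
    /** Handles items until the queue stays empty for timeoutMillis; returns the number handled. */
    static int drain(BlockingQueue<String> queue, long timeoutMillis) {
        int handled = 0;
        try {
            String item;
            while ((item = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS)) != null) {
                handled++; // a real worker would execute the replication here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return handled;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        System.out.println(drain(queue, 50)); // prints 2
    }
}
```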
UPD: After discussion, I extended LinkedBlockingQueue with two ConcurrentHashSets and added an afterTake() method to remove processed replications. You do not need any additional synchronization outside the queue. Just put the replication in the first thread, take it in another, and call afterTake() when the replication has finished. You need to override the other methods if you want to use them.
package ru.everytag;
import io.vertx.core.impl.ConcurrentHashSet;
import java.util.concurrent.LinkedBlockingQueue;
public class TwoPhaseBlockingQueue<E> extends LinkedBlockingQueue<E> {
private ConcurrentHashSet<E> items = new ConcurrentHashSet<>();
private ConcurrentHashSet<E> taken = new ConcurrentHashSet<>();
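@Override
public void put(E e) throws InterruptedException {
// add() returns false if the element is already present, so check and insert happen in one atomic step
if (items.add(e)) {
super.put(e);
}
}
@Override
public E take() throws InterruptedException {
E item = super.take(); // delegate to the parent; calling take() here would recurse forever
taken.add(item);
items.remove(item);
return item;
}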
public void afterTake(E e) {
if (taken.contains(e)) {
taken.remove(e);
} else if (items.contains(e)) {
throw new IllegalArgumentException("Element still in the queue");
}
}
}
In the system, I have an object, let's call it TaskProcessor. It holds a queue of tasks, which are executed by a pool of threads (ExecutorService + PriorityBlockingQueue).
The result of each task is saved in the database under some unique identifier.
A user who knows this unique identifier may check the result of the task. The result could be in the database, but the task could also still be waiting in the queue for execution. In that case, UserThread should wait until the task has finished.
Additionally, the following assumptions are valid:
Someone else could enqueue the task to TaskProcessor, and some random UserThread can access the result if it knows the unique identifier.
UserThread and TaskProcessor are in the same app. TaskProcessor contains a pool of threads, and UserThread is simply a servlet thread.
UserThread should be blocked when asking for the result if the result is not completed yet. UserThread should be unblocked immediately after TaskProcessor completes the task (or tasks) grouped by a unique identifier.
My first attempt (the naive one), was to check the result in the loop and sleep for some time:
// UserThread
while(!checkResultIsInDatabase(uniqueIdentifier))
sleep(someTime)
But I don't like it. First of all, I am wasting database connections. Moreover, if the task finishes right after the sleep starts, the user will keep waiting even though the result has just appeared.
Next attempt was based on wait/notify:
//UserThread
while (!checkResultIsInDatabase())
taskProcessor.wait()
//TaskProcessor
... some complicated calculations
this.notifyAll()
But I don't like it either. If several UserThreads use TaskProcessor, they will be woken up unnecessarily every time any task completes and, moreover, they will make unnecessary database calls.
The last attempt was based on something which I called waitingRoom:
//UserThread
Object mutex = new Object();
taskProcessor.addToWaitingRoom(uniqueIdentifier, mutex)
while (!checkResultIsInDatabase())
mutex.wait()
//TaskProcessor
... Some complicated calculations
if (uniqueIdentifierExistInWaitingRoom(taskUniqueIdentifier))
getMutexFromWaitingRoom(taskUniqueIdentifier).notify()
But it does not seem to be safe. Between the database check and wait(), the task could complete (notify() would have no effect because UserThread has not invoked wait() yet), which may end in a deadlock.
It seems that I should synchronize it somewhere, but I am afraid that would not be efficient.
Is there a way to correct any of my attempts, to make them safe and efficient? Or maybe there is some other, better way to do this?
You seem to be looking for some sort of future / promise abstraction. Take a look at CompletableFuture, available since Java 8.
CompletableFuture<Void> future = CompletableFuture.runAsync(db::yourExpensiveOperation, executor);
// best approach: attach some callback to run when the future is complete, and handle any errors
future.thenRun(this::onSuccess)
.exceptionally(ex -> { logger.error("err", ex); return null; }); // exceptionally() must return a value, here null for Void
// if you really need the current thread to block, waiting for the async result:
future.join(); // blocking! returns the result when complete or throws a CompletionException on error
You can also return a (meaningful) value from your async operation and pass the result to the callback. To make use of this, take a look at supplyAsync(), thenAccept(), thenApply(), whenComplete() and the like.
You can also combine multiple futures into one and a lot more.
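For instance, a value-returning chain might look like the sketch below. The class name and the constant 21 are placeholders for your own expensive operation; only standard CompletableFuture methods are used.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: supplyAsync produces a value, thenApply transforms it.
final class FutureChainDemo {
    static String run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> report = CompletableFuture
                    .supplyAsync(() -> 21, executor)       // stand-in for the expensive operation
                    .thenApply(n -> n * 2)                 // transform the result
                    .thenApply(n -> "result=" + n)
                    .exceptionally(ex -> "failed: " + ex); // fallback value on error
            return report.join(); // blocks until the whole chain is complete
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints result=42
    }
}
```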
I believe replacing the mutex with a CountDownLatch in the waitingRoom approach prevents the deadlock.
CountDownLatch latch = new CountDownLatch(1)
taskProcessor.addToWaitingRoom(uniqueIdentifier, latch)
while (!checkResultIsInDatabase())
// consider timed version
latch.await()
//TaskProcessor
... Some complicated calculations
if (uniqueIdentifierExistInWaitingRoom(taskUniqueIdentifier))
getLatchFromWaitingRoom(taskUniqueIdentifier).countDown()
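A more complete sketch of this, under some assumptions: WaitingRoom, awaitResult and complete are hypothetical names, and a BooleanSupplier stands in for checkResultIsInDatabase(). The important part is the order: the latch is registered first and the database is checked second. A completion that slips in after the check still counts the latch down, and a CountDownLatch remembers that, so the later await() returns immediately.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the waiting room built on CountDownLatch.
final class WaitingRoom {
    private final ConcurrentHashMap<String, CountDownLatch> latches = new ConcurrentHashMap<>();

    /** UserThread side; resultInDatabase stands in for checkResultIsInDatabase(). */
    boolean awaitResult(String id, BooleanSupplier resultInDatabase, long timeoutMillis) {
        // Register first, check second: a completion after this line reaches the latch.
        CountDownLatch latch = latches.computeIfAbsent(id, k -> new CountDownLatch(1));
        if (resultInDatabase.getAsBoolean()) {
            latches.remove(id, latch); // the task finished earlier; nothing to wait for
            return true;
        }
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS)
                    && resultInDatabase.getAsBoolean();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** TaskProcessor side: call after the result has been saved. */
    void complete(String id) {
        CountDownLatch latch = latches.remove(id);
        if (latch != null) {
            latch.countDown();
        }
    }
}
```

All user threads waiting for the same identifier share one latch, so a single countDown() releases them together. A latch left behind by a timed-out waiter stays in the map until the task completes; a real implementation may want to clean that up.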
With CompletableFuture and a ConcurrentHashMap you can achieve it:
/* Server class, i.e. your TaskProcessor */
// Map of queued tasks (either pending or ongoing)
private static final ConcurrentHashMap<String, CompletableFuture<YourTaskResult>> tasks = new ConcurrentHashMap<>();
// Launch method. By default, CompletableFuture uses ForkJoinPool which implicitly enqueues tasks.
private CompletableFuture<YourTaskResult> launchTask(final String taskId) {
return tasks.computeIfAbsent(taskId, v -> CompletableFuture // return ongoing task if any, or launch a new one
.supplyAsync(() ->
doYourThing(taskId)) // get from DB or calculate or whatever
.whenCompleteAsync((integer, throwable) -> {
if (throwable != null) {
log.error("Failed task: {}", taskId, throwable);
}
tasks.remove(taskId);
})
);
}
/* Client class, i.e. your UserThread */
// Usage
YourTaskResult taskResult = taskProcessor.launchTask(taskId).get(); // block until we get a result
Any time a user asks for the result of a taskId, they will either:
enqueue a new task if they are the first to ask for this taskId; or
get the result of the ongoing task with id taskId, if someone else enqueued it first.
This is production code currently used by hundreds of users concurrently.
In our app, users ask for any given file, via a REST endpoint (every user on its own thread). Our taskIds are filenames, and our doYourThing(taskId) retrieves the file from the local filesystem or downloads it from an S3 bucket if it doesn't exist.
Obviously we don't want to download the same file more than once. With this solution I implemented, any number of users can ask for the same file at the same or different times, and the file will be downloaded exactly once. All users that asked for it while it was downloading will get it at the same time the moment it finishes downloading; all users that ask for it later, will get it instantly from the local filesystem.
Works like a charm.
Here is what I understood from the question details.
When UserThread requests the result, there are 3 possibilities:
The task has already completed, so the user thread is not blocked and gets the result directly from the DB.
The task is in the queue or executing but not yet completed, so block the user thread (up to this point there should be no DB queries) and, just after the task completes (its result must be saved in the DB at this point), unblock the user thread (which can now query the DB for the result).
No task was ever submitted for the uniqueIdentifier the user requested; in this case the DB returns an empty result.
Points 1 and 3 are straightforward: the UserThread is not blocked and just queries the result from the DB.
For point 2, I have written a simple implementation of TaskProcessor. Here I have used a ConcurrentHashMap to keep the current tasks which are not yet completed. This map contains the mapping between the unique identifier and the corresponding task. I have used the computeIfPresent() method of ConcurrentHashMap (introduced in Java 8), which guarantees that the invocation is thread-safe for the same key. Below is what the Javadoc says:
If the value for the specified key is present, attempts to compute a
new mapping given the key and its current mapped value. The entire
method invocation is performed atomically. Some attempted update
operations on this map by other threads may be blocked while
computation is in progress, so the computation should be short and
simple, and must not attempt to update any other mappings of this map.
With this method, whenever a user thread requests a task T1 that is in the queue or executing but not yet completed, the user thread waits on that task.
When task T1 completes, all user request threads that were waiting on T1 are notified, and T1 is then removed from the map above.
The other classes referenced in the code below are available at this link.
TaskProcessor.java:
import java.util.Map;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiFunction;
public class TaskProcessor implements ITaskProcessor {
//This map will contain all the tasks which are in queue and not yet completed
//If there is scenario where there may be multiple tasks corresponding to same uniqueIdentifier, in that case below map can be modified accordingly to have the list of corresponding tasks which are not completed yet
private final Map<String, Task> taskInProgresssByUniqueIdentifierMap = new ConcurrentHashMap<>();
private final int QUEUE_SIZE = 100;
private final BlockingQueue<Task> taskQueue = new ArrayBlockingQueue<Task>(QUEUE_SIZE);
private final TaskRunner taskRunner = new TaskRunner();
private Executor executor;
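private final AtomicBoolean isStarted = new AtomicBoolean(true); // initialized to avoid a null reference in start() and enqueueTask(); assumes the processor begins in the running state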
private final DBManager dbManager = new DBManager();
@Override
public void start() {
executor = Executors.newCachedThreadPool();
while(isStarted.get()) {
try {
Task task = taskQueue.take();
executeTaskInSeperateThread(task);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}
private void executeTaskInSeperateThread(Task task) {
executor.execute(() -> {
taskRunner.execute(task, new ITaskProgressListener() {
@Override
public void onTaskCompletion(TaskResult taskResult) {
task.setCompleted(true);
//TODO: we can also propagate the taskResult to waiting users, Implement it if it is required.
notifyAllWaitingUsers(task);
}
@Override
public void onTaskFailure(Exception e) {
notifyAllWaitingUsers(task);
}
});
});
}
private void notifyAllWaitingUsers(Task task) {
taskInProgresssByUniqueIdentifierMap.computeIfPresent(task.getUniqueIdentifier(), new BiFunction<String, Task, Task>() {
@Override
public Task apply(String s, Task task) {
synchronized (task) {
task.notifyAll();
}
return null;
}
});
}
//User thread
@Override
public ITaskResult getTaskResult(String uniqueIdentifier) {
TaskResult result = null;
Task task = taskInProgresssByUniqueIdentifierMap.computeIfPresent(uniqueIdentifier, new BiFunction<String, Task, Task>() {
@Override
public Task apply(String s, Task task) {
synchronized (task) {
try {
//
task.wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
return task;
}
});
//If task is null, it means the task was not in the queue, so we directly query the DB for the task result
if(task != null && !task.isCompleted()) {
return null; // Handle this condition gracefully, If task is not completed, it means there was some exception
}
ITaskResult taskResult = getResultFromDB(uniqueIdentifier); // At this point the result must be already saved in DB if the corresponding task has been processed ever.
return taskResult;
}
private ITaskResult getResultFromDB(String uniqueIdentifier) {
return dbManager.getTaskResult(uniqueIdentifier);
}
//Other thread
@Override
public void enqueueTask(Task task) {
if(isStarted.get()) {
taskInProgresssByUniqueIdentifierMap.putIfAbsent(task.getUniqueIdentifier(), task);
taskQueue.offer(task);
}
}
@Override
public void stop() {
isStarted.compareAndSet(true, false);
}
}
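One caveat about waiting inside computeIfPresent(): the Javadoc quoted above warns that other updates may be blocked while the remapping function runs, so a wait() inside it risks blocking the very computeIfPresent() call in notifyAllWaitingUsers() that is supposed to deliver the notification. A hedged alternative sketch is shown below, with hypothetical names and a simplified Task: the user thread looks the task up with a plain get() and waits on the task's own monitor outside any map operation, guarded by a completed flag so that a notification sent before the wait is not lost.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: wait on the task itself, outside any map operation.
final class GuardedTaskWait {
    static final class Task {
        private boolean completed; // guarded by this task's monitor

        synchronized void markCompleted() {
            completed = true;
            notifyAll();
        }

        /** Returns true if the task completed within the timeout. */
        synchronized boolean awaitCompleted(long timeoutMillis) throws InterruptedException {
            long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
            while (!completed) { // re-check the flag after every wakeup
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false;
                }
                wait(remainingMs);
            }
            return true;
        }
    }

    private final ConcurrentHashMap<String, Task> inProgress = new ConcurrentHashMap<>();

    void enqueue(String id) {
        inProgress.putIfAbsent(id, new Task());
    }

    /** Called by the processor after the result has been saved. */
    void finish(String id) {
        Task task = inProgress.remove(id);
        if (task != null) {
            task.markCompleted();
        }
    }

    /** Returns true when the caller may query the database for the result. */
    boolean waitFor(String id, long timeoutMillis) {
        Task task = inProgress.get(id); // plain lookup: nothing in the map is locked while waiting
        if (task == null) {
            return true; // not in progress
        }
        try {
            return task.awaitCompleted(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```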
Let me know in comments if you have any queries.
Thanks.
I have a BlockingQueue of Runnable - I can simply execute all tasks using one of TaskExecutor implementations, and all will be run in parallel.
However, some Runnables depend on others: they need to wait until another Runnable finishes before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot run simultaneously, but if the codes differ they should run in parallel.
In other words, all running Runnables need to have different codes; all "duplicates" should wait.
The problem is that there's no event or method that signals when a thread ends.
I can build such a notification into every Runnable, but I don't like this approach, because it will be sent just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has the method afterExecute, but it needs to be implemented; Spring uses only the default implementation, and this method is ignored.
Even if I do that, it's getting complicated, because I need to track two additional collections: with Runnables already executing (no implementation gives access to this information) and with those postponed because they have duplicated code.
I like the BlockingQueue approach because there's no polling, thread simply activate when something new is in the queue. But maybe there's a better approach to manage such dependencies between Runnables, so I should give up with BlockingQueue and use different strategy?
If the number of different codes is not that large, the approach with a separate single thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then, instead of a single thread executor, we can use an actor (from Akka or another similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with workerActorRef.tell(job).
If you don't want to have a dependency on the actor library, you can program WorkerActor from scratch:
public class WorkerActor implements Runnable, Executor {
Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
boolean running = false;
public synchronized void execute(Runnable job) {
queue.add(job); // the queue is unbounded, so add() does not fail; put() would require handling InterruptedException
if (!running) {
executor.execute(this); // execute this worker, not job!
running = true;
}
}
public void run() {
for (;;) {
Runnable work=null;
synchronized (this) {
work = queue.poll();
if (work==null) {
running = false;
return;
}
}
work.run();
}
}
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternate strategy which springs to mind is to have a separate single thread executor for each possible code. Then, when you want to submit a new Runnable you simply lookup the correct executor to use for its code and submit the job.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
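A rough sketch of that map-based variant, under assumptions of my own: PerCodeSerializer is a hypothetical name, jobs are plain Runnables submitted together with their code, and a synchronized HashMap of per-code queues replaces the single list. A completing job only looks at the queue for its own code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;

// Hypothetical sketch: one waiting queue per code instead of one global list.
final class PerCodeSerializer {
    private final Executor pool;
    // A key is present while a job with that code is running;
    // its deque holds the jobs waiting behind the running one.
    private final Map<String, Deque<Runnable>> waiting = new HashMap<>();

    PerCodeSerializer(Executor pool) {
        this.pool = pool;
    }

    synchronized void submit(String code, Runnable job) {
        Deque<Runnable> queue = waiting.get(code);
        if (queue == null) {
            waiting.put(code, new ArrayDeque<>()); // mark the code as busy
            start(code, job);
        } else {
            queue.add(job); // same code already running: wait in line
        }
    }

    private void start(String code, Runnable job) {
        pool.execute(() -> {
            try {
                job.run();
            } finally {
                onFinished(code);
            }
        });
    }

    private synchronized void onFinished(String code) {
        Runnable next = waiting.get(code).poll();
        if (next == null) {
            waiting.remove(code); // nothing left for this code
        } else {
            start(code, next);    // hand the code over to the next job
        }
    }
}
```

Jobs with the same code run one after another in submission order, while jobs with different codes are limited only by the pool size.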
I have a Java class to handle a multithreaded subscription service. By implementing the Subscribable interface, tasks can be submitted to the service and periodically executed. A sketch of the code is seen below:
import java.util.concurrent.*;
public class Subscribtions {
private ConcurrentMap<Subscribable, Future<?>> futures = new ConcurrentHashMap<Subscribable, Future<?>>();
private ConcurrentMap<Subscribable, Integer> cacheFutures = new ConcurrentHashMap<Subscribable, Integer>();
private ScheduledExecutorService threads;
public Subscribtions() {
threads = Executors.newScheduledThreadPool(16);
}
public void subscribe(Subscribable subscription) {
Runnable runnable = getThread(subscription);
Future<?> future = threads.scheduleAtFixedRate(runnable, subscription.getInitialDelay(), subscription.getPeriod(), TimeUnit.SECONDS);
futures.put(subscription, future);
}
/*
* Only called from controller thread
*/
public void unsubscribe(Subscribable subscription) {
Future<?> future = futures.remove(subscription); //1. Might be removed by worker thread
if (future != null)
future.cancel(false);
else {
//3. Worker-thread view := cacheFutures.put() -> futures.remove()
//4. Controller-thread has seen futures.remove(), but has it seen cacheFutures.put()?
}
}
/*
* Only called from worker threads
*/
private void delay(Runnable runnable, Subscribable subscription, long delay) {
cacheFutures.put(subscription, 0); //2. Which is why it is cached first
Future<?> currentFuture = futures.remove(subscription);
if (currentFuture != null) {
currentFuture.cancel(false);
Future<?> future = threads.scheduleAtFixedRate(runnable, delay, subscription.getPeriod(), TimeUnit.SECONDS);
futures.put(subscription, future);
}
}
private Runnable getThread(Subscribable subscription) {
return new Runnable() {
public void run() {
//Do work...
boolean someCondition = true;
long someDelay = 100;
if (someCondition) {
delay(this, subscription, someDelay);
}
}
};
}
public interface Subscribable {
long getInitialDelay();
long getPeriod();
}
}
So the class permits to:
Subscribe to new tasks
Unsubscribe from existing tasks
Delay a periodically executed task
Subscriptions are added and removed by an external controlling thread, but delays are incurred only by the internal worker threads. A delay could happen, for instance, if a worker thread found no update since the last execution, or if the task only needs to run between 00:00 and 23:00.
My problem is that a worker thread may call delay() and remove its future from the ConcurrentMap, and the controller thread may concurrently call unsubscribe(). Then if the controller thread checks the ConcurrentMap before the worker thread has put in a new future, the unsubscribe() call will be lost.
There are a few possible solutions (the list may not be exhaustive):
Use a lock between the delay() and unsubscribe() methods
Same as above, but one lock per subscription
(preferred?) Use no locks, but "cache" removed futures in the delay() method
As for the third solution, since the worker-thread has established the happens-before relationship cacheFutures.put() -> futures.remove(), and the atomicity of ConcurrentMap makes the controller thread see futures.remove(), does it also see the same happens-before relationship as the worker thread? I.e. cacheFutures.put() -> futures.remove()? Or does the atomicity only hold for the futures map with updates to other variables being propagated later?
Any other comments are also welcome, especially regarding use of the volatile keyword. Should the cache map be declared volatile? Thanks!
One lock per subscription would require you to maintain yet another map, and possibly thereby to introduce additional concurrency issues. I think that would be better avoided. The same applies even more so to caching removed subscriptions, plus that affords the added risk of unwanted resource retention (and note that it's not the Futures themselves that you would need to cache, but rather the Subscribables with which they are associated).
Either way, you will need some kind of synchronization / locking. For example, in your option (3) you need to prevent an unsubscribe() for a given subscription from happening between delay() caching that subscription and removing its Future. The only way you could avoid that without some form of locking would be if you could use just one Future per subscription, kept continuously in place from the time it is enrolled by subscribe() until it is removed by unsubscribe(). Doing so is not consistent with the ability to delay an already-scheduled subscription.
As for the third solution, since the worker-thread has established the happens-before relationship cacheFutures.put() -> futures.remove(), and the atomicity of ConcurrentMap makes the controller thread see futures.remove(), does it also see the same happens-before relationship as the worker thread?
Happens-before is a relationship between actions in an execution of a program. It is not specific to any one thread's view of the execution.
Or does the atomicity only hold for the futures map with updates to other variables being propagated later?
The controller thread will always see the cacheFutures.put() performed by an invocation of delay() occurring before the futures.remove() performed by that same invocation. I don't think that helps you, though.
Should the cache-map be declared volatile?
No. That would avail nothing, because although the contents of that map change, the map itself is always the same object, and the reference to it does not change.
You could consider having subscribe(), delay(), and unsubscribe() each synchronize on the Subscribable presented. That's not what I understood you to mean about having a lock per subscription, but it is similar. It would avoid the need for a separate data structure to maintain such locks. I guess you could also build locking methods into the Subscribable interface if you want to avoid explicit synchronization.
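A minimal sketch of that suggestion, assuming the Subscribable instance itself is used as the monitor. The class name LockedSubscriptions, the isSubscribed helper, and the boolean return value of delay() are hypothetical and only there for illustration; the work to run is passed in as a plain Runnable to keep the example short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

interface Subscribable {
    long getInitialDelay();
    long getPeriod();
}

// Hypothetical class name; a sketch of "synchronize on the Subscribable".
final class LockedSubscriptions {
    private final ConcurrentMap<Subscribable, Future<?>> futures = new ConcurrentHashMap<>();
    private final ScheduledExecutorService threads;

    LockedSubscriptions(ScheduledExecutorService threads) {
        this.threads = threads;
    }

    void subscribe(Subscribable s, Runnable work) {
        synchronized (s) {
            Future<?> f = threads.scheduleAtFixedRate(
                    work, s.getInitialDelay(), s.getPeriod(), TimeUnit.SECONDS);
            Future<?> old = futures.put(s, f);
            if (old != null) old.cancel(false); // replace an earlier subscription
        }
    }

    void unsubscribe(Subscribable s) {
        synchronized (s) {
            Future<?> f = futures.remove(s);
            if (f != null) f.cancel(false);
        }
    }

    /** Returns false if the subscription was already removed, so nothing is rescheduled. */
    boolean delay(Subscribable s, Runnable work, long delay) {
        synchronized (s) {
            // remove + reschedule + put happen as one unit with respect to unsubscribe()
            Future<?> current = futures.remove(s);
            if (current == null) return false;
            current.cancel(false);
            futures.put(s, threads.scheduleAtFixedRate(
                    work, delay, s.getPeriod(), TimeUnit.SECONDS));
            return true;
        }
    }

    // Hypothetical helper, used only to observe the state.
    boolean isSubscribed(Subscribable s) {
        return futures.containsKey(s);
    }
}
```

Because the monitor is the Subscribable instance, two distinct instances that happen to be equal() would not share a lock, and any other code that synchronizes on the same object takes part in the locking as well.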
You have a ConcurrentMap but you aren't using it. Consider something along these lines:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
final class SO33555545
{
public static void main(String... argv)
throws InterruptedException
{
ScheduledExecutorService workers = Executors.newScheduledThreadPool(16);
Subscriptions sub = new Subscriptions(workers);
sub.subscribe(() -> System.out.println("Message received: A"));
sub.subscribe(() -> System.out.println("Message received: B"));
Thread.sleep(TimeUnit.SECONDS.toMillis(30));
workers.shutdown();
}
}
final class Subscriptions
{
private final ConcurrentMap<Subscribable, Task> tasks = new ConcurrentHashMap<>();
private final ScheduledExecutorService workers;
public Subscriptions(ScheduledExecutorService workers)
{
this.workers = workers;
}
void subscribe(Subscribable sub)
{
Task task = new Task(sub);
Task current = tasks.putIfAbsent(sub, task);
if (current != null)
throw new IllegalStateException("Already subscribed");
task.activate();
}
private Future<?> schedule(Subscribable sub)
{
Runnable task = () -> {
sub.invoke();
if (Math.random() < 0.25) {
System.out.println("Delaying...");
delay(sub, 5);
}
};
return workers.scheduleAtFixedRate(task, sub.getPeriod(), sub.getPeriod(), TimeUnit.SECONDS);
}
void unsubscribe(Subscribable sub)
{
Task task = tasks.remove(sub);
if (task != null)
task.cancel();
}
private void delay(Subscribable sub, long delay)
{
Task task = new Task(sub);
Task obsolete = tasks.replace(sub, task);
if (obsolete != null) {
obsolete.cancel();
task.activate();
}
}
private final class Task
{
private final FutureTask<Future<?>> future;
Task(Subscribable sub)
{
this.future = new FutureTask<>(() -> schedule(sub));
}
void activate()
{
future.run();
}
void cancel()
{
boolean interrupted = false;
while (true) {
try {
future.get().cancel(false);
break;
}
catch (ExecutionException ignore) {
ignore.printStackTrace(); /* Cancellation is unnecessary. */
break;
}
catch (InterruptedException ex) {
interrupted = true; /* Keep waiting... */
}
}
if (interrupted)
Thread.currentThread().interrupt(); /* Reset interrupt state. */
}
}
}
@FunctionalInterface
interface Subscribable
{
default long getPeriod()
{
return 4;
}
void invoke();
}
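The code above relies on three atomic ConcurrentMap operations: putIfAbsent() for subscribe, replace() for delay, and remove() for unsubscribe. As a small self-contained illustration of their return values, here is a sketch in which plain strings stand in for the Subscribable key and the Task value; the class name AtomicMapOpsDemo is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class; strings stand in for Subscribable and Task.
final class AtomicMapOpsDemo {
    private final ConcurrentMap<String, String> tasks = new ConcurrentHashMap<>();

    /** Succeeds only if no task is registered for the key yet. */
    boolean subscribe(String key, String task) {
        return tasks.putIfAbsent(key, task) == null;
    }

    /** Swaps in a new task only if the key is still mapped; returns the obsolete task or null. */
    String delay(String key, String newTask) {
        return tasks.replace(key, newTask);
    }

    /** Removes the mapping; returns the task to cancel, or null if there was none. */
    String unsubscribe(String key) {
        return tasks.remove(key);
    }

    public static void main(String[] args) {
        AtomicMapOpsDemo demo = new AtomicMapOpsDemo();
        System.out.println(demo.subscribe("A", "task-1")); // true
        System.out.println(demo.subscribe("A", "task-2")); // false: already subscribed
        System.out.println(demo.delay("A", "task-3"));     // task-1
        System.out.println(demo.unsubscribe("A"));         // task-3
        System.out.println(demo.delay("A", "task-4"));     // null: nothing is rescheduled
    }
}
```

The last call is the interesting one for the original race: once unsubscribe() has removed the mapping, a late delay() gets null back from replace() and therefore does not put a new task into the map.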
Suppose I have an ExecutorService (which can be a thread pool, so there's concurrency involved) which executes a task at various times, either periodically or in response to some other condition. The task to be executed is the following:
if this task is already in progress, do nothing (and let the previously-running task finish).
if this task is not already in progress, run Algorithm X, which can take a long time.
I'm trying to think of a way to implement this. It should be something like:
Runnable task = new Runnable() {
    final SomeObj inProgress = new SomeObj();
    @Override public void run() {
        if (inProgress.acquire())
        {
            try
            {
                algorithmX();
            }
            finally
            {
                inProgress.release();
            }
        }
    }
};
// re-use this task object whenever scheduling the task with the executor
where SomeObj is either a ReentrantLock (acquire = tryLock() and release = unlock()) or an AtomicBoolean or something, but I'm not sure which. Do I need a ReentrantLock here? (Maybe I want a non-reentrant lock in case algorithmX() causes this task to be run recursively!) Or would an AtomicBoolean be enough?
edit: for a non-reentrant lock, is this appropriate?
Runnable task = new Runnable() {
    boolean inProgress = false;
    final private Object lock = new Object();
    /** try to acquire lock: set inProgress to true,
     * return whether it was previously false
     */
    private boolean acquire() {
        synchronized(this.lock)
        {
            boolean result = !this.inProgress;
            this.inProgress = true;
            return result;
        }
    }
    /** release lock */
    private void release() {
        synchronized(this.lock)
        {
            this.inProgress = false;
        }
    }
    @Override public void run() {
        if (acquire())
        {
            // nobody else is running! let's do algorithmX()
            try
            {
                algorithmX();
            }
            finally
            {
                release();
            }
        }
        /* otherwise, we are already in the process of
         * running algorithmX(), in this thread or in another,
         * so don't do anything, just return control to the caller.
         */
    }
};
The lock implementation you suggest is weak in the sense that it would be quite easy for someone to use it improperly.
Below is a more efficient implementation that is just as easy to misuse as yours:
// requires: import java.util.concurrent.atomic.AtomicBoolean;
AtomicBoolean inProgress = new AtomicBoolean(false);
/** Returns true if we acquired the lock */
private boolean acquire() {
    return inProgress.compareAndSet(false, true);
}
/** Always release lock without determining if we in fact hold it */
private void release() {
    inProgress.set(false);
}
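To show how the AtomicBoolean guard behaves when the task is triggered again while it is already running, here is a minimal sketch that wraps it in a Runnable. The class name SkipIfRunning, the injected algorithmX Runnable, and the completedRuns counter are hypothetical; the counter exists only so the effect can be observed.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper; algorithmX is passed in as a Runnable for illustration.
final class SkipIfRunning implements Runnable {
    private final AtomicBoolean inProgress = new AtomicBoolean(false);
    private final AtomicInteger completedRuns = new AtomicInteger();
    private final Runnable algorithmX;

    SkipIfRunning(Runnable algorithmX) {
        this.algorithmX = algorithmX;
    }

    @Override
    public void run() {
        // Only the caller that flips false -> true proceeds; everyone else returns at once.
        if (!inProgress.compareAndSet(false, true)) {
            return;
        }
        try {
            algorithmX.run();
            completedRuns.incrementAndGet();
        } finally {
            // Released only by the caller that acquired it, because of the early return above.
            inProgress.set(false);
        }
    }

    int completedRuns() {
        return completedRuns.get();
    }
}
```

Since the flag carries no owner, a recursive call from inside algorithmX is skipped in the same way as a concurrent call from another thread, which is the non-reentrant behaviour asked about in the question.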
Your first bit of code looks pretty good, but if you're worried about algorithmX recursively invoking the task, I would suggest you use a java.util.concurrent.Semaphore as the synchronization object, rather than a ReentrantLock. For example:
// requires: import java.util.concurrent.Semaphore;
Runnable task = new Runnable() {
    final Semaphore lock = new Semaphore( 1 );
    @Override public void run() {
        if (lock.tryAcquire())
        {
            try
            {
                algorithmX();
            }
            finally
            {
                lock.release();
            }
        }
    }
};
Note in particular the use of tryAcquire. If acquiring the lock fails, algorithmX is not run.
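The difference between the two classes under recursive invocation can be checked in a single thread. The following sketch (class and method names are hypothetical) tries to acquire each primitive a second time while the same thread already holds it:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class comparing re-acquisition by the same thread.
final class ReentrancyDemo {

    /** Returns whether a thread that already holds a ReentrantLock can tryLock() it again. */
    static boolean reentrantLockAllowsReentry() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            boolean again = lock.tryLock();
            if (again) {
                lock.unlock(); // undo the nested acquisition
            }
            return again;
        } finally {
            lock.unlock();
        }
    }

    /** Returns whether a thread that took the only permit can tryAcquire() another one. */
    static boolean semaphoreAllowsReentry() {
        Semaphore semaphore = new Semaphore(1);
        semaphore.acquireUninterruptibly();
        try {
            boolean again = semaphore.tryAcquire();
            if (again) {
                semaphore.release();
            }
            return again;
        } finally {
            semaphore.release();
        }
    }

    public static void main(String[] args) {
        System.out.println(reentrantLockAllowsReentry()); // true
        System.out.println(semaphoreAllowsReentry());     // false
    }
}
```

With a ReentrantLock, a recursive run() on the same thread would therefore execute algorithmX again, whereas the single-permit Semaphore makes the nested call return immediately.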
ReentrantLock seems fine to me. The only situation where I'd find it worthwhile to hand-roll a lock using AtomicInteger is when algorithmX is really short, which is not your case.
I think the secret of choosing the right lock impl is this:
* if this task is already in progress, do nothing (and let the previously-running task finish).
What does "do nothing" mean in this context? Should the thread block and retry execution once the running algorithmX has finished? If so, semaphore.acquire should be used instead of tryAcquire, and the AtomicBoolean solution won't work as expected.
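The two readings of "do nothing" can be put side by side in a short sketch. The class name WaitOrSkip and both method names are hypothetical, and the interrupt handling shown is just one reasonable choice:

```java
import java.util.concurrent.Semaphore;

// Hypothetical class contrasting "skip if busy" with "wait until free".
final class WaitOrSkip {
    private final Semaphore permit = new Semaphore(1);

    /** Skip: returns false immediately if algorithmX is already in progress. */
    boolean runOrSkip(Runnable algorithmX) {
        if (!permit.tryAcquire()) {
            return false;
        }
        try {
            algorithmX.run();
            return true;
        } finally {
            permit.release();
        }
    }

    /** Wait: blocks until the running execution has finished, then runs algorithmX. */
    boolean runAfterWaiting(Runnable algorithmX) {
        try {
            permit.acquire();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        try {
            algorithmX.run();
            return true;
        } finally {
            permit.release();
        }
    }
}
```

Note that runAfterWaiting() must not be called from inside algorithmX on the same instance: the single permit is already taken, so the nested call would block forever.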