Modifying a runnable object after it has been submitted to ExecutorService? - java

Is it possible to modify the runnable object after it has been submitted to the executor service (single thread with unbounded queue) ?
For example:
public class Test {
@Autowired
private Runner userRunner;
@Autowired
private ExecutorService executorService;
public void init() {
for (int i = 0; i < 100; ++i) {
userRunner.add("Temp" + i);
Future runnerFuture = executorService.submit(userRunner);
}
}
}
public class Runner implements Runnable {
private List<String> users = new ArrayList<>();
public void add(String user) {
users.add(user);
}
public void run() {
/* Something here to do with users*/
}
}
As you can see in the above example, if we submit a runnable object and also modify its contents inside the loop, will the first submit to the executor service use the newly added users? Assume that the run method is doing something really intensive and subsequent submits are queued.

if we submit a runnable object and modify the contents of the object too inside the loop, will the 1st submit to executor service use the newly added users.
Only if access to the users ArrayList is properly synchronized. As written, the submitting thread modifies the users field while the executor thread may be reading it, which can cause exceptions and other unpredictable results. Synchronization provides mutual exclusion, so the two threads never change or read the ArrayList at the same time, as well as memory synchronization, which ensures that one thread's modifications are seen by the other.
What you could do is to add synchronization to your example:
public void add(String user) {
synchronized (users) {
users.add(user);
}
}
...
public void run() {
synchronized (users) {
/* Something here to do with users*/
}
}
Another option would be to synchronize the list:
// iterating over this list (for, etc.) still needs an explicit synchronized (users) block
private List<String> users = Collections.synchronizedList(new ArrayList<>());
However, you'll need to manually synchronize if you are using a for loop on the list or otherwise iterating across it.
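A minimal sketch of that manual synchronization, assuming the list is wrapped as above. The class name UserRegistry and its joined() method are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UserRegistry {
    // Single calls such as add() are synchronized by the wrapper itself.
    private final List<String> users = Collections.synchronizedList(new ArrayList<>());

    void add(String user) {
        users.add(user);
    }

    // Iteration is a sequence of calls, so it has to hold the wrapper's lock
    // (the list object itself) for the whole traversal.
    String joined() {
        StringBuilder sb = new StringBuilder();
        synchronized (users) {
            for (String user : users) {
                if (sb.length() > 0) {
                    sb.append(',');
                }
                sb.append(user);
            }
        }
        return sb.toString();
    }
}
```

Without the synchronized block, a concurrent add() during the loop could throw a ConcurrentModificationException.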

The cleanest, most straightforward approach would be to call cancel on the Future, then submit a new task with the updated user list. Otherwise not only do you face visibility issues from tampering with the list across threads, but there's no way to know if you're modifying a task that's already running.
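A rough sketch of the cancel-and-resubmit idea, assuming each task takes its own immutable copy of the user list. SnapshotTask and ResubmitDemo are hypothetical names, and returning the list size stands in for the real work:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task that works on its own immutable copy of the users.
final class SnapshotTask implements Callable<Integer> {
    private final List<String> users;

    SnapshotTask(List<String> users) {
        this.users = List.copyOf(users); // later changes to the caller's list are not seen
    }

    @Override
    public Integer call() {
        return users.size(); // stand-in for the real work
    }
}

final class ResubmitDemo {
    static int run() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            List<String> users = new ArrayList<>(List.of("Temp0"));
            Future<Integer> stale = executor.submit(new SnapshotTask(users));

            users.add("Temp1");
            // cancel(false) stops the task from starting if it is still queued;
            // a task that is already running is not interrupted.
            stale.cancel(false);
            Future<Integer> fresh = executor.submit(new SnapshotTask(users));
            return fresh.get(); // the fresh task saw both users
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

Because every task owns a snapshot, no list is shared between the submitting thread and the executor thread, so no further synchronization is needed.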

Related

How to find future in Spring ThreadPoolTaskScheduler

I'm using Spring ThreadPoolTaskScheduler and I need to find and cancel future by some condition.
is it right to have a ScheduledFuture field in Runnable task and collect tasks into ArrayList? Should I use CopyOnWriteArrayList?
class Task implements Runnable {
public Task(int id) {
this.id = id;
}
private final int id;
private ScheduledFuture future;
public void setFuture(ScheduledFuture future) {
this.future = future;
}
public ScheduledFuture getFuture() {
return future;
}
public int getId() {
return this.id;
}
public void run() {
System.out.println(this.id);
}
}
@Service
@RequiredArgsConstructor
public class ServiceTest {
private final ThreadPoolTaskScheduler threadPoolTaskScheduler;
private final ArrayList<Task> tasks = new ArrayList<Task>();
@PostConstruct
public void registerTasks() {
for (int i = 0; i < 3; i++) {
Task task = new Task(i);
ScheduledFuture future = threadPoolTaskScheduler.schedule(task, new PeriodicTrigger(100));
task.setFuture(future);
tasks.add(task);
}
}
public void stopTask(int id) {
Iterator<Task> it = tasks.iterator();
while(it.hasNext()) {
Task task = it.next();
if (task.getId() == id) {
task.getFuture().cancel(false);
it.remove();
}
}
}
}
is it right to have a ScheduledFuture field in Runnable task?
From a purely technical standpoint, I don't think there is any issue in storing a ScheduledFuture in the Task instance that implements Runnable, because the instance just stores it as state information which you can use later in your code, as you have done in the stopTask() method. Also note that you are using a PeriodicTrigger, which means the task will keep being executed every 100 ms until it is cancelled.
NOTE: Please ensure that your actual run() method does not change class variable future in any way. (although this is not done in the question, your real code should also not have any such changes to the future variable from inside the run() method)
collect tasks into ArrayList? Should I use CopyOnWriteArrayList?
There is no harm in using an ArrayList here, as you are using it outside of any multi-threaded context. If this list were used from multiple threads, then you could think of using CopyOnWriteArrayList. To be precise, your ServiceTest class has a registerTasks() method which modifies the list, and as per the code shown in the question this method is not invoked from multiple threads. So there is no need for CopyOnWriteArrayList here.
Also, if needed, you can check whether the task was actually cancelled using the boolean value returned by the call to cancel(). You may want to use it to decide on any further actions.
UPDATE:
Oh well, I overlooked the modification via the iterator. I agree that the stopTask() method could be called from multiple threads if it can be invoked via a REST controller. However, in case you are thinking of using CopyOnWriteArrayList, note that its iterator does not support remove(). CopyOnWriteArrayList is also only advisable if reads greatly outnumber writes on the list; it is meant for safe traversals where most operations are reads through the iterator. I would suggest synchronizing the stopTask() method or using Collections.synchronizedList(tasks), keeping in mind that a synchronized list still needs manual synchronization while you iterate over it.
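As an alternative that avoids iterating altogether, here is a sketch that keeps the futures in a ConcurrentHashMap keyed by task id. It uses a plain ScheduledExecutorService so that it runs without Spring, and TaskRegistry is a made-up name; the futures returned by ThreadPoolTaskScheduler.schedule(...) could presumably be stored in the same kind of map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class TaskRegistry {
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private final Map<Integer, ScheduledFuture<?>> futures = new ConcurrentHashMap<>();

    // Assumes each id is registered at most once; a second call with the same id
    // would overwrite (and lose track of) the earlier future.
    void register(int id, Runnable task, long periodMillis) {
        ScheduledFuture<?> future =
                scheduler.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
        futures.put(id, future);
    }

    // remove() is atomic, so two concurrent callers cannot both cancel the same id.
    boolean stop(int id) {
        ScheduledFuture<?> future = futures.remove(id);
        return future != null && future.cancel(false);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

With this layout the Task class no longer needs a future field, and stop(id) is a constant-time lookup instead of a scan.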

How to manage threads in Spring TaskExecutor framework

I have a BlockingQueue of Runnable - I can simply execute all tasks using one of the TaskExecutor implementations, and all will be run in parallel.
However, some Runnables depend on others, which means they need to wait until another Runnable finishes before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot be run simultaneously, but if the codes differ they should be run in parallel.
In other words, all running Runnables need to have different codes, and all "duplicates" should wait.
The problem is that there's no event/method/whatsoever when a thread ends.
I can build such a notification into every Runnable, but I don't like this approach, because it will be done just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has the method afterExecute, but it needs to be implemented - Spring uses only the default implementation, and this method is ignored.
Even if I do that, it's getting complicated, because I need to track two additional collections: one with the Runnables already executing (no implementation gives access to this information) and one with those postponed because they have a duplicated code.
I like the BlockingQueue approach because there's no polling; a thread simply activates when something new is in the queue. But maybe there's a better approach to manage such dependencies between Runnables, so should I give up on BlockingQueue and use a different strategy?
If the number of different codes is not that large, the approach with a separate single-thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then instead of a single-thread executor we can use an actor (from Akka or another similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with workerActorRef.tell(job).
If you don't want to have a dependency on an actor library, you can program WorkerActor from scratch:
public class WorkerActor implements Runnable, Executor {
    private final Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
    private final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private boolean running = false; // guarded by this
    public synchronized void execute(Runnable job) {
        queue.add(job); // the queue is unbounded, so add() never blocks
        if (!running) {
            executor.execute(this); // execute this worker, not job!
            running = true;
        }
    }
    public void run() {
        for (;;) {
            Runnable work = null;
            synchronized (this) {
                work = queue.poll();
                if (work == null) {
                    running = false;
                    return;
                }
            }
            work.run();
        }
    }
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternate strategy which springs to mind is to have a separate single thread executor for each possible code. Then, when you want to submit a new Runnable you simply lookup the correct executor to use for its code and submit the job.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
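A minimal sketch of that per-code map idea, assuming jobs are plain Runnables keyed by a String code. The names KeyedSerialExecutor and KeyedDemo are made up for illustration, and this is one possible reading of the suggestion rather than the answer's own code:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class KeyedSerialExecutor {
    private final ExecutorService pool;
    // Waiting jobs per code. A key is present exactly while a job with that code is running.
    private final Map<String, Deque<Runnable>> waiting = new HashMap<>();

    KeyedSerialExecutor(ExecutorService pool) {
        this.pool = pool;
    }

    synchronized void submit(String code, Runnable job) {
        Deque<Runnable> queue = waiting.get(code);
        if (queue == null) {
            waiting.put(code, new ArrayDeque<>()); // mark the code as busy
            pool.execute(() -> runThenNext(code, job));
        } else {
            queue.add(job); // same code is running: wait in line
        }
    }

    private void runThenNext(String code, Runnable job) {
        try {
            job.run();
        } finally {
            scheduleNext(code); // only jobs with the same code are inspected
        }
    }

    private synchronized void scheduleNext(String code) {
        Runnable next = waiting.get(code).poll();
        if (next == null) {
            waiting.remove(code); // nothing left: the code is free again
        } else {
            pool.execute(() -> runThenNext(code, next));
        }
    }
}

final class KeyedDemo {
    // Submits five jobs under one code and returns the order in which they ran.
    static List<Integer> runInOrder() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        KeyedSerialExecutor keyed = new KeyedSerialExecutor(pool);
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(5);
        for (int i = 0; i < 5; i++) {
            int n = i;
            keyed.submit("same-code", () -> {
                order.add(n);
                done.countDown();
            });
        }
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return new ArrayList<>(order);
    }
}
```

Jobs with different codes still run in parallel on the shared pool, while a completing job only touches the queue for its own code.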

Java, Thread - Synchronized variable

How do I create a common variable between threads?
For example: Many threads sending a request to server to create users.
These users are saved in an ArrayList, but this ArrayList must be synchronized for all threads. How can I do it ?
Thanks all!
If you are going to access the list from multiple threads, you can use Collections to wrap it:
List<String> users = Collections.synchronizedList(new ArrayList<String>());
and then simply pass it in a constructor to the threads that will use it.
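A small sketch of passing the wrapped list to each thread through its constructor. UserCreator and SharedListDemo are hypothetical names, and adding a name to the list stands in for the real "create user" request:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UserCreator implements Runnable {
    private final List<String> users; // the shared, synchronized list
    private final String name;

    UserCreator(List<String> users, String name) {
        this.users = users;
        this.name = name;
    }

    @Override
    public void run() {
        users.add(name); // stand-in for the real "create user" request
    }
}

final class SharedListDemo {
    static int createUsers(int count) {
        List<String> users = Collections.synchronizedList(new ArrayList<>());
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(new UserCreator(users, "user" + i));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for every creator before reading the list
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return users.size();
    }
}
```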
I would use an ExecutorService and submit the tasks you want to perform to it. This way you don't need a synchronized collection (and possibly don't need the collection at all).
However, you can do what you suggest by creating an ArrayList wrapped with a Collections.synchronizedList() and pass this as a reference to the thread before you start it.
What you could do is something like
// can be reused for other background tasks.
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
List<Future<User>> userFutures = new ArrayList<>();
for (String userName : userNames) // userNames: placeholder for the users to create
    userFutures.add(executor.submit(new Callable<User>() {
        public User call() {
            return createUser(userName); // createUser: placeholder for your creation logic
        }
    }));
List<User> users = new ArrayList<>();
for (Future<User> userFuture : userFutures)
    users.add(userFuture.get()); // get() blocks until the task has finished
To expand on @Peter's answer, if you use an ExecutorService you can submit a Callable<User> which can return the User that was created by the task run in another thread.
Something like:
// create a thread pool with 10 background threads
ExecutorService threadPool = Executors.newFixedThreadPool(10);
List<Future<User>> futures = new ArrayList<Future<User>>();
for (String userName : userNamesToCreateCollection) {
futures.add(threadPool.submit(new MyCallable(userName)));
}
// once you submit all of the jobs, we shutdown the pool, current jobs still run
threadPool.shutdown();
// now we wait for the produced users
List<User> users = new ArrayList<User>();
for (Future<User> future : futures) {
// this waits for the job to complete and gets the User created
// it also throws some exceptions that need to be caught/logged
users.add(future.get());
}
...
private static class MyCallable implements Callable<User> {
private String userName;
public MyCallable(String userName) {
this.userName = userName;
}
public User call() {
// create the user...
return user;
}
}

How to run a Listener in a different thread or do its calculation in a different thread

I'm trying to build a cache with Google Guava and want to do some calculation on the expired objects. A removalListener notifies me, if some object was removed.
How can I run the removalListener in a different thread than the main application or pass the expired object (in the simple example below, that would be the Integer 3) to a different thread that handles the calculation?
Edit: As the calculation is rather short, but happens often, I would rather not create a new thread each time (would be thousands of threads), but have one (or maybe two) who calculate all objects.
Simple example:
Cache<String, Integer> cache = CacheBuilder.newBuilder().maximumSize(100)
.expireAfterAccess(100, TimeUnit.NANOSECONDS)
.removalListener(new RemovalListener<String, Integer>() {
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
// do calculation=> this should be in another thread
}
}
})
.build();
cache.put("test1", 3);
cache.cleanUp();
To run your listener in an executor, wrap it with RemovalListeners.asynchronous.
.removalListener(asynchronous(new RemovalListener() { ... }, executor))
Create an ExecutorService using one of the Executors factory methods, and submit a new Runnable to this executor each time you need to:
private ExecutorService executor = Executors.newSingleThreadExecutor();
...
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
submitCalculation(notification.getValue());
}
}
private void submitCalculation(final Integer value) {
Runnable task = new Runnable() {
@Override
public void run() {
// call your calculation here
}
};
executor.submit(task);
}
You can create a new class and implement the java.lang.Runnable interface like so:
public class MyWorkerThread implements Runnable {
public MyWorkerThread(/*params*/) {
//set your instance variables here
//then start the thread
(new Thread(this)).start();
}
public void run() {
//do useful things
}
}
When you create a new MyWorkerThread by calling the constructor, execution is returned to the calling code as soon as the constructor is finished, and a separate thread is started that runs the code inside the run() method.
If you might want to create MyWorkerThread objects without immediately starting them off, you can remove the Thread.start() code from the constructor, and call the thread manually from the instance later like so;
MyWorkerThread t = new MyWorkerThread();
//later
(new Thread(t)).start();
Or if you want to keep a reference to the Thread object so you can do groovy things like interrupt and join, do it like so;
Thread myThread = new Thread(t);
myThread.start();
//some other time
myThread.interrupt();
You can simply create an intermediate queue for expired entities (the expiration listener will just add each expired object to this queue), say some sort of blocking in-memory queue such as ArrayBlockingQueue or LinkedBlockingDeque.
Then you can set up a thread pool of handlers (with configurable size) that will consume objects using the poll() method.
If you need a high-performance queue, I can advise a more advanced non-blocking queue implementation. You can also read more about high-performance non-blocking queues here: Add the first element to a ConcurrentLinkedQueue atomically
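A sketch of the intermediate-queue idea with a single consumer thread, using only the JDK. ExpiredValueWorker is a made-up name, onExpired is what the removal listener would call, and doubling the value stands in for the real calculation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class ExpiredValueWorker {
    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    private final List<Integer> results = new ArrayList<>(); // touched only by the worker thread
    private volatile boolean running = true;
    private final Thread worker = new Thread(this::drain, "expired-value-worker");

    void start() {
        worker.start();
    }

    // Called from the removal listener; returns immediately because the queue is unbounded.
    void onExpired(Integer value) {
        queue.offer(value);
    }

    private void drain() {
        try {
            // Keep going until stop was requested and everything queued has been handled.
            while (running || !queue.isEmpty()) {
                Integer value = queue.poll(50, TimeUnit.MILLISECONDS);
                if (value != null) {
                    results.add(value * 2); // stand-in for the real calculation
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Lets the worker finish what is queued, then returns what it computed.
    List<Integer> stopAndGetResults() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

One worker thread handles every expired value in arrival order, so thousands of removals do not create thousands of threads.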
Use an executor service to dispatch your task to a different thread.
An ExecutorService has an internal blocking queue that is used for safely publishing references between the producer and the consumer threads. The factory class Executors can be used to create ExecutorService instances with different thread management strategies.
private ExecutorService cleanupExecutor = Executors.newFixedThreadPool(CLEANUP_THREADPOOL_SIZE);
...
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
doAsyncCalculation(notification.getValue());
}
}
private void doAsyncCalculation(final Object obj) {
cleanupExecutor.submit(new Runnable() {
public void run() {
expensiveOperation(obj);
}
});
}
In doAsyncCalculation you are creating new tasks to be run but not new threads. The executor service takes care of dispatching the task to the threads in the executorService's associated thread pool.

Return values from Java Threads

I have a Java Thread like the following:
public class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
}
}
I have about 300 ids, and every couple of seconds - I fire up threads to make a call for each of the id. Eg.
for(String id: ids) {
MyThread thread = new MyThread(id);
thread.start();
}
Now, I would like to collect the results from each threads, and do a batch insert to the database, instead of making 300 database inserts every 2 seconds.
Any idea how I can accomplish this?
The canonical approach is to use a Callable and an ExecutorService. Submitting a Callable to an ExecutorService returns a (typesafe) Future from which you can get the result.
class TaskAsCallable implements Callable<Result> {
@Override
public Result call() {
return new Result(); // this is where the work is done.
}
}
ExecutorService executor = Executors.newFixedThreadPool(300);
Future<Result> task = executor.submit(new TaskAsCallable());
Result result = task.get(); // this blocks until result is ready
In your case, you probably want to use invokeAll which returns a List of Futures, or create that list yourself as you add tasks to the executor. To collect results, simply call get on each one.
If you want to collect all of the results before doing the database update, you can use the invokeAll method. This takes care of the bookkeeping that would be required if you submit tasks one at a time, like daveb suggests.
private static final ExecutorService workers = Executors.newCachedThreadPool();
...
Collection<Callable<User>> tasks = new ArrayList<Callable<User>>();
for (final String id : ids) {
tasks.add(new Callable<User>()
{
public User call()
throws Exception
{
return svc.getUser(id);
}
});
}
/* invokeAll blocks until all service requests complete,
* or a max of 10 seconds. */
List<Future<User>> results = workers.invokeAll(tasks, 10, TimeUnit.SECONDS);
for (Future<User> f : results) {
User user = f.get();
/* Add user to batch update. */
...
}
/* Commit batch. */
...
Store your result in your object. When it completes, have it drop itself into a synchronized collection (a synchronized queue comes to mind).
When you wish to collect your results to submit, grab everything from the queue and read your results from the objects. You might even have each object know how to "post" its own results to the database; this way different classes can be submitted and all handled with the exact same tiny, elegant loop.
There are lots of tools in the JDK to help with this, but it is really easy once you start thinking of your thread as a true object and not just a bunch of crap around a "run" method. Once you start thinking of objects this way programming becomes much simpler and more satisfying.
In Java 8 there is a better way of doing this using CompletableFuture. Say we have a class that gets an id from the database; for simplicity we can just return a number, as below:
static class GenerateNumber implements Supplier<Integer>{
private final int number;
GenerateNumber(int number){
this.number = number;
}
@Override
public Integer get() {
try {
TimeUnit.SECONDS.sleep(1);
}catch (InterruptedException e){
e.printStackTrace();
}
return this.number;
}
}
Now we can add each result to a concurrent collection as soon as its future is ready.
Collection<Integer> results = new ConcurrentLinkedQueue<>();
int tasks = 10;
CompletableFuture<?>[] allFutures = new CompletableFuture[tasks];
for (int i = 0; i < tasks; i++) {
int temp = i;
CompletableFuture<Integer> future = CompletableFuture.supplyAsync(()-> new GenerateNumber(temp).get(), executor);
allFutures[i] = future.thenAccept(results::add);
}
Now we can add a callback when all the futures are ready,
CompletableFuture.allOf(allFutures).thenAccept(c->{
System.out.println(results); // do something with result
});
You need to store the result in something like a singleton. This has to be properly synchronized.
This is not the best advice, as it is not a good idea to handle raw Threads.
You could create a queue or list which you pass to the threads you create, the threads add their result to the list which gets emptied by a consumer which performs the batch insert.
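A sketch of that queue-plus-consumer arrangement, assuming a BlockingQueue is shared with the worker threads and a single consumer empties it before each batch insert. BatchCollector is a made-up name and String stands in for the User type:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BatchCollector {
    private final BlockingQueue<String> results = new LinkedBlockingQueue<>();

    // Called by the worker threads as each result becomes available.
    void publish(String user) {
        results.add(user);
    }

    // Called by the single consumer, for example every two seconds: moves everything
    // queued so far into one list that can be written with a single batch insert.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        results.drainTo(batch);
        return batch;
    }
}
```

Results published while a batch is being written simply wait in the queue for the next call to drainBatch().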
The simplest approach is to pass an object to each thread (one object per thread) that will contain the result later. The main thread should keep a reference to each result object. When all threads are joined, you can use the results.
public class TopClass {
List<User> users = new ArrayList<User>();
void addUser(User user) {
synchronized(users) {
users.add(user);
}
}
void store() throws SQLException {
//storing code goes here
}
class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
addUser(user);
}
}
}
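The "one result object per thread" variant can be sketched roughly as follows. ResultHolder and JoinDemo are hypothetical names, and building a string from the id stands in for service.getUser(id):

```java
import java.util.ArrayList;
import java.util.List;

final class ResultHolder {
    String user; // written by exactly one worker thread, read only after join()
}

final class JoinDemo {
    static List<String> fetchAll(List<String> ids) {
        List<ResultHolder> holders = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (String id : ids) {
            ResultHolder holder = new ResultHolder();
            // Stand-in for: holder.user = service.getUser(id)
            Thread t = new Thread(() -> holder.user = "user-" + id);
            holders.add(holder);
            threads.add(t);
            t.start();
        }
        List<String> users = new ArrayList<>();
        for (int i = 0; i < threads.size(); i++) {
            try {
                threads.get(i).join(); // after join(), the holder's field is safely visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            users.add(holders.get(i).user);
        }
        return users; // ready for a single batch insert
    }
}
```

Since no object is shared between worker threads, no synchronized block is needed; join() provides the required visibility.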
You could make a class which extends Observable. Then your thread can call a method in the Observable class which would notify any classes registered with it by calling Observable.notifyObservers(Object). (Note that java.util.Observable and Observer have been deprecated since Java 9.)
The observing class would implement Observer and register itself with the Observable. You would then implement an update(Observable, Object) method that gets called when Observable.notifyObservers(Object) is called.
