Java, Thread - Synchronized variable - java

How do I create a common variable between threads?
For example: many threads send requests to a server to create users.
These users are saved in an ArrayList, but this ArrayList must be synchronized across all threads. How can I do that?
Thanks all!

If you are going to access the list from multiple threads, you can use Collections to wrap it:
List<String> users = Collections.synchronizedList(new ArrayList<String>());
and then simply pass it in a constructor to the threads that will use it.
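A minimal sketch of that wiring, in case it helps. The UserCreator worker and the generated user names are made up for illustration; only the Collections.synchronizedList call comes from the answer above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SharedListDemo {
    // Hypothetical worker: appends the users it "creates" to the shared list.
    static class UserCreator implements Runnable {
        private final List<String> users;
        private final int workerId;
        private final int count;

        UserCreator(List<String> users, int workerId, int count) {
            this.users = users;
            this.workerId = workerId;
            this.count = count;
        }

        @Override
        public void run() {
            for (int i = 0; i < count; i++) {
                // add() is synchronized by the wrapper, so no extra locking is needed here
                users.add("user-" + workerId + "-" + i);
            }
        }
    }

    // Starts `workers` threads that share one synchronized list and waits for all of them.
    static List<String> createUsers(int workers, int perWorker) {
        List<String> users = Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread t = new Thread(new UserCreator(users, w, perWorker));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return users;
    }

    public static void main(String[] args) {
        System.out.println(createUsers(4, 1000).size());
    }
}
```

Single calls such as add() are safe on the wrapped list; iterating over it from several threads still needs a synchronized (users) block around the loop.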

I would use an ExecutorService and submit the tasks you want to perform to it. That way you don't need a synchronized collection (and possibly don't need the collection at all).
However, you can do what you suggest by creating an ArrayList wrapped with a Collections.synchronizedList() and pass this as a reference to the thread before you start it.
What you could do is something like
// The pool can be reused for other background tasks.
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
List<Future<User>> userFutures = new ArrayList<>();
for (final String userName : userNames) { // userNames: placeholder for the users to create
userFutures.add(executor.submit(new Callable<User>() {
public User call() {
return createUser(userName); // placeholder for whatever creates the user
}
}));
}
List<User> users = new ArrayList<>();
for (Future<User> userFuture : userFutures) {
users.add(userFuture.get()); // blocks until that user has been created
}

To expand on @Peter's answer, if you use an ExecutorService you can submit a Callable<User> which can return the User that was created by the task run in another thread.
Something like:
// create a thread pool with 10 background threads
ExecutorService threadPool = Executors.newFixedThreadPool(10);
List<Future<User>> futures = new ArrayList<Future<User>>();
for (String userName : userNamesToCreateCollection) {
futures.add(threadPool.submit(new MyCallable(userName)));
}
// once all of the jobs are submitted, shut down the pool; already-submitted jobs still run
threadPool.shutdown();
// now we wait for the produced users
List<User> users = new ArrayList<User>();
for (Future<User> future : futures) {
// this waits for the job to complete and gets the User created
// it also throws some exceptions that need to be caught/logged
users.add(future.get());
}
...
private static class MyCallable implements Callable<User> {
private String userName;
public MyCallable(String userName) {
this.userName = userName;
}
public User call() {
// create the user...
return user;
}
}

Related

How to manage threads in Spring TaskExecutor framework

I have a BlockingQueue of Runnables. I can simply execute all of the tasks using one of the TaskExecutor implementations, and they will all run in parallel.
However, some Runnables depend on others: they must wait until another Runnable finishes before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot run simultaneously, but if the codes differ they should run in parallel.
In other words, all running Runnables must have different codes, and all "duplicates" should wait.
The problem is that there is no event or callback when a thread ends.
I could build such a notification into every Runnable, but I don't like this approach, because it would fire just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has an afterExecute method, but it needs to be overridden; Spring uses only the default implementation, so this method is effectively ignored.
Even if I do that, it gets complicated, because I need to track two additional collections: one with the Runnables currently executing (no implementation gives access to this information) and one with those postponed because they have a duplicate code.
I like the BlockingQueue approach because there is no polling; a thread simply wakes up when something new is in the queue. But maybe there is a better way to manage such dependencies between Runnables. Should I give up on BlockingQueue and use a different strategy?
If the number of different codes is not that large, the approach with a separate single thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then instead of a single-thread executor we can use an actor (from Akka or a similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with workerActorRef.tell(job).
If you don't want a dependency on an actor library, you can program WorkerActor from scratch:
public class WorkerActor implements Runnable, Executor {
Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
boolean running = false;
public synchronized void execute(Runnable job) {
queue.add(job); // the queue is unbounded, so add() always succeeds; put() would force handling InterruptedException
if (!running) {
executor.execute(this); // execute this worker, not job!
running = true;
}
}
public void run() {
for (;;) {
Runnable work=null;
synchronized (this) {
work = queue.poll();
if (work==null) {
running = false;
return;
}
}
work.run();
}
}
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternative strategy which springs to mind is to have a separate single-thread executor for each possible code. Then, when you want to submit a new Runnable, you simply look up the correct executor for its code and submit the job.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
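The Map-based variant mentioned in the last sentence might look roughly like this. It is only a sketch under a few assumptions: KeyedSerialExecutor and its method names are made up for illustration, it keeps a Queue per code rather than a List, and it does not deal with a pool that rejects tasks.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

// Serialises jobs that share a code while letting different codes run in parallel.
class KeyedSerialExecutor {
    private final Executor pool;
    // Waiting jobs per code; a key is present exactly while that code has a job in flight.
    private final Map<String, Queue<Runnable>> pending = new HashMap<>();

    KeyedSerialExecutor(Executor pool) {
        this.pool = pool;
    }

    synchronized void submit(String code, Runnable job) {
        Queue<Runnable> queue = pending.get(code);
        if (queue == null) {
            pending.put(code, new ArrayDeque<>()); // mark the code as busy
            start(code, job);
        } else {
            queue.add(job); // a job with this code is already in flight: wait in line
        }
    }

    private void start(String code, Runnable job) {
        pool.execute(() -> {
            try {
                job.run();
            } finally {
                runNext(code);
            }
        });
    }

    // Called when a job finishes: only jobs with the same code are examined.
    private synchronized void runNext(String code) {
        Runnable next = pending.get(code).poll();
        if (next == null) {
            pending.remove(code); // nothing waiting, so the code is idle again
        } else {
            start(code, next);
        }
    }
}
```

A completing job touches only its own code's queue, so the cost no longer grows with the total number of waiting jobs.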

Modifying a runnable object after it has been submitted to ExecutorService?

Is it possible to modify the runnable object after it has been submitted to the executor service (single thread with unbounded queue) ?
For example:
public class Test {
@Autowired
private Runner userRunner;
@Autowired
private ExecutorService executorService;
public void init() {
for (int i = 0; i < 100; ++i) {
userRunner.add("Temp" + i);
Future<?> runnerFuture = executorService.submit(userRunner);
}
}
}
public class Runner implements Runnable {
private List<String> users = new ArrayList<>();
public void add(String user) {
users.add(user);
}
public void run() {
/* Something here to do with users*/
}
}
As the example above shows, we submit a runnable object and also modify its contents inside the loop. Will the first task submitted to the executor service see the newly added users? Assume that the run method does something really intensive and that the subsequent submissions are queued.
"If we submit a runnable object and modify the contents of the object too inside the loop, will the 1st submit to executor service use the newly added users?"
Only if the users ArrayList is properly synchronized. What you are doing is modifying the users field from two different threads, which can cause exceptions and other unpredictable results. Synchronization provides mutual exclusion, so multiple threads aren't changing the ArrayList at the same time, as well as memory synchronization, which ensures that one thread's modifications are seen by the other.
What you could do is to add synchronization to your example:
public void add(String user) {
synchronized (users) {
users.add(user);
}
}
...
public void run() {
synchronized (users) {
/* Something here to do with users*/
}
}
Another option would be to synchronize the list:
// individual calls such as add() are thread-safe; iteration (for loops, etc.) still needs manual locking
private List<String> users = Collections.synchronizedList(new ArrayList<>());
However, you'll need to manually synchronize if you are using a for loop on the list or otherwise iterating across it.
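For what it's worth, here is a small sketch of what that manual synchronization could look like. SnapshotRunner and lastSeen are hypothetical names, not from the question; the idea is to copy the list while holding its lock and do the slow work on the copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotRunner implements Runnable {
    private final List<String> users = Collections.synchronizedList(new ArrayList<String>());
    private volatile int lastSeen; // how many users the most recent run() observed

    public void add(String user) {
        users.add(user); // single calls are already synchronized by the wrapper
    }

    @Override
    public void run() {
        List<String> snapshot = new ArrayList<>();
        synchronized (users) { // traversal must hold the list's own lock
            for (String user : users) {
                snapshot.add(user);
            }
        }
        lastSeen = snapshot.size();
        // ... the expensive work would go here, on `snapshot`, without holding the lock
    }

    public int lastSeen() {
        return lastSeen;
    }

    public static void main(String[] args) {
        SnapshotRunner runner = new SnapshotRunner();
        runner.add("Temp0");
        runner.add("Temp1");
        runner.run();
        System.out.println(runner.lastSeen());
    }
}
```

Each run() sees whatever had been added at the moment it took its snapshot, so a queued task may well see users added after it was submitted.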
The cleanest, most straightforward approach would be to call cancel on the Future, then submit a new task with the updated user list. Otherwise not only do you face visibility issues from tampering with the list across threads, but there's no way to know if you're modifying a task that's already running.

Synchronize on variable only when it is being updated

Use case: rotation of credentials for a datastore.
What I want:
1. When updateCredentials is called, it should wait until all threads are done fetching credentials (via the synchronized block) before updating the credentials to the new ones.
2. I do NOT want calls to doSomeQuery to make each other wait to fetch credentials. This object can be used from multiple threads, and that wait is wasteful.
Is there a method / pattern to achieve this? The code sample below achieves item 1 but not item 2.
private final Object credentialUpdate = new Object();
public void updateCredentials(String user, String pass) {
synchronized (credentialUpdate) {
this.user = user;
this.pass = pass;
}
}
public void doSomeQuery(String query) {
String curUser;
String curPass;
synchronized (credentialUpdate) {
curUser = this.user;
curPass = this.pass;
}
// execute query
}
Use java.util.concurrent.locks.ReadWriteLock and its implementation ReentrantReadWriteLock. From the Javadoc:
A ReadWriteLock maintains a pair of associated locks, one for read-only operations and one for writing. The read lock may be held simultaneously by multiple reader threads, so long as there are no writers. The write lock is exclusive.
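Applied to the question's code, that could look something like the sketch below. CredentialStore and the snapshot() helper are made-up names for illustration; the method names updateCredentials and doSomeQuery follow the question.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class CredentialStore {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private String user;
    private String pass;

    CredentialStore(String user, String pass) {
        this.user = user;
        this.pass = pass;
    }

    public void updateCredentials(String user, String pass) {
        lock.writeLock().lock(); // exclusive: waits until no reader holds the read lock
        try {
            this.user = user;
            this.pass = pass;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Hypothetical helper: returns a consistent {user, pass} pair.
    String[] snapshot() {
        lock.readLock().lock(); // shared: readers do not block each other
        try {
            return new String[] { user, pass };
        } finally {
            lock.readLock().unlock();
        }
    }

    public void doSomeQuery(String query) {
        String[] credentials = snapshot();
        String curUser = credentials[0];
        String curPass = credentials[1];
        // execute the query with curUser and curPass (omitted here)
    }

    public static void main(String[] args) {
        CredentialStore store = new CredentialStore("alice", "old-secret");
        store.updateCredentials("alice", "new-secret");
        System.out.println(store.snapshot()[0]);
    }
}
```

The lock is only held while the two fields are copied, not while the query runs, so a rotation never has to wait for a slow query.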

Multithreading best practices in java

I'm new to Java programming. I have a use case where I have to execute 2 db queries in parallel. The structure of my class is something like this:
class A {
public Object func_1() {
//executes db query1
}
public Object func_2() {
//executes db query2
}
}
Now I have to add another function, func_3, to the same class. It calls these two functions and must make sure that they execute in parallel. For this I'm using Callables and Futures. Is this the right way to use them? I'm storing this in a temporary variable and then using it to call func_1 and func_2 from func_3 (which I'm not sure is the correct approach). Or is there another way to handle cases like this?
class A {
public Object func_1() {
//executes db query1
}
public Object func_2() {
//executes db query2
}
public void func_3() throws InterruptedException {
final A that = this;
Callable<Object> call1 = new Callable<Object>() {
@Override
public Object call() {
return that.func_1();
}
};
Callable<Object> call2 = new Callable<Object>() {
@Override
public Object call() {
return that.func_2();
}
};
List<Callable<Object>> list = new ArrayList<Callable<Object>>();
list.add(call1);
list.add(call2);
ExecutorService executor = Executors.newFixedThreadPool(2);
// invokeAll blocks until both tasks have completed
List<Future<Object>> futureList = executor.invokeAll(list);
//process result accordingly
}
}
First of all, you do NOT need to store this in another local variable: the outer class's methods are directly accessible from the anonymous class as func_1() or func_2(), and when you need the outer instance itself you can write A.this.
Secondly, yes, this is a common way to do it. However, if you are going to call func_3 often, avoid creating a new fixed thread pool on every call; pass the executor in as a parameter instead, since thread creation is rather costly.
The whole idea of Executor(Service) is to use a small number of threads for many small tasks. Here you use a 2-threaded executor for 2 tasks. I would either create a globally defined executor, or just spawn 2 threads for the 2 tasks.
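Putting both suggestions together, a rough sketch might look like this. QueryRunner is a hypothetical stand-in for class A, the two func bodies just return placeholder strings instead of running real queries, and the executor is created once by the caller and passed in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class QueryRunner {
    private final ExecutorService executor; // shared pool, owned by the caller

    QueryRunner(ExecutorService executor) {
        this.executor = executor;
    }

    public Object func_1() {
        return "result1"; // placeholder for db query 1
    }

    public Object func_2() {
        return "result2"; // placeholder for db query 2
    }

    public List<Object> func_3() throws InterruptedException, ExecutionException {
        // No `that` variable needed: method references capture `this` directly.
        Callable<Object> call1 = this::func_1;
        Callable<Object> call2 = this::func_2;
        // invokeAll blocks until both tasks are done and keeps the submission order.
        List<Future<Object>> futures = executor.invokeAll(Arrays.asList(call1, call2));
        List<Object> results = new ArrayList<>();
        for (Future<Object> future : futures) {
            results.add(future.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(new QueryRunner(pool).func_3());
        } finally {
            pool.shutdown(); // otherwise the pool's threads keep the JVM alive
        }
    }
}
```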

Return values from Java Threads

I have a Java Thread like the following:
public class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
}
}
I have about 300 ids, and every couple of seconds I fire up threads to make a call for each of the ids, e.g.:
for(String id: ids) {
MyThread thread = new MyThread(id);
thread.start();
}
Now, I would like to collect the results from each threads, and do a batch insert to the database, instead of making 300 database inserts every 2 seconds.
Any idea how I can accomplish this?
The canonical approach is to use a Callable and an ExecutorService. Submitting a Callable to an ExecutorService returns a (typesafe) Future from which you can get the result.
class TaskAsCallable implements Callable<Result> {
@Override
public Result call() {
return new Result(); // this is where the work is done
}
}
ExecutorService executor = Executors.newFixedThreadPool(300);
Future<Result> task = executor.submit(new TaskAsCallable());
Result result = task.get(); // this blocks until result is ready
In your case, you probably want to use invokeAll which returns a List of Futures, or create that list yourself as you add tasks to the executor. To collect results, simply call get on each one.
If you want to collect all of the results before doing the database update, you can use the invokeAll method. This takes care of the bookkeeping that would be required if you submit tasks one at a time, like daveb suggests.
private static final ExecutorService workers = Executors.newCachedThreadPool();
...
Collection<Callable<User>> tasks = new ArrayList<Callable<User>>();
for (final String id : ids) {
tasks.add(new Callable<User>()
{
public User call()
throws Exception
{
return svc.getUser(id);
}
});
}
/* invokeAll blocks until all service requests complete,
* or a max of 10 seconds. */
List<Future<User>> results = workers.invokeAll(tasks, 10, TimeUnit.SECONDS);
for (Future<User> f : results) {
User user = f.get();
/* Add user to batch update. */
...
}
/* Commit batch. */
...
Store your result in your object. When it completes, have it drop itself into a synchronized collection (a synchronized queue comes to mind).
When you wish to collect your results to submit, grab everything from the queue and read your results from the objects. You might even have each object know how to "post" its own results to the database; this way different classes can be submitted and all handled with the exact same tiny, elegant loop.
There are lots of tools in the JDK to help with this, but it is really easy once you start thinking of your thread as a true object and not just a bunch of crap around a "run" method. Once you start thinking of objects this way programming becomes much simpler and more satisfying.
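A bare-bones sketch of the queue idea, assuming String results for simplicity. ResultCollector, taskFor and drainBatch are invented names, and the "user-" + id string stands in for the real service.getUser(id) call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class ResultCollector {
    // Thread-safe queue that every finished task drops its result into.
    private final BlockingQueue<String> results = new LinkedBlockingQueue<>();

    // Hypothetical task: fetches one result and adds it to the shared queue.
    Runnable taskFor(String id) {
        return () -> results.add("user-" + id); // stand-in for service.getUser(id)
    }

    // Removes everything collected so far and returns it as one batch.
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        results.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        ResultCollector collector = new ResultCollector();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 300; i++) {
            pool.execute(collector.taskFor(String.valueOf(i)));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        List<String> batch = collector.drainBatch();
        System.out.println(batch.size()); // one batch insert instead of one insert per id
    }
}
```

The consumer can call drainBatch() on its own schedule, e.g. every two seconds, and insert whatever has arrived by then.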
In Java 8 there is a better way to do this, using CompletableFuture. Say we have a class that gets an id from the database; for simplicity we can just return a number, as below:
static class GenerateNumber implements Supplier<Integer>{
private final int number;
GenerateNumber(int number){
this.number = number;
}
@Override
public Integer get() {
try {
TimeUnit.SECONDS.sleep(1);
}catch (InterruptedException e){
e.printStackTrace();
}
return this.number;
}
}
Now we can add each result to a concurrent collection as soon as its future completes.
Collection<Integer> results = new ConcurrentLinkedQueue<>();
int tasks = 10;
CompletableFuture<?>[] allFutures = new CompletableFuture[tasks];
for (int i = 0; i < tasks; i++) {
int temp = i;
// executor is assumed to be an ExecutorService created elsewhere
CompletableFuture<Integer> future = CompletableFuture.supplyAsync(()-> new GenerateNumber(temp).get(), executor);
allFutures[i] = future.thenAccept(results::add);
}
Now we can add a callback when all the futures are ready,
CompletableFuture.allOf(allFutures).thenAccept(c->{
System.out.println(results); // do something with result
});
You need to store the result in something like a singleton, and access to it has to be properly synchronized.
This is not the best advice, though, as handling raw Threads is not a good idea.
You could create a queue or list which you pass to the threads you create, the threads add their result to the list which gets emptied by a consumer which performs the batch insert.
The simplest approach is to pass an object to each thread (one object per thread) that will contain the result later. The main thread should keep a reference to each result object. When all threads are joined, you can use the results.
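A sketch of the one-result-object-per-thread idea, under the assumption that results are plain Strings. ResultHolderDemo, Result and fetchAll are hypothetical names, and "user-" + id again stands in for the real lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResultHolderDemo {
    // Written by exactly one worker thread, read by the main thread after join().
    static class Result {
        String value;
    }

    static List<String> fetchAll(List<String> ids) {
        List<Result> results = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (String id : ids) {
            Result result = new Result(); // one holder per thread
            results.add(result);
            Thread t = new Thread(() -> result.value = "user-" + id); // stand-in for the real call
            threads.add(t);
            t.start();
        }
        List<String> values = new ArrayList<>();
        try {
            for (Thread t : threads) {
                t.join(); // after join() the worker's write to its holder is visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        for (Result result : results) {
            values.add(result.value);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(Arrays.asList("1", "2", "3")));
    }
}
```

Because each holder has a single writer and is only read after join(), no locking is needed, and the results come back in the same order as the ids.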
public class TopClass {
List<User> users = new ArrayList<User>();
void addUser(User user) {
synchronized(users) {
users.add(user);
}
}
void store() throws SQLException {
//storing code goes here
}
class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
addUser(user);
}
}
}
You could make a class which extends Observable. Then your thread can call a method in the Observable class which would notify any classes that registered in that observer by calling Observable.notifyObservers(Object).
The observing class would implement Observer, and register itself with the Observable. You would then implement an update(Observable, Object) method that gets called when Observable.notifyObservers(Object) is called.
