I have a Java Thread like the following:
public class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
}
}
I have about 300 ids, and every couple of seconds I fire up one thread per id to make a call. E.g.:
for(String id: ids) {
MyThread thread = new MyThread(id);
thread.start();
}
Now, I would like to collect the results from each thread and do a single batch insert to the database, instead of making 300 database inserts every 2 seconds.
Any idea how I can accomplish this?
The canonical approach is to use a Callable and an ExecutorService. Submitting a Callable to an ExecutorService returns a (typesafe) Future from which you can get the result.
class TaskAsCallable implements Callable<Result> {
@Override
public Result call() {
return new Result(); // this is where the work is done.
}
}
ExecutorService executor = Executors.newFixedThreadPool(300);
Future<Result> task = executor.submit(new TaskAsCallable());
Result result = task.get(); // this blocks until result is ready
In your case, you probably want to use invokeAll which returns a List of Futures, or create that list yourself as you add tasks to the executor. To collect results, simply call get on each one.
If you want to collect all of the results before doing the database update, you can use the invokeAll method. This takes care of the bookkeeping that would be required if you submit tasks one at a time, like daveb suggests.
private static final ExecutorService workers = Executors.newCachedThreadPool();
...
Collection<Callable<User>> tasks = new ArrayList<Callable<User>>();
for (final String id : ids) {
tasks.add(new Callable<User>()
{
public User call()
throws Exception
{
return svc.getUser(id);
}
});
}
/* invokeAll blocks until all service requests complete,
* or a max of 10 seconds. */
List<Future<User>> results = workers.invokeAll(tasks, 10, TimeUnit.SECONDS);
for (Future<User> f : results) {
User user = f.get();
/* Add user to batch update. */
...
}
/* Commit batch. */
...
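One detail worth spelling out for the timed invokeAll above: tasks that have not completed when the timeout expires are cancelled, and calling get() on a cancelled Future throws CancellationException. The following is a minimal sketch of guarding against that, not the answerer's code; the class TimedBatch, the method collectFinished, the pool size of 4 and the String results are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs all tasks and keeps only the results that finished in time.
final class TimedBatch {

    static List<String> collectFinished(List<Callable<String>> tasks, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
        List<String> batch = new ArrayList<>();
        try {
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> f : futures) {
                if (f.isCancelled()) {
                    continue; // the task did not finish before the timeout
                }
                try {
                    batch.add(f.get()); // does not block: the task has already completed
                } catch (ExecutionException e) {
                    // the task itself threw; this sketch simply leaves it out of the batch
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            pool.shutdownNow();
        }
        return batch;
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "fast");
        tasks.add(() -> { Thread.sleep(5_000); return "slow"; });
        System.out.println(collectFinished(tasks, 500));
    }
}
```

Whether a timed-out id should be retried in the next cycle or dropped is a design decision the sketch leaves open.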
Store your result in your object. When it completes, have it drop itself into a synchronized collection (a synchronized queue comes to mind).
When you wish to collect your results to submit, grab everything from the queue and read your results from the objects. You might even have each object know how to "post" its own results to the database; this way different classes can be submitted and all handled with the exact same tiny, elegant loop.
There are lots of tools in the JDK to help with this, but it is really easy once you start thinking of your thread as a true object and not just a bunch of crap around a "run" method. Once you start thinking of objects this way programming becomes much simpler and more satisfying.
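The queue idea described above could look roughly like the following. This is only a sketch under assumptions: the class ResultQueueSketch and the method fetchAll are made up for illustration, and the string "user-" + id stands in for the real service.getUser(id) call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: each worker drops its result into a shared thread-safe queue.
final class ResultQueueSketch {

    static List<String> fetchAll(List<String> ids) {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (String id : ids) {
            // "user-" + id is a stand-in for service.getUser(id)
            Thread worker = new Thread(() -> results.add("user-" + id));
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for this worker to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop waiting and return whatever has arrived so far
            }
        }
        List<String> batch = new ArrayList<>();
        results.drainTo(batch); // grab everything from the queue in one go
        return batch;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        ids.add("a");
        ids.add("b");
        ids.add("c");
        // The order of the printed results depends on thread scheduling.
        System.out.println(fetchAll(ids));
    }
}
```

The batch returned by fetchAll is what would be handed to a single batch insert.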
In Java 8 there is a better way of doing this using CompletableFuture. Say we have a class that gets an id from the database; for simplicity we can just return a number, as below:
static class GenerateNumber implements Supplier<Integer>{
private final int number;
GenerateNumber(int number){
this.number = number;
}
@Override
public Integer get() {
try {
TimeUnit.SECONDS.sleep(1);
}catch (InterruptedException e){
e.printStackTrace();
}
return this.number;
}
}
Now we can add each result to a concurrent collection as soon as its future completes.
Collection<Integer> results = new ConcurrentLinkedQueue<>();
int tasks = 10;
CompletableFuture<?>[] allFutures = new CompletableFuture[tasks];
for (int i = 0; i < tasks; i++) {
int temp = i;
CompletableFuture<Integer> future = CompletableFuture.supplyAsync(()-> new GenerateNumber(temp).get(), executor);
allFutures[i] = future.thenAccept(results::add);
}
Now we can add a callback when all the futures are ready,
CompletableFuture.allOf(allFutures).thenAccept(c->{
System.out.println(results); // do something with result
});
You need to store the result in something like a singleton. This has to be properly synchronized.
This is not the best advice, as it is not a good idea to handle raw Threads.
You could create a queue or list which you pass to the threads you create, the threads add their result to the list which gets emptied by a consumer which performs the batch insert.
The simplest approach is to pass an object to each thread (one object per thread) that will contain the result later. The main thread should keep a reference to each result object. When all threads are joined, you can use the results.
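A minimal sketch of that idea, assuming a made-up Holder class as the per-thread result object and a string in place of the real service.getUser(id) call:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one result holder per thread, read only after join().
final class ResultHolderSketch {

    // The object handed to each thread; the thread fills in the value.
    static final class Holder {
        String value;
    }

    static List<String> fetchAll(List<String> ids) {
        List<Holder> holders = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (String id : ids) {
            Holder holder = new Holder();
            holders.add(holder); // the main thread keeps a reference to every holder
            // "user-" + id is a stand-in for service.getUser(id)
            Thread thread = new Thread(() -> holder.value = "user-" + id);
            threads.add(thread);
            thread.start();
        }
        try {
            for (Thread thread : threads) {
                thread.join(); // after join() returns, the thread's writes are visible here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        List<String> results = new ArrayList<>();
        for (Holder holder : holders) {
            results.add(holder.value);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        ids.add("a");
        ids.add("b");
        System.out.println(fetchAll(ids));
    }
}
```

Because every thread writes only to its own holder and the main thread reads only after join(), no extra synchronization is needed, and the results come back in the same order as the ids.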
public class TopClass {
List<User> users = new ArrayList<User>();
void addUser(User user) {
synchronized(users) {
users.add(user);
}
}
void store() throws SQLException {
//storing code goes here
}
class MyThread extends Thread {
MyService service;
String id;
public MyThread(String id) {
this.id = id;
}
public void run() {
User user = service.getUser(id);
addUser(user);
}
}
}
You could make a class which extends Observable. Then your thread can call a method in the Observable class which would notify any classes that registered in that observer by calling Observable.notifyObservers(Object).
The observing class would implement Observer and register itself with the Observable. You would then implement an update(Observable, Object) method that gets called when Observable.notifyObservers(Object) is called.
Is it possible to modify the runnable object after it has been submitted to the executor service (single thread with unbounded queue) ?
For example:
public class Test {
@Autowired
private Runner userRunner;
@Autowired
private ExecutorService executorService;
public void init() {
for (int i = 0; i < 100; ++i) {
userRunner.add("Temp" + i);
Future runnerFuture = executorService.submit(userRunner);
}
}
}
public class Runner implements Runnable {
private List<String> users = new ArrayList<>();
public void add(String user) {
users.add(user);
}
public void run() {
/* Something here to do with users*/
}
}
As you can see in the above example, if we submit a runnable object and modify the contents of the object too inside the loop, will the 1st submit to executor service use the newly added users. Consider that the run method is doing something really intensive and subsequent submits are queued.
if we submit a runnable object and modify the contents of the object too inside the loop, will the 1st submit to executor service use the newly added users.
Only if the users ArrayList is properly synchronized. What you are doing is trying to modify the users field from two different threads which can cause exceptions and other unpredictable results. Synchronization ensures mutex so multiple threads aren't changing ArrayList at the same time unexpectedly, as well as memory synchronization which ensures that one thread's modifications are seen by the other.
What you could do is to add synchronization to your example:
public void add(String user) {
synchronized (users) {
users.add(user);
}
}
...
public void run() {
synchronized (users) {
/* Something here to do with users*/
}
}
Another option would be to synchronize the list:
// you can't use this if you are iterating on this list (for, etc.)
private List<String> users = Collections.synchronizedList(new ArrayList<>());
However, you'll need to manually synchronize if you are using a for loop on the list or otherwise iterating across it.
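The manual synchronization mentioned above means holding the wrapper list's own lock for the whole iteration. A small sketch, with the class SynchronizedIteration and the method joinAll invented for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: single calls on a synchronized list are safe, iteration needs a lock.
final class SynchronizedIteration {

    private final List<String> users = Collections.synchronizedList(new ArrayList<>());

    void add(String user) {
        users.add(user); // a single call is already thread-safe
    }

    String joinAll() {
        StringBuilder sb = new StringBuilder();
        // Iteration is a sequence of calls, so lock on the list itself while looping.
        synchronized (users) {
            for (String user : users) {
                sb.append(user).append(',');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SynchronizedIteration sketch = new SynchronizedIteration();
        sketch.add("Temp0");
        sketch.add("Temp1");
        System.out.println(sketch.joinAll());
    }
}
```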
The cleanest, most straightforward approach would be to call cancel on the Future, then submit a new task with the updated user list. Otherwise not only do you face visibility issues from tampering with the list across threads, but there's no way to know if you're modifying a task that's already running.
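A rough sketch of the cancel-and-resubmit idea, assuming each task is given its own immutable snapshot of the list so that later changes cannot leak into a queued or running task. The names ResubmitSketch, taskFor and runLatest are hypothetical, and the task merely counts users as a stand-in for the real work.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: never mutate a submitted task; cancel it and submit a fresh one.
final class ResubmitSketch {

    // Each task owns an unmodifiable copy of the users it was created with.
    static Callable<Integer> taskFor(List<String> users) {
        List<String> snapshot = Collections.unmodifiableList(new ArrayList<>(users));
        return () -> snapshot.size(); // stand-in for the real, expensive work
    }

    static int runLatest(ExecutorService executor, List<String> users) {
        Future<Integer> first = executor.submit(taskFor(users));
        users.add("late-arrival");   // the submitted task cannot see this change
        first.cancel(true);          // best effort; has no effect if it already finished
        Future<Integer> second = executor.submit(taskFor(users));
        try {
            return second.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            List<String> users = new ArrayList<>();
            users.add("Temp0");
            users.add("Temp1");
            System.out.println(runLatest(executor, users));
        } finally {
            executor.shutdown();
        }
    }
}
```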
What is the proper way to implement concurrency in Java applications? I know about Threads and stuff, of course; I have been programming in Java for 10 years now, but haven't had much experience with concurrency.
For example, I have to asynchronously load a few resources, and only after all have been loaded, can I proceed and do more work. Needless to say, there is no order how they will finish. How do I do this?
In JavaScript, I like using the jQuery.deferred infrastructure, to say
$.when(deferred1,deferred2,deferred3...)
.done(
function(){//here everything is done
...
});
But what do I do in Java?
You can achieve it in multiple ways.
1. ExecutorService invokeAll() API
Executes the given tasks, returning a list of Futures holding their status and results when all complete.
2. CountDownLatch
A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.
A CountDownLatch is initialized with a given count. The await methods block until the current count reaches zero due to invocations of the countDown() method, after which all waiting threads are released and any subsequent invocations of await return immediately. This is a one-shot phenomenon -- the count cannot be reset. If you need a version that resets the count, consider using a CyclicBarrier.
3. ForkJoinPool, or newWorkStealingPool() in Executors, is another way
Have a look at related SE questions:
How to wait for a thread that spawns it's own thread?
Executors: How to synchronously wait until all tasks have finished if tasks are created recursively?
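The CountDownLatch option from the list above can be sketched as follows. The class LatchSketch, the method loadAll and the 10-second timeout are assumptions for illustration; incrementing a counter stands in for actually loading a resource.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: wait until every loader thread has counted down.
final class LatchSketch {

    static int loadAll(int resources) {
        CountDownLatch done = new CountDownLatch(resources);
        AtomicInteger loaded = new AtomicInteger();
        for (int i = 0; i < resources; i++) {
            new Thread(() -> {
                try {
                    loaded.incrementAndGet(); // stand-in for loading one resource
                } finally {
                    done.countDown();         // always count down, even if loading fails
                }
            }).start();
        }
        try {
            // Block until the count reaches zero, but do not wait forever.
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for resources");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return loaded.get();
    }

    public static void main(String[] args) {
        System.out.println("loaded " + loadAll(3) + " resources");
    }
}
```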
I would use parallel stream.
Stream.of(runnable1, runnable2, runnable3).parallel().forEach(r -> r.run());
// do something after all these are done.
If you need this to be asynchronous, then you might use a pool or Thread.
I have to asynchronously load a few resources,
You could collect these resources like this.
List<String> urls = ....
Map<String, String> map = urls.parallelStream()
.collect(Collectors.toMap(u -> u, u -> download(u)));
This will give you a mapping of all the resources once they have been downloaded concurrently. By default, a parallel stream runs on the common ForkJoinPool, so the concurrency will be roughly the number of CPUs you have.
If I'm not using parallel Streams or Spring MVC's TaskExecutor, I usually use CountDownLatch. Instantiate with # of tasks, reduce once for each thread that completes its task. CountDownLatch.await() waits until the latch is at 0. Really useful.
Read more here: JavaDocs
Personally, I would do something like this if I am using Java 8 or later.
// Retrieving instagram followers
CompletableFuture<Integer> instagramFollowers = CompletableFuture.supplyAsync(() -> {
// getInstaFollowers(userId);
return 0; // default value
});
// Retrieving twitter followers
CompletableFuture<Integer> twitterFollowers = CompletableFuture.supplyAsync(() -> {
// getTwFollowers(userId);
return 0; // default value
});
System.out.println("Calculating Total Followers...");
CompletableFuture<Integer> totalFollowers = instagramFollowers
.thenCombine(twitterFollowers, (instaFollowers, twFollowers) -> {
return instaFollowers + twFollowers; // can be replaced with method reference
});
System.out.println("Total followers: " + totalFollowers.get()); // blocks until both the above tasks are complete
I used supplyAsync() as I am returning some value (no. of followers in this case) from the tasks otherwise I could have used runAsync(). Both of these run the task in a separate thread.
Finally, I used thenCombine() to join both the CompletableFuture. You could also use thenCompose() to join two CompletableFuture if one depends on the other. But in this case, as both the tasks can be executed in parallel, I used thenCombine().
The methods getInstaFollowers(userId) and getTwFollowers(userId) are simple HTTP calls or something.
You can use a ThreadPool and Executors to do this.
https://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
This is an example where I use threads. It's a static ExecutorService with a fixed size of 50 threads.
public class ThreadPoolExecutor {
private static final ExecutorService executorService = Executors.newFixedThreadPool(50,
new ThreadFactoryBuilder().setNameFormat("thread-%d").build());
private static ThreadPoolExecutor instance = new ThreadPoolExecutor();
public static ThreadPoolExecutor getInstance() {
return instance;
}
public <T> Future<? extends T> queueJob(Callable<? extends T> task) {
return executorService.submit(task);
}
public void shutdown() {
executorService.shutdown();
}
}
The business logic for the executor looks like this (you can use Callable or Runnable; a Callable can return something, a Runnable cannot):
public class MultipleExecutor implements Callable<ReturnType> {//your code}
And the call of the executor:
ThreadPoolExecutor threadPoolExecutor = ThreadPoolExecutor.getInstance();
List<Future<? extends ReturnType>> results = new LinkedList<>();
for (Type Type : typeList) {
Future<? extends ReturnType> future = threadPoolExecutor.queueJob(
new MultipleExecutor(needed parameters));
results.add(future);
}
for (Future<? extends ReturnType> result : results) {
try {
if (result.get() != null) {
result.get(); // here you get the return of one thread
}
} catch (InterruptedException | ExecutionException e) {
logger.error(e, e);
}
}
You can achieve the same behaviour as $.Deferred in jQuery with a Java 8 class called CompletableFuture. This class provides the API for working with promises. To create async code, use one of its static factory methods such as #runAsync or #supplyAsync, then apply a computation to the result with #thenApply.
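As a rough counterpart to $.when(...).done(...), CompletableFuture.allOf can be combined with thenApply. This is a sketch under assumptions: the class WhenAllSketch and the method loadAll are invented, and the string "loaded:" + name stands in for a real resource load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: continue only after every asynchronous load has completed.
final class WhenAllSketch {

    static List<String> loadAll(List<String> names) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (String name : names) {
            // "loaded:" + name is a stand-in for actually loading the resource
            futures.add(CompletableFuture.supplyAsync(() -> "loaded:" + name));
        }
        CompletableFuture<List<String>> all = CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> {
                    // Every future is complete here, so join() returns immediately.
                    List<String> results = new ArrayList<>();
                    for (CompletableFuture<String> future : futures) {
                        results.add(future.join());
                    }
                    return results;
                });
        return all.join(); // block the caller; a real caller could chain further instead
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("config", "images", "fonts")));
    }
}
```

The results are gathered in submission order, regardless of which load finishes first.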
I usually opt for an async notify-start, notify-progress, notify-end approach:
class Task extends Thread {
private ThreadLauncher parent;
public Task(ThreadLauncher parent) {
super();
this.parent = parent;
}
public void run() {
doStuff();
parent.notifyEnd(this);
}
public /*abstract*/ void doStuff() {
// ...
}
}
class ThreadLauncher {
public void stuff() {
for (int i=0; i<10; i++)
new Task(this).start();
}
public void notifyEnd(Task who) {
// ...
}
}
I'm currently unit testing my asynchronous methods using thread locking: usually I inject a CountDownLatch into my asynchronous component and let the main thread wait for it to reach 0. However, this approach looks plain ugly and doesn't scale well. Consider what happens when I write 100+ tests for a component and they all have to wait, one after another, for a worker thread to do some fake asynchronous job.
So is there another approach? Consider the following example for a simple search mechanism:
Searcher.java
public class Searcher {
private SearcherListener listener;
public void search(String input) {
// Dispatch request to queue and notify listener when finished
}
}
SearcherListener.java
public interface SearcherListener {
public void searchFinished(String[] results);
}
How would you unit test the search method without using multiple threads and blocking one to wait for another? I've drawn inspiration from How to use Junit to test asynchronous processes but the top answer provides no concrete solution to how this would work.
Another approach:
Just don't start the thread. That's all.
Assume you have a SearcherService which uses your Searcher class.
Then don't start the async SearcherService; instead just call searcher.search(), which blocks until the search is finished.
Searcher s = new Searcher();
s.search(); // blocks and returns when finished
// now somehow check the result
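One way to make "don't start the thread" concrete is to let the class under test take an Executor and to pass a same-thread executor in tests. The sketch below is an assumption, not the asker's code: this Searcher has an invented constructor, its "search" just upper-cases the input, and SearcherTestSketch is a made-up helper.

```java
import java.util.concurrent.Executor;

// Mirrors the listener interface from the question.
interface SearcherListener {
    void searchFinished(String[] results);
}

// Hypothetical variant of Searcher that receives its Executor from outside.
final class Searcher {
    private final Executor executor;
    private final SearcherListener listener;

    Searcher(Executor executor, SearcherListener listener) {
        this.executor = executor;
        this.listener = listener;
    }

    void search(String input) {
        // Upper-casing is a stand-in for the real search work.
        executor.execute(() -> listener.searchFinished(new String[] { input.toUpperCase() }));
    }
}

final class SearcherTestSketch {

    static String[] searchSynchronously(String input) {
        String[][] captured = new String[1][];
        // Runnable::run executes each task on the calling thread, so no waiting is needed.
        Searcher searcher = new Searcher(Runnable::run, results -> captured[0] = results);
        searcher.search(input);
        return captured[0]; // already populated when search() returns
    }

    public static void main(String[] args) {
        System.out.println(searchSynchronously("foo")[0]);
    }
}
```

Production code would pass a real thread pool instead of Runnable::run; the test then exercises the search logic without any latch or sleep loop.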
Writing unit test for async never looks nice.
It's necessary that the testMyAsyncMethod() (main thread) blocks until you are ready to check the correct behaviour. This is necessary because the test case terminates at the end of the method. So there is no way around, the question is only how you block.
A straightforward approach that does not affect the production code much is to use a while loop. Assume AsyncManager is the class under test:
ArrayList resultTarget = new ArrayList();
AsyncManager fixture = new AsyncManager(resultTarget);
fixture.startWork();
// now wait for result, and avoid endless waiting
int numIter = 10;
// correct testcase expects two events in resultTarget
int expected = 2;
while (numIter > 0 && resultTarget.size() < expected) {
Thread.sleep(100);
numIter--;
}
assertEquals(expected, resultTarget.size());
Production code would pass an appropriate target to the constructor of AsyncManager, or use another constructor. For testing we can pass our own test target.
You will only write this for inherently asynchronous tasks, such as your own message queue.
For other code, only unit test the core part of the class that performs the calculation (a special algorithm, etc.); you don't need to run it in a thread.
However for your search listener the shown principle with loop and wait is appropriate.
public class SearchTest extends UnitTest implements SearchListener {
public void searchFinished() {
this.isSearchFinished = true;
}
public void testSearch1() {
// Todo setup your search listener, and register this class to receive
Searcher searcher = new Searcher();
searcher.setListener(this);
// Todo setup thread
searcherThread.search();
assertTrue(checkSearchResult("myExpectedResult1"));
}
private boolean checkSearchResult(String expected) {
boolean isOk = false;
int numIter = 10;
while (numIter > 0 && !this.isSearchFinished) {
Thread.sleep(100);
numIter--;
}
// todo somehow check that search was correct
isOk = .....
return isOk;
}
}
Create a synchronous version of the class that listens for its own results and uses an internal latch that search() waits on and searchFinished() clears. Like this:
public static class SynchronousSearcher implements SearcherListener {
private CountDownLatch latch = new CountDownLatch(1);
private String[] results;
private class WaitingSearcher extends Searcher {
@Override
public void search(String input) {
super.search(input);
try {
latch.await();
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
}
public String[] search(String input) {
WaitingSearcher searcher = new WaitingSearcher();
searcher.listener = this;
searcher.search(input);
return results;
}
@Override
public void searchFinished(String[] results) {
this.results = results;
latch.countDown();
}
}
Then to use it, simply:
String[] results = new SynchronousSearcher().search("foo");
There are no threads, no wait loops and the method returns in the minimal possible time. It also doesn't matter if the search returns instantly - before the call to await() - because await() will immediately return if the latch is already at zero.
How do I create a common variable between threads?
For example: Many threads sending a request to server to create users.
These users are saved in an ArrayList, but this ArrayList must be synchronized for all threads. How can I do it ?
Thanks all!
If you are going to access the list from multiple threads, you can use Collections to wrap it:
List<String> users = Collections.synchronizedList(new ArrayList<String>());
and then simply pass it in a constructor to the threads that will use it.
I would use an ExecutorService and submit tasks to it you want to perform. This way you don't need a synchronized collection (possibly don't need the collection at all)
However, you can do what you suggest by creating an ArrayList wrapped with a Collections.synchronizedList() and pass this as a reference to the thread before you start it.
What you could do is something like
// can be reused for other background tasks.
ExecutorService executor = Executors.newFixedThreadPool(numThreads);
List<Future<User>> userFutures = new ArrayList<>();
for( users to create )
userFutures.add(executor.submit(new Callable<User>() {
public User call() {
return created user;
}
}));
List<User> users = new ArrayList<>();
for(Future<User> userFuture: userFutures)
users.add(userFuture.get());
To expand on @Peter's answer, if you use an ExecutorService you can submit a Callable<User> which can return the User that was created by the task run in another thread.
Something like:
// create a thread pool with 10 background threads
ExecutorService threadPool = Executors.newFixedThreadPool(10);
List<Future<User>> futures = new ArrayList<Future<User>>();
for (String userName : userNamesToCreateCollection) {
futures.add(threadPool.submit(new MyCallable(userName)));
}
// once you submit all of the jobs, we shutdown the pool, current jobs still run
threadPool.shutdown();
// now we wait for the produced users
List<User> users = new ArrayList<User>();
for (Future<User> future : futures) {
// this waits for the job to complete and gets the User created
// it also throws some exceptions that need to be caught/logged
users.add(future.get());
}
...
private static class MyCallable implements Callable<User> {
private String userName;
public MyCallable(String userName) {
this.userName = userName;
}
public User call() {
// create the user...
return user;
}
}
I'm new to Java programming. I have a use case where I have to execute 2 db queries in parallel. The structure of my class is something like this:
class A {
public Object func_1() {
//executes db query1
}
public Object func_2() {
//executes db query2
}
}
Now I have to add another function, func_3, to the same class. It calls these 2 functions but also makes sure that they execute in parallel. For this, I'm making use of callables and futures. Is this the right way to use them? I'm storing the this variable in a temporary variable and then using it to call func_1 and func_2 from func_3 (which I'm not sure is the correct approach). Or is there another way to handle cases like these?
class A {
public Object func_1() {
//executes db query1
}
public Object func_2() {
//executes db query2
}
public void func_3() {
final A that = this;
Callable<Object> call1 = new Callable<Object>() {
@Override
public Object call() {
return that.func_1();
}
};
Callable<Object> call2 = new Callable<Object>() {
@Override
public Object call() {
return that.func_2();
}
};
ArrayList<Callable<Object>> list = new ArrayList<Callable<Object>>();
list.add(call1);
list.add(call2);
ExecutorService executor = Executors.newFixedThreadPool(2);
ArrayList<Future<Object>> futureList = new ArrayList<Future<Object>>();
futureList = (ArrayList<Future<Object>>) executor.invokeAll(list);
//process result accordingly
}
}
First of all, you do NOT need to store this in another local variable: the outer class's methods are available directly as func_1() or func_2(), and when you need the outer class's this you can write A.this.
Secondly, yes, this is a common way to do it. Also, if you are going to call func_3 often, avoid creating a new fixed thread pool on every call; pass the executor in as a parameter instead, since thread creation is rather 'costly'.
The whole idea of an Executor(Service) is to use a small number of threads for many small tasks. Here you use a 2-thread executor for 2 tasks. I would either create a globally defined executor or just spawn 2 threads for the 2 tasks.
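A minimal sketch of the "globally defined executor" variant, assuming Java 8 method references and a made-up class name QueryRunner in place of A. The two func methods return placeholder strings instead of running real queries, and the daemon thread factory is an assumption so that the shared pool does not keep the JVM alive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one shared pool, reused by every call to func_3().
final class QueryRunner {

    // Created once; daemon threads so an idle pool does not block JVM exit.
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(2, runnable -> {
        Thread thread = new Thread(runnable);
        thread.setDaemon(true);
        return thread;
    });

    Object func_1() { return "result-1"; } // stand-in for db query 1
    Object func_2() { return "result-2"; } // stand-in for db query 2

    List<Object> func_3() {
        // No "that" variable needed: method references capture this directly.
        Callable<Object> query1 = this::func_1;
        Callable<Object> query2 = this::func_2;
        Future<Object> first = EXECUTOR.submit(query1);
        Future<Object> second = EXECUTOR.submit(query2);
        List<Object> results = new ArrayList<>();
        try {
            results.add(first.get());  // both queries are already running in parallel
            results.add(second.get());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(new QueryRunner().func_3());
    }
}
```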