Java Threads calling methods on common Data Collector Object possible?

The idea: I have a JAX-RS webservice servlet (Object called webServlet) which instantiates a data collecting Object dataCollector and passes this object on to multiple threads in their constructor. These threads query websites for results and then call the dataCollector.add(result) method to add the results to a Queue within the shared dataCollector.
I have two questions regarding this idea:
1) Can multiple threads call methods of a single shared object at the same time?
2) How does my webServlet object check when all threads are terminated to render a result page? Do I have to let my webServlet wait while all threads are running so I have a complete result list and how would I do that?

1) Yes, but perhaps not safely. In particular, if the queue in your dataCollector isn't a thread-safe queue like a ConcurrentLinkedQueue, concurrent add() calls can corrupt it or silently lose elements, and iterating over it while another thread adds to it can throw a ConcurrentModificationException.
2) a) Use an ExecutorService (perhaps obtained from Executors) to submit Callables or Runnables. Keep the Futures that are returned and use get() to wait until the work is done.
b) You don't have to. The choice is up to you. If you send the response before the work is done, you obviously won't have a complete result yet.
c) See a).
If this is all new to you, you may want to check out Concurrency in the Java Tutorials.
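As a rough sketch of points 1) and 2a) together: the DataCollector class below is a hypothetical stand-in for the asker's dataCollector (its name, its add() method, and the faked "result from ..." strings are assumptions, not the original code). It shows several tasks adding to one shared ConcurrentLinkedQueue while the caller waits on the Futures before reading the results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the asker's dataCollector; name and API are assumptions.
class DataCollector {
    // ConcurrentLinkedQueue tolerates concurrent add() calls without external locking.
    private final Queue<String> results = new ConcurrentLinkedQueue<>();

    void add(String result) {
        results.add(result);
    }

    List<String> snapshot() {
        return new ArrayList<>(results);
    }
}

class CollectorDemo {
    // Runs one task per site on a small pool and returns once every task has finished.
    static List<String> collect(List<String> sites) throws Exception {
        DataCollector collector = new DataCollector();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String site : sites) {
                // A real task would query the site; here the result is faked.
                futures.add(pool.submit(() -> collector.add("result from " + site)));
            }
            for (Future<?> f : futures) {
                f.get(); // blocks until that task is done and rethrows its failure, if any
            }
        } finally {
            pool.shutdown();
        }
        return collector.snapshot();
    }
}
```

In a servlet, the equivalent of collect() would run inside the request handler, so the response is rendered only after every get() has returned.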


Multiple CompletionService for one thread pool Java

I'm working on a Java server application with the general following architecture:
Clients make RPC requests to the server
The RPC server (gRPC) I believe has its own thread pool for handling requests
Requests are immediately inserted into Thread Pool 1 for more processing
A specific request type, we'll call Request R, needs to run a few asynchronous tasks in parallel, judging the results to form a consensus that it will return to the client. These tasks are a bit more long running, so I use a separate Thread Pool 2 to handle these requests. Importantly, each Request R will need to run the same 2-3 asynchronous tasks. Thread Pool 2 therefore services ALL currently executing Request R's. However, a Request R should only be able to see and retrieve the asynchronous tasks that belong to it.
To achieve this, upon every incoming Request R, while it's in Thread Pool 1, it will create a new CompletionService for the request, backed by Thread Pool 2. It will submit 2-3 async tasks, and retrieve the results. These should be strictly isolated from anything else that might be running in Thread Pool 2 belonging to other requests.
My questions:
Firstly, is Java's CompletionService isolated? I couldn't find good documentation on this after checking the JavaDocs. In other words, if two or more CompletionService's are backed by the same thread pool, are any of them at risk of pulling a future belonging to another CompletionService?
Secondly, is this bad practice to be creating this many CompletionService's for each request? Is there a better way to handle this? Of course it would be a bad idea to create a new thread pool for each request, so is there a more canonical/correct way to isolate futures within a CompletionService or is what I'm doing okay?
Thanks in advance for the help. Any pointers to helpful documentation or examples would be greatly appreciated.
Code, for reference, although trivial:
public static final ExecutorService THREAD_POOL_2 =
    new ThreadPoolExecutor(16, 64, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
// Gets created to handle a RequestR, RequestRHandler is run in Thread Pool 1
public class RequestRHandler {
    CompletionService<String> cs;
    RequestRHandler() {
        cs = new ExecutorCompletionService<>(THREAD_POOL_2);
    }
    String execute() {
        cs.submit(asyncTask1);
        cs.submit(asyncTask2);
        cs.submit(asyncTask3);
        // Lets say asyncTask3 completes first
        Future<String> asyncTask3Result = cs.take();
        // asyncTask3 result indicates asyncTask1 & asyncTask2 results don't matter, cancel them
        // without checking result
        // Cancels all futures, I track all futures submitted within this request and cancel them,
        // so it shouldn't affect any other requests in the TP 2 pool
        cancelAllFutures(cs);
        return asyncTask3Result.get();
    }
}
Firstly, is Java's CompletionService isolated?
That's not guaranteed, as CompletionService is an interface and the implementation decides. But as the only implementation in the JDK is ExecutorCompletionService, I'd just say the answer is: yes. Every instance of ExecutorCompletionService internally has a BlockingQueue where the finished tasks are queued. When you call take() on the service, it simply delegates to take() on that queue. Every submitted task is wrapped by another object, which puts the task in the queue when it's finished. So each instance manages its submitted tasks in isolation from other instances.
Secondly, is this bad practice to be creating this many CompletionServices for each request?
I'd say it's okay. A CompletionService is nothing but a rather thin wrapper around an executor. You have to live with the "overhead" (internal BlockingQueue and wrapper instances for the tasks) but it's small and you are probably gaining way more from it than it costs. One could ask if you need one for just 2 to 3 tasks but it kinda depends on the tasks. At this point it's a question about if a CompletionService is worth it in general, so that's up to you to decide as it's out of scope of your question.
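To make the isolation concrete, here is a small sketch (the class name, pool size, and the "A1"/"B1" task values are made up for illustration). Two ExecutorCompletionService instances share one pool, and each take() only ever returns futures submitted through that same instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceIsolationDemo {
    // Returns two lists: what csA's take() yielded and what csB's take() yielded.
    static List<List<String>> run() throws Exception {
        ExecutorService sharedPool = Executors.newFixedThreadPool(4);
        try {
            CompletionService<String> csA = new ExecutorCompletionService<>(sharedPool);
            CompletionService<String> csB = new ExecutorCompletionService<>(sharedPool);

            // Interleave submissions so both services have tasks in the pool at once.
            csA.submit(() -> "A1");
            csB.submit(() -> "B1");
            csA.submit(() -> "A2");
            csB.submit(() -> "B2");

            List<String> fromA = new ArrayList<>();
            List<String> fromB = new ArrayList<>();
            for (int i = 0; i < 2; i++) {
                fromA.add(csA.take().get()); // only sees A1/A2, in completion order
                fromB.add(csB.take().get()); // only sees B1/B2, in completion order
            }
            List<List<String>> out = new ArrayList<>();
            out.add(fromA);
            out.add(fromB);
            return out;
        } finally {
            sharedPool.shutdown();
        }
    }
}
```

The order within each list depends on which task finishes first, so only the grouping is predictable, not the order.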

Java Async is blocking?

After doing lots of searching on Java, I really am very confused over the following questions:
Why would I choose an asynchronous method over a multi-threaded method?
Java futures are supposed to be non-blocking. What does non-blocking mean? Why call it non-blocking when the method to extract information from a Future--i.e., get()--is blocking and will simply halt the entire thread till the method is done processing? Perhaps a callback method that rings the church bell of completion when processing is complete?
How do I make a method async? What is the method signature?
public List<T> databaseQuery(String Query, String[] args){
String preparedQuery = QueryBaker(Query, args);
List<int> listOfNumbers = DB_Exec(preparedQuery); // time taking task
return listOfNumbers;
}
How would this fictional method become a non blocking method? Or if you want please provide a simple synchronous method and an asynchronous method version of it.
Why would I choose an asynchronous method over a multi-threaded method?
Asynchronous methods allow you to reduce the number of threads. Instead of tying up a thread in a blocking call, you can issue an asynchronous call and then be notified later when it completes. This frees up the thread to do other processing in the meantime.
It can be more convoluted to write asynchronous code, but the benefit is improved performance and memory utilization.
Java futures are supposed to be non-blocking. What does non-blocking mean? Why call it non-blocking when the method to extract information from a Future--i.e., get()--is blocking and will simply halt the entire thread till the method is done processing ? Perhaps a callback method that rings the church bell of completion when processing is complete?
Check out CompletableFuture, which was added in Java 8. It is a much more useful interface than Future. For one, it lets you chain all kinds of callbacks and transformations to futures. You can set up code that will run once the future completes. This is much better than blocking in a get() call, as you surmise.
For instance, given asynchronous read and write methods like so:
CompletableFuture<ByteBuffer> read();
CompletableFuture<Integer> write(ByteBuffer bytes);
You could read from a file and write to a socket like so:
file.read()
    .thenCompose(bytes -> socket.write(bytes))
    .thenAccept(count -> log.write("Wrote {} bytes to socket.", count))
    .exceptionally(exception -> {
        log.error("Input/output error.", exception);
        return null;
    });
How do I make a method async? What is the method signature?
You would have it return a future.
public CompletableFuture<List<T>> databaseQuery(String Query, String[] args);
It's then the responsibility of the method to perform the work in some other thread and avoid blocking the current thread. Sometimes you will have worker threads ready to go. If not, you could use the ForkJoinPool, which makes background processing super easy.
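public CompletableFuture<List<T>> databaseQuery(String query, String[] args) {
    CompletableFuture<List<T>> future = new CompletableFuture<>();
    Executor executor = ForkJoinPool.commonPool();
    executor.execute(() -> {
        try {
            String preparedQuery = QueryBaker(query, args);
            List<T> list = DB_Exec(preparedQuery); // time-consuming task
            future.complete(list);
        } catch (Exception e) {
            // Without this, a failure would leave the future incomplete forever.
            future.completeExceptionally(e);
        }
    });
    return future; // returned immediately, before the query has finished
}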
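A shorter variant of the same idea, as a sketch: CompletableFuture.supplyAsync runs the supplier on ForkJoinPool.commonPool() by default and completes the future exceptionally if the supplier throws. The blockingQuery method below is a hypothetical placeholder for DB_Exec that returns fixed data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncQueryDemo {
    // Hypothetical blocking call standing in for DB_Exec; it just returns fixed data.
    static List<Integer> blockingQuery(String sql) {
        return Arrays.asList(1, 2, 3);
    }

    // Asynchronous wrapper: returns immediately, the work runs on the common pool.
    static CompletableFuture<List<Integer>> queryAsync(String sql) {
        return CompletableFuture.supplyAsync(() -> blockingQuery(sql));
    }

    // Chains a transformation that runs once the rows are available.
    static CompletableFuture<Integer> sumAsync(String sql) {
        return queryAsync(sql)
                .thenApply(rows -> rows.stream().mapToInt(Integer::intValue).sum());
    }
}
```

A caller that really needs the value on the current thread can still call join() or get() on the returned future; everything registered with thenApply or thenAccept runs without blocking that caller.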
why would I choose a Asynchronous method over a multi-threaded method
They sound like the same thing to me except asynchronous sounds like it will use one thread in the back ground.
Java futures is supposed to be non blocking ?
Non-blocking operations often use a Future, but the object itself is blocking, though only when you wait on it.
What does Non blocking mean?
The current thread doesn't wait/block.
Why call it non blocking when the method to extract information from a Future < some-object > i.e. get() is blocking
You called it non-blocking. Starting the operation in the background is non-blocking, but if you need the results, blocking is the easiest way to get this result.
and will simply halt the entire thread till the method is done processing ?
Correct, it will do that.
Perhaps a callback method that rings the church bell of completion when processing is complete ?
You can use a CompletableFuture, or you can just add to the task anything you want to do at the end. You only need to block on things which have to be done in the current thread.
You need to return a Future, and do something else while you wait, otherwise there is no point using a non-blocking operation, you may as well execute it in the current thread as it's simpler and more efficient.
You have the synchronous version already, the asynchronous version would look like
public Future<List<T>> databaseQuery(String query, String[] args) {
    return executor.submit(() -> {
        String preparedQuery = QueryBaker(query, args);
        List<T> results = DB_Exec(preparedQuery); // time-consuming task
        return results;
    });
}
I'm not a guru on multithreading but I'm gonna try to answer these questions for my sake as well
why would I choose a Asynchronous method over a multi-threaded method ? (My problem: I believe I read too much and now I am myself confused)
Multi-threading is working with multiple threads; there isn't much else to it. One interesting point is that on a single core, multiple threads cannot run truly in parallel, so the scheduler slices their execution into small pieces to give the illusion of parallelism. On multi-core hardware, threads can run at the same time.
1. One example where multithreading would be useful is in real-time multiplayer games, where each thread corresponds to one user. User A would use thread A and User B would use thread B. Each thread could track its user's activity, and data could be shared between the threads.
2. Another example would be waiting for a long HTTP call. Say you're designing a mobile app and the user clicks on download for a file of 5 gigabytes. If you don't use multithreading, the user would be stuck on that page without being able to perform any action until the HTTP call completes.
It's important to note that as a developer multithreading is only a way of designing code. It adds complexity and doesn't always have to be done.
Now for Async vs Sync, Blocking vs Non-blocking
These are some definitions I found from http://doc.akka.io/docs/akka/2.4.2/general/terminology.html
Asynchronous vs. Synchronous
A method call is considered synchronous if the caller cannot make progress until the method returns a value or throws an exception. On the other hand, an asynchronous call allows the caller to progress after a finite number of steps, and the completion of the method may be signalled via some additional mechanism (it might be a registered callback, a Future, or a message).
A synchronous API may use blocking to implement synchrony, but this is not a necessity. A very CPU intensive task might give a similar behavior as blocking. In general, it is preferred to use asynchronous APIs, as they guarantee that the system is able to progress. Actors are asynchronous by nature: an actor can progress after a message send without waiting for the actual delivery to happen.
Non-blocking vs. Blocking
We talk about blocking if the delay of one thread can indefinitely delay some of the other threads. A good example is a resource which can be used exclusively by one thread using mutual exclusion. If a thread holds on to the resource indefinitely (for example accidentally running an infinite loop) other threads waiting on the resource can not progress. In contrast, non-blocking means that no thread is able to indefinitely delay others.
Non-blocking operations are preferred to blocking ones, as the overall progress of the system is not trivially guaranteed when it contains blocking operations.
I find that async vs sync refers more to the intent of the call whereas blocking vs non-blocking refers to the result of the call. However, it wouldn't be wrong to say usually asynchronous goes with non-blocking and synchronous goes with blocking.
2> Java futures is supposed to be non blocking ? What does Non blocking mean? Why call it non blocking when the method to extract information from a Future < some-object > i.e. get() is blocking and will simply halt the entire thread till the method is done processing ? Perhaps a callback method that rings the church bell of completion when processing is complete ?
Non-blocking calls do not block the thread that calls the method.
Futures were introduced in Java to represent the result of a call that may not have completed yet. Going back to the HTTP file example, say you call a method like the following:
Future<BigData> future = server.getBigFile(); // getBigFile would be an asynchronous method
System.out.println("This line prints immediately");
The method getBigFile would return immediately and proceed to the next line of code. You would later be able to retrieve the contents of the future (or be notified that the contents are ready). Libraries/Frameworks like Netty, AKKA, Play use Futures extensively.
How do I make a method Async? What is the method signature?
I would say it depends on what you want to do.
If you want to quickly build something, you would use high level functions like Futures, Actor models, etc. something which enables you to efficiently program in a multithreaded environment without making too many mistakes.
On the other hand if you just want to learn, I would say it's better to start with low level multithreading programming with mutexes, semaphores, etc.
Examples of codes like these are numerous in google if you just search java asynchronous example with any of the keywords I have written.
Let me know if you have any other questions!

ExecutorService-like class where user controls when Callables are called

I was using an ExecutorService to schedule tasks to be executed in the future. After seeing some "odd" behavior where my Callable was getting executed before I called get() on the Future object returned by submitting my Callable to the ExecutorService pool, I read some documentation and found that the submitted task will get executed at some point between the time it is submitted and, at the latest, the time get() is called on the Future object.
My question - is there any class that would allow Callables to be submitted to it and ONLY executed when get() is called on it? At this point, it seems like just managing the Callables myself and calling call() on them myself when I am ready for them to be executed seems like it'd accomplish what I want, but I wanted to make sure there was no service already implemented that accomplished this.
In short, is there any alternative to ExecutorService that lets me control when Callables submitted to it are called? Note - the time in the future that I want them called is variable and not determined as I may decide not to call them so a ScheduledExecutorService pool won't work here.
Thanks much!
Sounds like you really want to use a Queue<Callable> instead and just poll the queue for tasks.
That way you can submit as many tasks as you like and execute them at your will - one by one.
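A minimal sketch of that idea follows. The DeferredTasks class and its method names are invented for illustration and are not part of java.util.concurrent. Tasks are only stored on submit and run on the calling thread when runNext() is called.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Callable;

// Hypothetical helper: holds Callables and runs them only on demand.
// Not thread-safe as written; swap in a concurrent queue if several threads use it.
class DeferredTasks<T> {
    private final Queue<Callable<T>> pending = new ArrayDeque<>();

    // Stores the task without executing it.
    void submit(Callable<T> task) {
        pending.add(task);
    }

    // Runs the oldest pending task on the calling thread; returns null if none remain.
    T runNext() throws Exception {
        Callable<T> task = pending.poll();
        return task == null ? null : task.call();
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Tasks that are never polled are simply never run, which covers the case where you decide not to call them at all.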

Calling Different Webservices in parallel from Webapp

We've got a Stripes (Java) web app that needs to make about 15 different web service calls, all from one method. For example:
...
public Resolution userProfile()
{
    serviceOneCall();
    serviceTwoCall();
    serviceThreeCall();
    serviceFourCall();
    ....
    serviceTenCall();
    return new RedirectResolution("profiel.jsp");
}
All of these can be called in parallel and are not dependent on each other. The one thing that most all of these calls are doing is putting data in the session, and one or two may put data into the same object that is in the session, so thread safety is probably a concern there.
Can anyone suggest a good way of calling all of these concurrently?
Any solution that does this work in parallel is going to involve spawning new threads or submitting jobs to a thread pool, so that the remote network calls happen on those threads.
A good way to avoid thread-safety problems is to use an ExecutorService and submit implementations of Callable<T> (to either the submit(Callable) or invokeAll(Collection<Callable>) methods) and have the Callables return the response value. This way your initial method can simply handle the return values of each call and choose to set the responses in the session or update other objects itself, rather than having this work occur in another thread.
So basic algorithm:
Submit each of these calls to an executorService in Callable<T> subclasses
Collect the Future<T>s you get back from the executorService
Call Future.get() on each to block until you have a response, and then process the responses however you wish back on the main thread
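The steps above can be sketched as follows, using invokeAll so that submitting and collecting happen in one call. The class name, the pool size cap of 8, and the idea that each service call returns a String are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCallsDemo {
    // Runs all calls in parallel and returns their results in submission order.
    static List<String> callAll(List<Callable<String>> calls) throws Exception {
        int threads = Math.max(1, Math.min(calls.size(), 8));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed and keeps the input order.
            for (Future<String> f : pool.invokeAll(calls)) {
                results.add(f.get()); // rethrows a failed call as ExecutionException
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the results come back to the calling thread, writing them into the session can happen there, one after another, without extra synchronization. In a real web app the pool would normally be created once and shared rather than built per request.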
Use an ExecutorService with a thread pool to submit Callables for each WS you need to call, and synchronize on the object which is updated when there is a chance of concurrent modification.
You may want to use Guava's concurrent extensions for an easier management of the Futures, using for example Futures.allAsList() which will convert a List<Future<T>> into a Future<List<T>>, so you only have one get() to do to wait for all the answers.
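for (int i = 1; i <= numOfServiceCalls; i++) {
    final int call = i; // anonymous classes can only capture effectively final variables
    new Thread(new Runnable() {
        @Override
        public void run() {
            switch (call) {
                case 1:
                    serviceOneCall();
                    break;
                case 2:
                    serviceTwoCall();
                    break;
                // Keep going with as many cases as you have.
            }
        }
    }).start(); // without start() the thread never runs
}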

Get Runnable objects I scheduled using ScheduledThreadPoolExecutor when using shutdownNow() method

I'm using ScheduledThreadPoolExecutor.schedule(Runnable, long, TimeUnit) to schedule some objects of a class that implements Runnable.
At some point in time, my application is then shutting down, and I use ScheduledThreadPoolExecutor.shutdownNow(). According to the documentation it returns a list of ScheduledFuture's.
What I really want to do is get a hold of the object that I originally scheduled, and get a bit of data from it which I will then output saying that it was unable to execute. This would then, later, be used by the application to attempt to execute it when the application then starts back up.
The usual way to get info about objects submitted to an executor is to maintain a pointer to the original objects (be they extensions to Callable, Runnable, etc). After you call shutdownNow(), and take into account the Runnable's returned by that which were awaiting execution, you can use that to prune your original list of objects and just interrogate the ones that were actually run.
If you just want to present the information to the user, the simplest approach might be to implement a meaningful toString() method for the Runnables you're scheduling. Then you can simply iterate over the list the Executor gives you and log what you get.
The sad truth, though, is that your original objects get wrapped by the Executor, so the list holds the wrappers rather than your objects. You would then need to keep a list of what you pass to the Executor manually and let the Runnables remove themselves from this list when they get executed. Obviously, you would need to use a thread-safe list for this purpose.
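The manual bookkeeping described above could look roughly like this. TrackingScheduler and its method names are hypothetical, and a concurrent set is used in place of a list. Each original Runnable is remembered when it is scheduled and forgotten when it starts running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around ScheduledThreadPoolExecutor that remembers the
// original Runnables so they can be recovered after shutdownNow().
class TrackingScheduler {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    // Thread-safe set of the original Runnables that have not started yet.
    private final Set<Runnable> pending = ConcurrentHashMap.newKeySet();

    void schedule(Runnable task, long delay, TimeUnit unit) {
        pending.add(task);
        executor.schedule(() -> {
            pending.remove(task); // the task is starting, so it is no longer pending
            task.run();
        }, delay, unit);
    }

    // Stops the executor and returns the original objects that never started.
    List<Runnable> shutdownAndGetUnexecuted() {
        executor.shutdownNow();
        return new ArrayList<>(pending);
    }
}
```

There is a small window in which a task has been dequeued but has not yet removed itself, so treat the returned list as a close approximation rather than an exact record. The returned objects are your own instances, so any data they carry can be persisted and rescheduled on the next start.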
