Are there any conceptual, functional or mechanical differences between a Scala Future and a Java Future? Conceptually I can't see any differences as they both aim to provide a mechanism for asynchronous computation.
The main inconvenience of java.util.concurrent.Future is the fact that you can't get the value without blocking.
In fact, the only way to retrieve a value is the get method, which (quoting from the docs):
Waits if necessary for the computation to complete, and then retrieves its result.
With scala.concurrent.Future you get instead a real non-blocking computation, as you can attach callbacks for completion (success/failure) or simply map over it and chain multiple Futures together in a monadic fashion.
Long story short, Scala's Future allows for asynchronous computation without blocking more threads than necessary.
Java Future: Both represent the result of an asynchronous computation, but Java's Future requires that you access the result via a blocking get method. Although you can call isDone to find out whether a Java Future has completed before calling get, thereby avoiding any blocking, you must still wait until the Java Future has completed before proceeding with any computation that uses the result.
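A minimal sketch of that limitation, assuming a single-thread executor and a trivial task (the class and method names are made up for illustration):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingGetDemo {

    // Submits a task and retrieves its value through get(), which waits if necessary.
    static int computeAnswer() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> future = executor.submit(() -> 21 + 21);

            // isDone() never blocks, but it only reports status; it does not hand over the value.
            System.out.println("done before get? " + future.isDone());

            // get() parks the calling thread until the task has finished.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(computeAnswer());
    }
}
```

There is no way here to say "when the value arrives, do X" without either blocking in get() or polling isDone().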
Scala Future: You can specify transformations on a Scala Future whether it has completed or not. Each transformation results in a new Future representing the asynchronous result of the original Future transformed by the function. This allows you to describe asynchronous computations as a series of transformations.
Scala's Future often eliminates the need to reason about shared data and locks. When you invoke a Scala method, it performs a computation "while you wait" and returns a result. If that result is a Future, the Future represents another computation to be performed asynchronously, often by a completely different thread.
Example: Instead of blocking then continuing with another computation, you can just map the next computation onto the future.
The following future will complete after ten seconds:
val fut = Future { Thread.sleep(10000); 21 + 21 }
Mapping this future with a function that increments by one will yield another future:
val result = fut.map(x => x + 1)
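For comparison on the Java side, the plain Future interface has no equivalent of map, but CompletableFuture (Java 8+) does. A rough sketch of the same idea, with the ten-second sleep left out and a made-up class name:

```java
import java.util.concurrent.CompletableFuture;

final class MapOntoFuture {

    static int mapped() {
        // Start the computation asynchronously.
        CompletableFuture<Integer> fut = CompletableFuture.supplyAsync(() -> 21 + 21);

        // thenApply plays the role of map: it yields a new future without waiting for fut.
        CompletableFuture<Integer> result = fut.thenApply(x -> x + 1);

        // join() is used only so this demo can return the value.
        return result.join();
    }

    public static void main(String[] args) {
        System.out.println(mapped());
    }
}
```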
In Java, is calling get() method while looping on CompletableFuture instances as good as doing synchronous operations although CompletableFuture is used for async calls?
'get()' waits until the future is completed. If that's what you want, it's what you use. There's no general rule.
For example, you might be using a method that is inherently asynchronous, but in your particular use, you need to wait for it to complete. If so, then there's nothing wrong with waiting for it to complete!
You mention a loop. You might find it applicable to start all the tasks in the loop, collecting a list of futures, and then (outside the loop) wait for them all to complete. That way you're getting some parallelism.
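A sketch of that "start everything, then wait" pattern. The slowSquare method is a hypothetical stand-in for whatever slow call the loop makes, and the pool size of 4 is arbitrary:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class StartAllThenWait {

    // Hypothetical slow operation: squares its input after a short delay.
    static int slowSquare(int n) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return n * n;
    }

    static List<Integer> squareAll(List<Integer> inputs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // First loop: start every task without waiting on any of them.
            List<CompletableFuture<Integer>> futures = new ArrayList<>();
            for (int n : inputs) {
                futures.add(CompletableFuture.supplyAsync(() -> slowSquare(n), pool));
            }
            // Second loop: only now wait, so the tasks overlap instead of running one by one.
            List<Integer> results = new ArrayList<>();
            for (CompletableFuture<Integer> future : futures) {
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(squareAll(List.of(1, 2, 3, 4)));
    }
}
```

Calling join() inside the first loop instead would serialize the tasks, which is the situation the question is worried about.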
But as a general rule: it depends.
CompletableFuture lets you provide callbacks for async calls. You can create a long chain of callbacks where each async call triggers the next one on completion. This is deemed a better way to write async code than using Future, where you have to block the thread to get the result of the first computation before triggering the next one.
I can understand the argument that callback chains in CompletableFuture can make code more readable, but I'm wondering whether there is a performance benefit to this approach as well, or whether it is just syntactic sugar.
For example, consider the following code:
ExecutorService exec = Executors.newSingleThreadExecutor();
CompletableFuture.supplyAsync(this::findAccountNumber, exec)
.thenApply(this::calculateBalance)
.thenApply(this::notifyBalance)
.thenAccept((i)->notifyByEmail())
.join();
In this code, calculateBalance() can't start until findAccountNumber() finishes so essentially calculateBalance() is blocked on findAccountNumber() and so on for the next methods in the callback chain. How is it better than the following (performance-wise):
ExecutorService exec = Executors.newSingleThreadExecutor();
Future<Integer> accountNumberFuture = exec.submit(findAccountNumberCallable);
Integer accountNumber = accountNumberFuture.get();
Future<String> calculateBalanceFuture = exec.submit(calculateBalanceCallable(accountNumber));
....
....
In most cases, you won't notice a difference, but if you want to be able to have a lot of concurrent asynchronous calls waiting on something, you'll want to use CompletableFuture.
The reason is that if you simply call get() on a regular Future the Thread and all resources associated with it become blocked until the call returns. If you have many calls your thread pool might get exhausted, or if you use a CachedThreadPool you might cause lots of threads to be created.
With CompletableFuture, an object is stored on the heap which represents where the application should pick up next, as opposed to using the call stack. The guy who built the API has a talk about it over here.
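To make the "object on the heap" point concrete, here is a small sketch (names invented for illustration) in which thousands of computations are pending at once, yet no thread is parked waiting for any of them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class PendingWithoutThreads {

    // Registers `count` pending computations, then completes them all from the calling thread.
    static long sumOfDoubled(int count) {
        List<CompletableFuture<Integer>> sources = new ArrayList<>();
        List<CompletableFuture<Integer>> doubled = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            CompletableFuture<Integer> source = new CompletableFuture<>();
            sources.add(source);
            // thenApply only records the continuation as an object; nothing waits here.
            doubled.add(source.thenApply(x -> x * 2));
        }
        // Completing a source triggers its recorded continuation.
        for (int i = 0; i < count; i++) {
            sources.get(i).complete(i);
        }
        long sum = 0;
        for (CompletableFuture<Integer> future : doubled) {
            sum += future.join(); // already complete, so join() returns immediately
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfDoubled(10_000));
    }
}
```

Doing the same with plain Future.get() would need one blocked thread per pending call.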
I (mostly) understand the three execution methods of CompletableFuture:
non-async (synchronous execution)
default async (asynchronous using the default executor)
custom async (asynchronous using a custom executor)
My question is: when should one favor the use of non-async methods?
What happens if you have a code block that invokes other methods that also return CompletableFutures? This might look cheap on the surface, but what happens if those methods also use non-async invocation? Doesn't this add up to one long non-async block that could get expensive?
Should one restrict the use of non-async execution to short, well-defined code-blocks that do not invoke other methods?
When should one favor the use of non-async methods?
The decision for continuations is no different than for the antecedent task itself. When do you choose to make an operation asynchronous (e.g., using a CompletableFuture) vs. writing purely synchronous code? The same guidance applies here.
If you are simply consuming the result or using the completion signal to kick off another asynchronous operation, then that itself is a cheap operation, and there is no reason not to use the synchronous completion methods.
On the other hand, if you are chaining together multiple long-running operations that would each be an async operation in their own right, then use the async completion methods.
If you're somewhere in between, trust your gut, or just go with the async completion methods. If you're not coordinating thousands of tasks, then you won't be adding a whole lot of overhead.
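One way to see the difference between the two families of methods is to look at which thread runs the continuation. The sketch below assumes an already completed future and a hypothetical single-thread executor named "continuation-worker":

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SyncVsAsyncContinuation {

    // Non-async variant: with an already completed future, the continuation
    // runs in practice on the thread that attaches it.
    static String threadOfSyncContinuation() {
        return CompletableFuture.completedFuture("done")
                .thenApply(value -> Thread.currentThread().getName())
                .join();
    }

    // Async variant: the continuation is handed to the given executor.
    static String threadOfAsyncContinuation() {
        ExecutorService executor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "continuation-worker"));
        try {
            return CompletableFuture.completedFuture("done")
                    .thenApplyAsync(value -> Thread.currentThread().getName(), executor)
                    .join();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("thenApply ran on:      " + threadOfSyncContinuation());
        System.out.println("thenApplyAsync ran on: " + threadOfAsyncContinuation());
    }
}
```

If the future is not yet complete when thenApply is called, the non-async continuation typically runs on whichever thread completes it, so a long-running continuation there ties up that thread.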
Should one restrict the use of non-async execution to short, well-defined code-blocks that do not invoke other methods?
I would use them for operations that are not long-running. You don't need to restrict their use to trivially short and simple callbacks. But I think you have the right idea.
If you're using CompletableFuture, then you have decided that at least some operations in your code base necessitate async execution, but presumably not all operations are async. How did you decide which should be async and which should not? If you apply that same analysis to continuations, I think you'll be fine.
What happens if you have a code block that invokes other methods that also return CompletableFutures? This might look cheap on the surface, but what happens if those methods also use non-async invocation? Doesn't this add up to one long non-async block that could get expensive?
Returning a CompletableFuture generally signifies that the underlying operation is scheduled to occur asynchronously, so that should not be a problem. In most cases, I would expect the flow to look something like this:
1. You synchronously call an async method returning a CompletableFuture. It schedules some async operation to eventually provide a result. Your call returns almost immediately, with no blocking.
2. Upon completion, one or more continuations may be invoked. Some of those may invoke additional async operations. Those will call into methods that will schedule additional async operations, but as before, they return almost immediately.
3. Go to (2), or finish.
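The flow above might look roughly like this. findAccountNumber and fetchBalance are hypothetical async methods with placeholder bodies; the point is only that each returns right away and thenCompose links them:

```java
import java.util.concurrent.CompletableFuture;

final class AsyncFlow {

    // Hypothetical async method: schedules the lookup and returns at once.
    static CompletableFuture<Integer> findAccountNumber(String user) {
        return CompletableFuture.supplyAsync(() -> user.length());
    }

    // Hypothetical async method invoked from a continuation.
    static CompletableFuture<String> fetchBalance(int accountNumber) {
        return CompletableFuture.supplyAsync(() -> "balance-for-" + accountNumber);
    }

    // Step 1 schedules the first operation; step 2 is the continuation that
    // schedules the next async operation once the first one has completed.
    static CompletableFuture<String> balanceFor(String user) {
        return findAccountNumber(user).thenCompose(AsyncFlow::fetchBalance);
    }

    public static void main(String[] args) {
        // join() appears only at the very edge of the program.
        System.out.println(balanceFor("alice").join());
    }
}
```

The continuation passed to thenCompose is cheap: it only kicks off the next operation, so a non-async completion method is a reasonable choice for it.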
To run some stuff in parallel or asynchronously I can use either an ExecutorService: <T> Future<T> submit(Runnable task, T result); or the CompletableFuture API: static <U> CompletableFuture<U> supplyAsync(Supplier<U> supplier, Executor executor);
(Let's assume I use the same Executor in both cases.)
Besides the return type (Future vs. CompletableFuture), are there any notable differences? When should I use which?
And what are the differences if I use the CompletableFuture API with default Executor (the method without executor)?
Besides the return type (Future vs. CompletableFuture), are there any notable differences? When should I use which?
It's rather simple really. You use the Future when you want the executing thread to wait for async computation response. An example of this is with a parallel merge/sort. Sort left asynchronously, sort right synchronously, wait on left to complete (future.get()), merge results.
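A sketch of that merge/sort pattern, assuming int arrays and a throwaway single-thread executor (all names here are invented for the example):

```java
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TwoHalvesSort {

    static int[] sort(int[] input) {
        int mid = input.length / 2;
        int[] left = Arrays.copyOfRange(input, 0, mid);
        int[] right = Arrays.copyOfRange(input, mid, input.length);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> leftSorted = executor.submit(() -> Arrays.sort(left)); // left half, asynchronously
            Arrays.sort(right);                                              // right half, on this thread
            leftSorted.get();                                                // wait for the left half
            return merge(left, right);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    // Standard merge of two sorted arrays.
    static int[] merge(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            out[k++] = a[i] <= b[j] ? a[i++] : b[j++];
        }
        while (i < a.length) out[k++] = a[i++];
        while (j < b.length) out[k++] = b[j++];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sort(new int[] {5, 3, 8, 1, 9, 2})));
    }
}
```

Here blocking in get() is exactly what is wanted: the calling thread has nothing else to do until the left half is ready.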
You use a CompletableFuture when you want some action executed with the result after completion, asynchronously from the executing thread. For instance: I want to do some computation asynchronously and, once it has been computed, write the results to some system. The requesting thread may not need to wait on a result then.
You can mimic the above example in a single Future executable, but the CompletableFuture offers a more fluent interface with better error handling.
It really depends on what you want to do.
And what are the differences if I use the CompletableFuture API with the default Executor (the method without an executor)?
It will delegate to ForkJoinPool.commonPool(), whose default parallelism is based on the number of CPUs on your system (one less than the number of available processors). If you are doing something IO intensive (reading and writing to the file system) you should define the thread pool differently.
If it's CPU intensive, using the commonPool makes most sense.
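A sketch of both choices side by side. The pool size of 8 and the thread name "io-worker" are arbitrary assumptions, and the task bodies are placeholders:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

final class ExecutorChoice {

    // CPU-bound work: no executor argument, so the default executor is used.
    static String cpuBound() {
        return CompletableFuture.supplyAsync(() -> "cpu:" + (6 * 7)).join();
    }

    // IO-bound work: a dedicated pool keeps blocking calls off the common pool.
    static String ioBound() {
        ExecutorService ioPool = Executors.newFixedThreadPool(8, r -> new Thread(r, "io-worker"));
        try {
            // Placeholder for a blocking read or write; it just reports its thread name.
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), ioPool)
                    .join();
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println(cpuBound());
        System.out.println(ioBound());
    }
}
```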
CompletableFuture has rich features like chaining multiple futures, combining the futures, executing some action after future is executed (both synchronously as well as asynchronously), etc.
However, CompletableFuture seems no different from Future in terms of performance. Even when we combine multiple instances of CompletableFuture (using .thenCombine, with .join at the end), none of them seem to get executed unless we call the .get method, and during this time the invoking thread is blocked. I feel that in terms of performance this is no better than Future.
Please let me know if I am missing some aspect of performance here.
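One way to test the assumption that nothing runs until get() is called is a small experiment like the following sketch. It never calls get() or join() on the futures; the latch exists only so the demo can observe the callback's side effect:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class RunsWithoutGet {

    // Returns the value seen by the final callback, or fails if it never ran.
    static int combineWithoutGet() {
        AtomicInteger observed = new AtomicInteger(-1);
        CountDownLatch finished = new CountDownLatch(1);

        CompletableFuture<Integer> a = CompletableFuture.supplyAsync(() -> 20);
        CompletableFuture<Integer> b = CompletableFuture.supplyAsync(() -> 22);

        // The combined stage and its consumer are attached, but never waited on via get()/join().
        a.thenCombine(b, Integer::sum).thenAccept(sum -> {
            observed.set(sum);
            finished.countDown();
        });

        try {
            if (!finished.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callback did not run within 5 seconds");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed.get();
    }

    public static void main(String[] args) {
        System.out.println(combineWithoutGet());
    }
}
```

If the callback fires, the stages were executed by the pool on their own, which suggests that get() only waits for a result rather than triggering the work.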
This clarified the difference between Future and CompletableFuture a bit more for me: Difference between Future and Promise
CompletableFuture is more like a promise.
After doing lots of searching on Java, I really am very confused over the following questions:
Why would I choose an asynchronous method over a multi-threaded method?
Java futures are supposed to be non-blocking. What does non-blocking mean? Why call it non-blocking when the method to extract information from a Future--i.e., get()--is blocking and will simply halt the entire thread till the method is done processing? Perhaps a callback method that rings the church bell of completion when processing is complete?
How do I make a method async? What is the method signature?
public List<T> databaseQuery(String Query, String[] args){
String preparedQuery = QueryBaker(Query, args);
List<T> listOfNumbers = DB_Exec(preparedQuery); // time taking task
return listOfNumbers;
}
How would this fictional method become a non blocking method? Or if you want please provide a simple synchronous method and an asynchronous method version of it.
Why would I choose an asynchronous method over a multi-threaded method?
Asynchronous methods allow you to reduce the number of threads. Instead of tying up a thread in a blocking call, you can issue an asynchronous call and then be notified later when it completes. This frees up the thread to do other processing in the meantime.
It can be more convoluted to write asynchronous code, but the benefit is improved performance and memory utilization.
Java futures are supposed to be non-blocking. What does non-blocking mean? Why call it non-blocking when the method to extract information from a Future--i.e., get()--is blocking and will simply halt the entire thread till the method is done processing ? Perhaps a callback method that rings the church bell of completion when processing is complete?
Check out CompletableFuture, which was added in Java 8. It is a much more useful interface than Future. For one, it lets you chain all kinds of callbacks and transformations to futures. You can set up code that will run once the future completes. This is much better than blocking in a get() call, as you surmise.
For instance, given asynchronous read and write methods like so:
CompletableFuture<ByteBuffer> read();
CompletableFuture<Integer> write(ByteBuffer bytes);
You could read from a file and write to a socket like so:
file.read()
.thenCompose(bytes -> socket.write(bytes))
.thenAccept(count -> log.write("Wrote {} bytes to socket.", count))
.exceptionally(exception -> {
log.error("Input/output error.", exception);
return null;
});
How do I make a method async? What is the method signature?
You would have it return a future.
public CompletableFuture<List<T>> databaseQuery(String Query, String[] args);
It's then the responsibility of the method to perform the work in some other thread and avoid blocking the current thread. Sometimes you will have worker threads ready to go. If not, you could use the ForkJoinPool, which makes background processing super easy.
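public CompletableFuture<List<T>> databaseQuery(String query, String[] args) {
    CompletableFuture<List<T>> future = new CompletableFuture<>();
    Executor executor = ForkJoinPool.commonPool();
    executor.execute(() -> {
        try {
            String preparedQuery = QueryBaker(query, args);
            List<T> list = DB_Exec(preparedQuery); // time taking task
            future.complete(list);
        } catch (Exception e) {
            // Propagate failures instead of leaving the future pending forever.
            future.completeExceptionally(e);
        }
    });
    return future; // returned immediately; completed later by the worker thread
}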
Why would I choose an asynchronous method over a multi-threaded method?
They sound like the same thing to me, except asynchronous sounds like it will use one thread in the background.
Java futures are supposed to be non-blocking?
Non-blocking operations often use a Future, but the object itself is blocking, though only when you wait on it.
What does Non blocking mean?
The current thread doesn't wait/block.
Why call it non-blocking when the method to extract information from a Future<some-object>, i.e. get(), is blocking
You called it non-blocking. Starting the operation in the background is non-blocking, but if you need the results, blocking is the easiest way to get this result.
and will simply halt the entire thread till the method is done processing?
Correct, it will do that.
Perhaps a callback method that rings the church bell of completion when processing is complete?
You can use a CompletableFuture, or you can just add to the task anything you want to do at the end. You only need to block on things which have to be done in the current thread.
You need to return a Future, and do something else while you wait, otherwise there is no point using a non-blocking operation, you may as well execute it in the current thread as it's simpler and more efficient.
You have the synchronous version already, the asynchronous version would look like
public Future<List<T>> databaseQuery(String query, String[] args) {
    // executor is assumed to be an ExecutorService available in the enclosing class
    return executor.submit(() -> {
        String preparedQuery = QueryBaker(query, args);
        List<T> listOfNumbers = DB_Exec(preparedQuery); // time taking task
        return listOfNumbers;
    });
}
I'm not a guru on multithreading but I'm gonna try to answer these questions for my sake as well
Why would I choose an asynchronous method over a multi-threaded method? (My problem: I believe I read too much and now I am myself confused)
Multi-threading is working with multiple threads; there isn't much else to it. One interesting point is that on a single core, multiple threads cannot run in a truly parallel fashion, so the scheduler divides each thread's work into small time slices to give the illusion of working in parallel.
One example where multithreading would be useful is in real-time multiplayer games, where each thread corresponds to each user. User A would use thread A and User B would use thread B. Each thread could track each user's activity and data could be shared between each thread.
Another example would be waiting for a long http call. Say you're designing a mobile app and the user clicks on download for a file of 5 gigabytes. If you don't use multithreading, the user would be stuck on that page without being able to perform any action until the http call completes.
It's important to note that as a developer multithreading is only a way of designing code. It adds complexity and doesn't always have to be done.
Now for Async vs Sync, Blocking vs Non-blocking
These are some definitions I found from http://doc.akka.io/docs/akka/2.4.2/general/terminology.html
Asynchronous vs. Synchronous
A method call is considered synchronous if the caller cannot make progress until the method returns a value or throws an exception. On the other hand, an asynchronous call allows the caller to progress after a finite number of steps, and the completion of the method may be signalled via some additional mechanism (it might be a registered callback, a Future, or a message).
A synchronous API may use blocking to implement synchrony, but this is not a necessity. A very CPU intensive task might give a similar behavior as blocking. In general, it is preferred to use asynchronous APIs, as they guarantee that the system is able to progress. Actors are asynchronous by nature: an actor can progress after a message send without waiting for the actual delivery to happen.
Non-blocking vs. Blocking
We talk about blocking if the delay of one thread can indefinitely delay some of the other threads. A good example is a resource which can be used exclusively by one thread using mutual exclusion. If a thread holds on to the resource indefinitely (for example accidentally running an infinite loop) other threads waiting on the resource can not progress. In contrast, non-blocking means that no thread is able to indefinitely delay others.
Non-blocking operations are preferred to blocking ones, as the overall progress of the system is not trivially guaranteed when it contains blocking operations.
I find that async vs sync refers more to the intent of the call whereas blocking vs non-blocking refers to the result of the call. However, it wouldn't be wrong to say usually asynchronous goes with non-blocking and synchronous goes with blocking.
Java futures are supposed to be non-blocking. What does non-blocking mean? Why call it non-blocking when the method to extract information from a Future<some-object>, i.e. get(), is blocking and will simply halt the entire thread till the method is done processing? Perhaps a callback method that rings the church bell of completion when processing is complete?
A non-blocking call does not block the thread that calls the method.
Futures were introduced in Java to represent the result of a call that may not have completed yet. Going back to the HTTP file example, say you call a method like the following:
Future<BigData> future = server.getBigFile(); // getBigFile would be an asynchronous method
System.out.println("This line prints immediately");
The method getBigFile would return immediately, and execution would proceed to the next line of code. You would later be able to retrieve the contents of the future (or be notified that the contents are ready). Libraries and frameworks like Netty, Akka and Play use Futures extensively.
How do I make a method Async? What is the method signature?
I would say it depends on what you want to do.
If you want to quickly build something, you would use high-level abstractions like Futures, actor models, etc.: something that enables you to program efficiently in a multithreaded environment without making too many mistakes.
On the other hand if you just want to learn, I would say it's better to start with low level multithreading programming with mutexes, semaphores, etc.
Code examples like these are easy to find on Google if you search for "java asynchronous example" together with any of the keywords I have mentioned.
Let me know if you have any other questions!