In my application I am using a third-party API. It is a non-blocking method which returns immediately. I have a collection of elements over which I have to invoke this method.
Now, my problem is that I have to find a way to wait until all the method executions have completed before I do my next operation. How can I handle this? I cannot modify the third-party API.
In short it looks like this
for(Object object: objects){
methodA(object); //this is a non-blocking call and returns immediately
}
// here I want to do my next task only after all the methodA calls completed execution
What you are asking for is impossible ... unless the third party API also includes some method that allows you to wait until one or more calls to methodA have completed.
Does it?
EDIT
As Kathy Stone notes, another possibility is that the API might have a callback mechanism, whereby a thread (behind the API) that is doing the work started by the methodA call "calls back" to your code. (There would need to be some other method in the API that allows you to register the callback object.)
There are other possibilities as well ... (some too horrible to mention) ... but they all entail the API being designed to support synchronization with end of the tasks started by methodA.
As Stephen noted it is possible if you have some way of knowing that the method has completed. If you have some kind of callback or listener for this you could use something like a counting semaphore:
final Semaphore block = new Semaphore(0); // Semaphore has no no-arg constructor; start with zero permits
//HERE SOMETHING APPROPRIATE TO YOUR API
myAPI.registerListener(new APIListener(){
public void methodADone(){
block.release();
}
});
int permits = 0;
for(Object object: objects){
methodA(object); //this is a non-blocking call and returns immediately
permits++;
}
block.acquire(permits);
Of course you would need extra checking to make sure you are releasing permits for the correct object collections, depending on how your code is threaded and what mechanism the API provides to know the call has completed, but this is one approach that could be used.
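If the API reports completion through a callback, a CountDownLatch is a common alternative to the semaphore. The following is only a minimal sketch under assumptions: methodA here is a hypothetical stand-in that accepts an onDone callback and runs on a local thread pool, because the real third-party API is not shown, and LatchWaitSketch and runAll are illustrative names.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LatchWaitSketch {
    // Hypothetical stand-in for the third-party API: returns immediately and
    // invokes onDone from a worker thread once the work has finished.
    static void methodA(Object input, ExecutorService worker, Runnable onDone) {
        worker.execute(() -> {
            // ... the real API would do its work on 'input' here ...
            onDone.run();
        });
    }

    // Starts one call per element, then blocks until every callback has fired.
    // Returns the number of completed calls, or -1 on timeout or interruption.
    static int runAll(List<?> objects) {
        ExecutorService worker = Executors.newFixedThreadPool(4);
        CountDownLatch done = new CountDownLatch(objects.size());
        AtomicInteger completed = new AtomicInteger();
        try {
            for (Object object : objects) {
                methodA(object, worker, () -> {
                    completed.incrementAndGet();
                    done.countDown();
                });
            }
            // The timeout guards against a callback that never arrives.
            boolean finished = done.await(10, TimeUnit.SECONDS);
            return finished ? completed.get() : -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            worker.shutdown();
        }
    }
}
```

The latch count is fixed before the first call is made, so a callback that fires very early cannot be missed.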
How do you determine that a methodA() call has finished?
Does the method return a handle? Or does the object have a property that is set by the methodA() call? If so, collect them, then loop with a sleep and check all remaining handles or object properties, removing each one from the remaining set once it has completed.
The waiting code can look like:
while(!remaining.isEmpty()) {
try {
Thread.sleep(100);
} catch (InterruptedException e) {
continue;
}
Iterator<HandleOrObjectWithProperty> i = remaining.iterator();
while (i.hasNext()) {
HandleOrObjectWithProperty result = i.next();
if (result.handleHasFinishedOrPropertyIsSet()) {
i.remove();
}
}
}
Related
I have a blocking queue of objects.
I want to write a thread that blocks until there is an object on the queue. Similar to the functionality provided by BlockingQueue.take().
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
So, I would like a blocking peek() function. Currently, peek() just returns if the queue is empty as per the javadocs.
Am I missing something? Is there another way to achieve this functionality?
EDIT:
Any thoughts on if I just used a thread safe queue and peeked and slept instead?
public void run() {
while (!exit) {
while (queue.size() != 0) {
Object o = queue.peek();
if (o != null) {
if (consume(o) == true) {
queue.remove();
} else {
Thread.sleep(10000); // back off for 10s and try again
}
}
}
Thread.sleep(1000); //wait 1s for object on queue
}
}
Note that I only have one consumer thread and one (separate) producer thread. I guess this isn't as efficient as using a BlockingQueue... Any comments appreciated.
You could use a LinkedBlockingDeque and physically remove the item from the queue (using takeLast()) but replace it again at the end of the queue if processing fails using putLast(E e). Meanwhile your "producers" would add elements to the front of the queue using putFirst(E e).
You could always encapsulate this behaviour within your own Queue implementation and provide a blockingPeek() method that performs takeLast() followed by putLast() behind the scenes on the underlying LinkedBlockingDeque. Hence from the calling client's perspective the element is never removed from your queue.
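A rough sketch of that wrapper is shown below. It assumes a single consumer, because the element is briefly absent from the deque between takeLast() and putLast(). The class name DequeBackedPeekQueue and the demo method are illustrative and not part of any library.

```java
import java.util.concurrent.LinkedBlockingDeque;

class DequeBackedPeekQueue<E> {
    private final LinkedBlockingDeque<E> deque = new LinkedBlockingDeque<>();

    // Producers add at the front of the deque.
    void put(E element) throws InterruptedException {
        deque.putFirst(element);
    }

    // Blocks until an element is available, then puts it straight back at the
    // consumer end, so callers see it as "peeked". Single consumer only.
    E blockingPeek() throws InterruptedException {
        E head = deque.takeLast();
        deque.putLast(head);
        return head;
    }

    // Removes the element once it has been processed successfully.
    E take() throws InterruptedException {
        return deque.takeLast();
    }

    int size() {
        return deque.size();
    }

    // Small demonstration: peeking leaves the queue unchanged and keeps FIFO order.
    static boolean demo() {
        DequeBackedPeekQueue<String> queue = new DequeBackedPeekQueue<>();
        try {
            queue.put("first");
            queue.put("second");
            String peeked = queue.blockingPeek();
            return "first".equals(peeked)
                    && queue.size() == 2
                    && "first".equals(queue.take())
                    && "second".equals(queue.take());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```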
However, since I do not know if I will be able to process the object successfully, I want to just peek() and not remove the object. I want to remove the object only if I am able to process it successfully.
In general, it is not thread-safe. What if, after you peek() and determine that the object can be processed successfully, but before you take() it to remove and process, another thread takes that object?
Could you also just add an event listener queue to your blocking queue, then when something is added to the (blocking) queue, send an event off to your listeners? You could have your thread block until its actionPerformed method was called.
The only thing I'm aware of that does this is BlockingBuffer in Apache Commons Collections:
If either get or remove is called on
an empty Buffer, the calling thread
waits for notification that an add or
addAll operation has completed.
get() is equivalent to peek(), and a Buffer can be made to act like a BlockingQueue by decorating an UnboundedFifoBuffer with a BlockingBuffer.
The quick answer is: no, there's not really a way to have a blocking peek, bar implementing a blocking queue with a blocking peek() yourself.
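For illustration, a hand-rolled queue with a blocking peek could look roughly like the sketch below, built on a ReentrantLock and a Condition. PeekableQueue, blockingPeek and removeIfHead are hypothetical names chosen for this example, and the queue is unbounded for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class PeekableQueue<E> {
    private final Deque<E> items = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();

    // Adds an element at the tail and wakes up any waiting consumers.
    public void put(E item) {
        lock.lock();
        try {
            items.addLast(item);
            notEmpty.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Blocks until the queue is non-empty, then returns the head without removing it.
    public E blockingPeek() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            return items.peekFirst();
        } finally {
            lock.unlock();
        }
    }

    // Removes the head only if it is still the element that was peeked (identity check).
    public boolean removeIfHead(E expected) {
        lock.lock();
        try {
            if (!items.isEmpty() && items.peekFirst() == expected) {
                items.removeFirst();
                return true;
            }
            return false;
        } finally {
            lock.unlock();
        }
    }

    // Demonstration: the caller blocks in blockingPeek() until a producer thread adds an element.
    static String demo() {
        PeekableQueue<String> queue = new PeekableQueue<>();
        new Thread(() -> queue.put("job-1")).start();
        try {
            String head = queue.blockingPeek();
            boolean removed = queue.removeIfHead(head); // stands in for "processing succeeded"
            return head + ":" + removed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }
}
```

The removeIfHead check is what keeps the peek-then-remove sequence safe if more consumers are added later: a consumer that lost the race simply gets false back.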
Am I missing something?
peek() can be troublesome with concurrency -
If you can't process your peek()'d message - it'll be left in the queue, unless you have multiple consumers.
Who is going to get that object out of the queue if you can't process it ?
If you have multiple consumers, you get a race condition between you peek()'ing and another thread also processing items, resulting in duplicate processing or worse.
Sounds like you might be better off actually removing the item and processing it using a chain-of-responsibility pattern.
Edit: re: your last example: If you have only 1 consumer, you will never get rid of the object on the queue - unless it's updated in the mean time - in which case you'd better be very very careful about thread safety and probably shouldn't have put the item in the queue anyway.
Not an answer per se, but: JDK-6653412 claims this is not a valid use case.
Looks like BlockingQueue itself doesn't have the functionality you're specifying.
I might try to re-frame the problem a little though: what would you do with objects you can't "process correctly"? If you're just leaving them in the queue, you'll have to pull them out at some point and deal with them. I'd recommend either figuring out how to process them (commonly, if a queue.get() gives any sort of invalid or bad value, you're probably OK to just drop it on the floor) or choosing a different data structure than a FIFO.
The 'simplest' solution
Do not process the next element until the previous element is processed successfully.
public void run() {
Object pending = null; // element whose last processing attempt failed, if any
while (!exit) {
Object obj = pending == null ? queue.take() : pending; // take() blocks until an element is available
boolean successful = process(obj);
pending = successful ? null : obj; // retry the same element until it succeeds
}
}
Calling peek() and checking if the value is null is not CPU efficient.
I have seen CPU usage going to 10% on my system when the queue is empty for the following program.
while (true) {
Object o = queue.peek();
if(o == null) continue;
// omitted for the sake of brevity
}
Adding sleep() adds slowness.
Adding it back to the queue using putLast will disturb the order. Moreover, it is a blocking operation which requires locks.
I've got a question about CompletableFuture and its possible usage for lazy computations.
It seems like it is a great substitute for RunnableFuture for this task, since it is possible to easily create task chains and to have total control of each chain link. Still, I found that it is very hard to control exactly when the computation takes place.
If I just create a CompletableFuture with the supplyAsync method or something like that, it is OK. It waits patiently for me to call the get or join method to compute. But if I try to make an actual chain with thenCompose, handle or any other method, the evaluation starts immediately, which is very frustrating.
Of course, I can always place some blocker task at the start of the chain and release the block when I am ready to begin the calculation, but that seems a bit ugly. Does anybody know how to control when a CompletableFuture actually runs?
CompletableFuture is a push-design, i.e. results are pushed down to dependent tasks as soon as they become available. This also means side-chains that are not in themselves consumed still get executed, which can have side-effects.
What you want is a pull-design where ancestors would only be pulled in as their data is consumed.
This would be a fundamentally different design because side-effects of non-consumed trees would never happen.
Of course with enough contortions CF could be made to do what you want, but you should look into the fork-join framework instead which allows you to only run the computations you depend on instead of pushing down results.
There's a conceptual difference between RunnableFuture and CompletableFuture that you're missing here.
RunnableFuture implementations take a task as input and hold onto it. It runs the task when you call the run method.
A CompletableFuture does not hold onto a task. It only knows about the result of a task. It has three states: complete, incomplete, and completed exceptionally (failed).
CompletableFuture.supplyAsync is a factory method that gives you an incomplete CompletableFuture. It also schedules a task which, when it completes, will pass its result to the CompletableFuture's complete method. In other words, the future that supplyAsync hands you doesn't know anything about the task, and can't control when the task runs.
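The scheduling behaviour described here can be observed with a small experiment. This is a sketch only; EagerSupplyAsyncSketch and runsWithoutJoin are made-up names. The supplier counts down a latch, and the caller waits on that latch without ever calling get() or join() on the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class EagerSupplyAsyncSketch {
    // Returns true if the supplier ran although nobody called get() or join().
    static boolean runsWithoutJoin() {
        CountDownLatch ran = new CountDownLatch(1);
        // The task is handed to the default executor right here.
        CompletableFuture.supplyAsync(() -> {
            ran.countDown();
            return "done";
        });
        try {
            return ran.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```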
To use a CompletableFuture in the way you describe, you would need to create a subclass:
public class RunnableCompletableFuture<T> extends CompletableFuture<T> implements RunnableFuture<T> {
private final Callable<T> task;
public RunnableCompletableFuture(Callable<T> task) {
this.task = task;
}
@Override
public void run() {
try {
complete(task.call());
} catch (Exception e) {
completeExceptionally(e);
}
}
}
A simple way of dealing with your problem is wrapping your CompletableFuture in something with a lazy nature. You could use a Supplier or even Java 8 Stream.
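One possible reading of this suggestion, sketched with hypothetical names (LazyChainSketch, lazyGreeting): the whole chain is described inside a Supplier, so nothing is scheduled until the Supplier's get() is called.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyChainSketch {
    // Counts how often the first stage has actually been executed.
    static final AtomicInteger runs = new AtomicInteger();

    // Describes the chain without starting it; supplyAsync is only
    // invoked when the caller asks the Supplier for the future.
    static Supplier<CompletableFuture<String>> lazyGreeting(String name) {
        return () -> CompletableFuture
                .supplyAsync(() -> {
                    runs.incrementAndGet();
                    return "hello";
                })
                .thenApply(s -> s + " " + name);
    }
}
```

Note that each call to get() on the Supplier starts a fresh chain; if the result should be computed only once, the Supplier would have to be memoized, which this sketch does not do.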
It is late, but how about using the constructor for the first CompletableFuture in the chain?
CompletableFuture<Object> cf = new CompletableFuture<>();
// compose the chain
cf.thenCompose(sometask_here);
// later starts the chain with
cf.complete(anInputObject);
I was reading a tutorial:
http://code.tutsplus.com/tutorials/getting-started-with-reactivex-on-android--cms-24387
which concerns RxAndroid in particular, but it's pretty much the same as in RxJava. I am not sure that I understood the concept completely.
Below I have written a method and then a sample usage.
My question is: is this the right way to implement my functions so that I can run them on other threads asynchronously? They will in fact only return a created Observable running the real code, and handling errors and all that stuff.
Or is this wrong, then I'd like to know the correct way.
Observable<String> googleSomething(String text){
return Observable.create(new Observable.OnSubscribe<String>(){
@Override
public void call(Subscriber<? super String> subscriber) {
try {
String data = fetchData(text); // some normal method
subscriber.onNext(data); // Emit the contents of the URL
subscriber.onCompleted(); // Nothing more to emit
} catch(Exception e) {
subscriber.onError(e); // In case there are network errors
}
}
});
}
googleSomething("hello world").subscribeOn(Schedulers.io()).observeOn(Schedulers.immediate()).subscribe(...)
Also is Schedulers.immediate() used in order to execute the subscriber code on the current thread? It says "Creates and returns a Scheduler that executes work immediately on the current thread." in javadoc, but I'm not sure.
Unless you are more experienced and need a custom operator, or want to bridge a legacy addListener/removeListener based API, you should not start with create. There are several questions on StackOverflow where using create was the source of the trouble.
I'd prefer fromCallable, which lets you generate a single value or throw an Exception, so there is no need for those lengthy defer + just sources.
Schedulers.immediate() executes its task immediately on the caller's thread, which is the io() thread in your example, not the main thread. Currently, there is no support for moving the computation back to the Java main thread, as it requires blocking trampolining and is usually a bad idea anyway.
You should almost never use create(), especially not as a beginner. There are easier ways to create observables, and create() is difficult to implement correctly.
Most of the time, you can easily get around create() by using defer(). E.g., in this case you'd do:
Observable<String> googleSomething(String text) {
return Observable.defer(new Func0<Observable<String>>() {
@Override
public Observable<String> call() {
try {
return Observable.just(fetchData(text));
} catch (IOException e) {
return Observable.error(e);
}
}
});
}
If you're not using a checked exception, then you could even get rid of the try-catch. RxJava will automatically forward any RuntimeException to the onError() part of the subscriber.
You can create Observable via Observable.create(new OnSubscribe {}) method however:
Look at defer() operator, which allows you to return for example Observable.just() and Observable.error() so you don't need to touch subscriber directly
Prefer using SyncOnSubscribe/AsyncOnSubscribe to handle backpressure
Schedulers.immediate() will keep Observable processing on the thread it already is - so in your case it will be one of the Schedulers.io threads
Your code looks good to me. If you are unsure whether it is running on another thread or not, you could print something immediately after you call .subscribe() and see the order of the outputs.
googleSomething("hello world").subscribeOn(Schedulers.io()).observeOn(Schedulers.immediate()).subscribe(...);
System.out.println("This should be printed first");
Try to simulate a long-running operation inside fetchData() and print something else immediately afterwards. As .subscribe() is non-blocking, "This should be printed first" is, in fact, going to be printed first.
Alternatively, you can print the current thread using:
Thread.currentThread().getName()
Use this inside and outside your observable and the outputs should differ.
I'm implementing the Future<Collection<Integer>> interface in order to share the result of some bulk computation among all threads in the application.
In fact, I intended to just put an instance of a class implementing Future<Collection<Integer>> into an ApplicationScope object so that any other thread which needs the result just asks for the Future from the object and calls the method get() on it, therefore using the computation performed by another thread.
My question is about implementing the cancel method. For now, I would write something like that:
public class CustomerFutureImpl implements Future<Collection<Integer>>{
private Thread computationThread;
private boolean started;
private boolean cancelled;
private Collection<Integer> computationResult;
public boolean cancel(boolean mayInterruptIfRunning){ // must be public to implement Future.cancel
if( computationResult != null )
return false;
if( !started ){
cancelled = true;
return true;
} else {
if(mayInterruptIfRunning)
computationThread.interrupt();
}
}
//The rest of the methods
}
But the method implementation doesn't satisfy the documentation of Future, because we need to throw CancellationException in any thread waiting for the result (that is, any thread that has called the get() method).
Should I add another field like private Collection<Thread> waitingForTheResultThreads; and then interrupt each thread from the Collection, catch the InterruptedException and then throw new CancellationException()?
The thing is that such a solution seems kind of weird to me... Not sure about that.
Generally you should avoid implementing Future directly at all. Concurrency code is very hard to get right, and frameworks for distributed execution - notably ExecutorService - will provide Future instances referencing the units of work you care about.
You may know that already and are intentionally creating a new similar service, but I feel it's important to call out that for the vast majority of use cases, you should not need to define your own Future implementation.
You might want to look at the concurrency tools Guava provides, in particular ListenableFuture, which is a sub-interface of Future that provides additional features.
Assuming that you really do want to define a custom Future type, use Guava's AbstractFuture implementation as a starting point, so that you don't have to reinvent the complex details you're running into.
To your specific question, if you look at the implementation of AbstractFuture.get(), you'll see that it's implemented with a while loop that looks for value to become non-null, at which time it calls getDoneValue() which either returns the value or raises a CancellationException. So essentially, each thread that is blocking on a call to Future.get() is polling the Future.value field every so often and raising a CancellationException if it detects that the Future has been cancelled. There's no need to keep track of a Collection<Thread> or anything of the sort, since each thread can inspect the state of the Future independently, and return or throw as needed.
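To illustrate the point that an ExecutorService already hands out Future instances with this behaviour, here is a small sketch (SharedFutureSketch and its method are hypothetical names). One thread waits in get() on a shared Future while the bulk computation is cancelled; the waiter sees a CancellationException without any bookkeeping of waiting threads.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class SharedFutureSketch {
    // Returns what a waiting thread observes when the shared Future is cancelled.
    static String waiterOutcomeAfterCancel() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch never = new CountDownLatch(1); // never counted down
        try {
            // Stand-in for the bulk computation: blocks until it is interrupted.
            Callable<Collection<Integer>> computation = () -> {
                never.await();
                return Arrays.asList(1, 2, 3);
            };
            Future<Collection<Integer>> shared = pool.submit(computation);

            // Another thread asks the same Future for the result.
            Callable<String> waiterTask = () -> {
                try {
                    shared.get();
                    return "completed";
                } catch (CancellationException e) {
                    return "CancellationException";
                }
            };
            Future<String> waiter = pool.submit(waiterTask);

            shared.cancel(true); // interrupts the computation and releases all waiters
            return waiter.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        } finally {
            pool.shutdownNow();
        }
    }
}
```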
I am playing with Java 8 completable futures. I have the following code:
CountDownLatch waitLatch = new CountDownLatch(1);
CompletableFuture<?> future = CompletableFuture.runAsync(() -> {
try {
System.out.println("Wait");
waitLatch.await(); //cancel should interrupt
System.out.println("Done");
} catch (InterruptedException e) {
System.out.println("Interrupted");
throw new RuntimeException(e);
}
});
sleep(10); //give it some time to start (ugly, but works)
future.cancel(true);
System.out.println("Cancel called");
assertTrue(future.isCancelled());
assertTrue(future.isDone());
sleep(100); //give it some time to finish
Using runAsync I schedule execution of a code that waits on a latch. Next I cancel the future, expecting an interrupted exception to be thrown inside. But it seems that the thread remains blocked on the await call and the InterruptedException is never thrown even though the future is canceled (assertions pass). An equivalent code using ExecutorService works as expected. Is it a bug in the CompletableFuture or in my example?
When you call CompletableFuture#cancel, you only stop the downstream part of the chain. The upstream part, i.e. something that will eventually call complete(...) or completeExceptionally(...), doesn't get any signal that the result is no longer needed.
What are those 'upstream' and 'downstream' things?
Let's consider the following code:
CompletableFuture
.supplyAsync(() -> "hello") //1
.thenApply(s -> s + " world!") //2
.thenAccept(s -> System.out.println(s)); //3
Here, the data flows from top to bottom: from being created by the supplier, through being modified by the function, to being consumed by println. The part above a particular step is called upstream, and the part below is downstream. E.g. steps 1 and 2 are upstream for step 3.
Here's what happens behind the scenes. This is not precise; rather, it's a convenient mental model of what's going on.
Supplier (step 1) is being executed (inside the JVM's common ForkJoinPool).
The result of the supplier is then being passed by complete(...) to the next CompletableFuture downstream.
Upon receiving the result, that CompletableFuture invokes next step - a function (step 2) which takes in previous step result and returns something that will be passed further, to the downstream CompletableFuture's complete(...).
Upon receiving the step 2 result, the step 3 CompletableFuture invokes the consumer, System.out.println(s). After the consumer has finished, the downstream CompletableFuture will receive its value, (Void) null.
As we can see, each CompletableFuture in this chain has to know who is downstream, waiting for the value to be passed to their complete(...) (or completeExceptionally(...)). But a CompletableFuture doesn't have to know anything about its upstream (or upstreams, as there might be several).
Thus, calling cancel() upon step 3 doesn't abort steps 1 and 2, because there's no link from step 3 to step 2.
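The effect can be checked with a two-step chain. In this sketch (class and method names are made up for the example) step 1 is held back by a latch until step 2 has been cancelled, and it still runs to completion afterwards.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class DownstreamCancelSketch {
    // Returns true if step 1 still ran to completion after step 2 was cancelled.
    static boolean upstreamSurvivesCancel() {
        CountDownLatch gate = new CountDownLatch(1);
        AtomicBoolean upstreamRan = new AtomicBoolean(false);

        CompletableFuture<String> step1 = CompletableFuture.supplyAsync(() -> {
            try {
                gate.await(); // hold step 1 until step 2 has been cancelled
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            upstreamRan.set(true);
            return "hello";
        });
        CompletableFuture<String> step2 = step1.thenApply(s -> s + " world!");

        step2.cancel(true); // completes only step 2, with a CancellationException
        gate.countDown();   // now let step 1 proceed

        String upstreamValue = step1.join(); // step 1 completes normally
        return step2.isCancelled() && upstreamRan.get() && "hello".equals(upstreamValue);
    }
}
```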
The assumption is that if you're using CompletableFuture, your steps are small enough that there's no harm if a couple of extra steps get executed.
If you want cancellation to be propagated upstream, you have two options:
Implement this yourself: create a dedicated CompletableFuture (name it something like cancelled) which is checked after every step (something like step.applyToEither(cancelled, Function.identity()))
Use a reactive stack like RxJava 2, ProjectReactor/Flux or Akka Streams
Apparently, it's intentional. The Javadoc for the method CompletableFuture::cancel states:
[Parameters:] mayInterruptIfRunning - this value has no effect in this implementation because interrupts are not used to control processing.
Interestingly, the method ForkJoinTask::cancel uses almost the same wording for the parameter mayInterruptIfRunning.
I have a guess on this issue:
interruption is intended to be used with blocking operations, like sleep, wait or I/O operations,
but neither CompletableFuture nor ForkJoinTask are intended to be used with blocking operations.
Instead of blocking, a CompletableFuture should create a new CompletionStage, and cpu-bound tasks are a prerequisite for the fork-join model. So, using interruption with either of them would defeat their purpose. And on the other hand, it might increase complexity, that's not required if used as intended.
If you actually want to be able to cancel a task, then you have to use Future itself (e.g. as returned by ExecutorService.submit(Callable<T>)), not CompletableFuture. As pointed out in the answer by nosid, CompletableFuture completely ignores the mayInterruptIfRunning argument passed to cancel(true).
My suspicion is that the JDK team did not implement interruption because:
Interruption was always hacky, difficult for people to understand, and difficult to work with. The Java I/O system is not even interruptible, despite calls to InputStream.read() being blocking calls! (And the JDK team have no plans to make the standard I/O system interruptible again, like it was in the very early Java days.)
The JDK team have been trying very hard to phase out old broken APIs from the early Java days, such as Object.finalize(), Object.wait(), Thread.stop(), etc. I believe Thread.interrupt() is considered to be in the category of things that must be eventually deprecated and replaced. Therefore, newer APIs (like ForkJoinPool and CompletableFuture) are already not supporting it.
CompletableFuture was designed for building DAG-structured pipelines of operations, similar to the Java Stream API. It's very difficult to succinctly describe how interruption of one node of a dataflow DAG should affect execution in the rest of the DAG. (Should all concurrent tasks be canceled immediately, when any node is interrupted?)
I suspect the JDK team just didn't want to deal with getting interruption right, given the levels of internal complexity that the JDK and libraries have reached these days. (The internals of the lambda system -- ugh.)
One very hacky way around this would be to have each CompletableFuture task export a reference to the thread it is running on to an externally-visible AtomicReference, so that the Thread could be interrupted directly when needed from another external thread. Or if you start all the tasks using your own ExecutorService, in your own ThreadPool, you can manually interrupt any or all the threads that were started, even if CompletableFuture refuses to trigger interruption via cancel(true). (Note though that CompletableFuture lambdas cannot throw checked exceptions, so if you have an interruptible wait in a CompletableFuture, you'll have to re-throw as an unchecked exception.)
More simply, you could just declare an AtomicReference<Boolean> cancel = new AtomicReference<>() in an external scope, and periodically check this flag from inside each CompletableFuture task's lambda.
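A minimal sketch of such a flag is shown below. It swaps in an AtomicBoolean for the AtomicReference<Boolean> mentioned above, which avoids dealing with an initial null value; the names CancelFlagSketch and runUntilCancelled are illustrative only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelFlagSketch {
    // Runs a looping task until the external flag is set.
    // Returns the number of loop iterations, or -1 if the caller was interrupted.
    static long runUntilCancelled() {
        AtomicBoolean cancel = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        CompletableFuture<Long> work = CompletableFuture.supplyAsync(() -> {
            long steps = 0;
            started.countDown();
            while (!cancel.get()) { // checkpoint: leave the loop once cancellation is requested
                steps++;
                Thread.yield();
            }
            return steps;
        });

        try {
            started.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            cancel.set(true);
            return -1;
        }
        cancel.set(true);   // request cancellation from the outside
        return work.join(); // the task notices the flag and finishes normally
    }
}
```

This only works for tasks that reach a checkpoint regularly; a task stuck in a blocking call will not see the flag.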
You could also try setting up a DAG of Future instances rather than a DAG of CompletableFuture instances, that way you can exactly specify how exceptions and interruption/cancellation in any one task should affect the other currently-running tasks. I show how to do this in my example code in my question here, and it works well, but it's a lot of boilerplate.
You need an alternative implementation of CompletionStage to accomplish true thread interruption. I've just released a small library that serves exactly this purpose - https://github.com/vsilaev/tascalate-concurrent
The call to await() will still block even if Future.cancel(..) is called. As mentioned by others, CompletableFuture will not use interrupts to cancel the task.
According to the javadoc of CompletableFuture.cancel(..):
mayInterruptIfRunning this value has no effect in this implementation because interrupts are not used to control processing.
Even if the implementation would cause an interrupt, you would still need a blocking operation in order to cancel the task or check the status via Thread.interrupted().
Instead of interrupting the Thread, which might not always be easy to do, you may have checkpoints in your operation where you can gracefully terminate the current task. This can be done in a loop over some elements that will be processed, or you can check the cancel status before each step of the operation and throw a CancellationException yourself.
The tricky part is to get a reference of the CompletableFuture within the task in order to call Future.isCancelled(). Here is an example of how it can be done:
public abstract class CancelableTask<T> {
private CompletableFuture<T> task;
private T run() {
try {
return compute();
} catch (Throwable e) {
task.completeExceptionally(e);
}
return null;
}
protected abstract T compute() throws Exception;
protected boolean isCancelled() {
Future<T> future = task;
return future != null && future.isCancelled();
}
public Future<T> start() {
synchronized (this) {
if (task != null) throw new IllegalStateException("Task already started.");
task = new CompletableFuture<>();
}
return task.completeAsync(this::run);
}
}
Edit: Here is the improved CancelableTask version as a static factory:
public static <T> CompletableFuture<T> supplyAsync(Function<Future<T>, T> operation) {
CompletableFuture<T> future = new CompletableFuture<>();
return future.completeAsync(() -> operation.apply(future));
}
Here is the test method:
@Test
void testFuture() throws InterruptedException {
CountDownLatch started = new CountDownLatch(1);
CountDownLatch done = new CountDownLatch(1);
AtomicInteger counter = new AtomicInteger();
Future<Object> future = supplyAsync(task -> {
started.countDown();
while (!task.isCancelled()) {
System.out.println("Count: " + counter.getAndIncrement());
}
System.out.println("Task cancelled");
done.countDown();
return null;
});
// wait until the task is started
assertTrue(started.await(5, TimeUnit.SECONDS));
future.cancel(true);
System.out.println("Cancel called");
assertTrue(future.isCancelled());
assertTrue(future.isDone());
assertTrue(done.await(5, TimeUnit.SECONDS));
}
If you really want to use interrupts in addition to the CompletableFuture, then you can pass a custom Executor to CompletableFuture.completeAsync(..) where you create your own Thread, override cancel(..) in the CompletableFuture and interrupt your Thread.
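One way this could be put together is sketched below. It is an assumption-laden variation on the idea: instead of creating a Thread by hand, it keeps the Future returned by ExecutorService.submit and cancels that from an overridden cancel(..), which interrupts the worker thread. InterruptibleFuture and its methods are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class InterruptibleFuture<T> extends CompletableFuture<T> {
    // Handle to the task running on the executor; used to interrupt it.
    private volatile Future<?> worker;

    static <U> InterruptibleFuture<U> submit(Callable<U> task, ExecutorService executor) {
        InterruptibleFuture<U> future = new InterruptibleFuture<>();
        future.worker = executor.submit(() -> {
            try {
                future.complete(task.call());
            } catch (Throwable t) {
                future.completeExceptionally(t); // no effect if already cancelled
            }
        });
        return future;
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        boolean cancelled = super.cancel(mayInterruptIfRunning);
        Future<?> w = worker;
        if (w != null) {
            w.cancel(mayInterruptIfRunning); // this is what interrupts the worker thread
        }
        return cancelled;
    }

    // Demonstration: a task blocked on a latch is interrupted by cancel(true).
    static boolean demo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);
        CountDownLatch never = new CountDownLatch(1); // never counted down
        try {
            InterruptibleFuture<String> future = submit(() -> {
                started.countDown();
                try {
                    never.await();
                    return "done";
                } catch (InterruptedException e) {
                    interrupted.countDown();
                    throw e;
                }
            }, executor);
            started.await();
            future.cancel(true);
            return future.isCancelled() && interrupted.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Dependent stages created from this future (for example via thenApply) are ordinary CompletableFuture instances, so cancelling them does not interrupt anything.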
The CancellationException is part of the internal ForkJoin cancel routine. The exception will come out when you retrieve the result of future:
try { future.get(); }
catch (Exception e){
System.out.println(e.toString());
}
Took a while to see this in a debugger. The JavaDoc is not that clear on what is happening or what you should expect.