When experimenting with CompletableFuture, I was wondering if the given code is safe.
CompletableFuture<Integer> foo = CompletableFuture.supplyAsync(() -> 42);
foo.thenApply((bar) -> {
System.out.println("bar " + bar);
return bar;
})
.acceptEither(foo.thenApply((baz) -> {
System.out.println("baz " + baz);
return baz;
}),
(z) -> System.out.println("finished processing of " + z));
It works, printing
bar 42
baz 42
finished processing of 42
Is it safe/a good idea to call thenApply or other methods more than once on a given instance of CompletableFuture?
What you describe is not a “reuse” of a CompleteableFuture, as it still performs only one action and completes at most once.
You are just registering multiple dependent stages, which is completely within the foreseen usage. At no point, the documentation suggested that dependent stages had to form a single linear chain. The use case is already described by the interface CompletionStage<T>:
A stage of a possibly asynchronous computation, that performs an action or computes a value when another CompletionStage completes. A stage completes upon termination of its computation, but this may in turn trigger other dependent stages.
Note the use of “other dependent stages” rather then “another dependent stage”. The entire documentation consistently uses plural when it comes to dependent stages. This also applies to the documentation of the CompletableFuture<T> implementation class.
After all, when you have two actions a and b which have no dependency to each other but a dependency to c, the only thing that makes sense, is to chain both to c rather than create a chain like c → a → b or c → b → a as the latter would inhibit the concurrent execution of the independent actions a and b, which is the entire point of the concurrency API.
Note that when modelling the dependency this way, there is no guaranty of the outcome you have shown. The two printing actions “bar” and “baz” have no dependency on each other, but only on the supplier of the 42 value, and the action which prints “finished processing” is scheduled for the completion of either of them, not both. So besides the indeterminate order of the “bar” and “baz” output, you could also see a log like “bar, finished processing, baz” or “baz, finished processing, bar”.
Related
Using non-blocking calls I want to generically take a Mono and call a method that returns a Flux and for each item in the Flux, call a method that returns Mono to return a Flux which is a an aggregate object of Bar + Foo + Bar and has as many elements as the Flux method returns (will return).
As a concrete example:
Methods:
Flux<Bar> getBarsByFoo(Foo foo);
Mono<More> getMoreByBar(Bar bar);
Combined getCombinedFrom(Bar bar, Foo foo, More more);
Working code section:
Flux<Combined> getCombinedByFoo(Foo foo) {
getBarsByFoo(foo)...
}
From a blocking perspective what I want to accomplish is:
List<Combined> getCombinedByFoo(Foo foo) {
List<Bar> bars = getBarsByFoo(foo):
List<Combined> combinedList = new ArrayList<>(bars.size());
for (Bar bar: bars) {
More more = getMoreByBar(bar);
combinedList.append(getCombinedFrom(bar, foo, more));
}
return combinedList;
}
Any help on which Flux and Mono methods to use would be appreciated. I am still learning to change my brain into non-blocking thinking. Conceptually, I think there is a function to apply to each element (Bar) in from getBarsByFoo(Foo foo) to somehow map that to the combined element...
I like to think about Reactor programming as a flow of operations (as in flow programming), as a chain/DAG of operation.
In your case, you want to:
map each emitted Bar object to a Combined object.
Along the way, you need to use/call another publisher to fetch additional information:
you need to wait for it to complete so you can fetch its output value. In the case of Monads/streams, there's a flatMap operation for it.
flatMap waits for (or you can say that it extracts) a different publisher value to integrate it in the current chain of operations. I think it is called flatMap because in a sense, we break a level of hierarchy to flatten two nested publishers/monads in a single merged one.
The following example show a reactive version of your method (for a less verbose version, see Toerktumlare answer:
Flux<Combined> combine(Foo foo) {
Flux<Bar> bars = getBarBy(foo);
Flux<Combined> result = bars.flatMap(bar -> {
Mono<More> nextMore = getMoreBy(bar);
Mono<Combined> next = nextMore.map(more -> getCombinedFrom(foo, bar, more));
return next;
);
return result;
}
If you get your foo object through a Mono, you can just call flatMapMany on it:
Mono<Foo> nextFoo = ...;
Flux<Combined> = nextFoo.flatMapMany(foo -> combine(foo));
WARNING
flatMap is very powerful: it can trigger concurrent execution of the provided operation. In your case, it means that many getMoreBy(bar) operations can be launched at the same time. But it is a double-edged sword, because then it means that:
ordering of elements is not preserved (or at least, there's no guarantee)
In resource constrained system, having multiple operations launched at the same time could hurt performance or cause harm to the system (like, too many files open, etc.)
The concurrency behavior is quite high by default (256) and can be controlled in different ways:
flatMap accepts an optional concurrency argument, to adapt the number of tasks allowed to run at the same time.
There are other operators that flatten publishers, but manage work differently, like concatMap: it enforces sequential execution (and therefore, preserve ordering) of mapping tasks.
Something like this:
Flux<Combined> getCombinedByFoo(Foo foo) {
return getBarsByFoo(foo)
.flatMap(bar -> getMoreByBar(bar)
.flatMap(more -> getCombinedFrom(bar, foo, more)))
}
I dont have an idea to check, i wrote this by free hand but something like this i guess.
This question is different from this one Difference between Java8 thenCompose and thenComposeAsync because I want to know what is the writer's reason for using thenCompose and not thenComposeAsync.
I was reading Modern Java in action and I came across this part of code on page 405:
public static List<String> findPrices(String product) {
ExecutorService executor = Executors.newFixedThreadPool(10);
List<Shop> shops = Arrays.asList(new Shop(), new Shop());
List<CompletableFuture<String>> priceFutures = shops.stream()
.map(shop -> CompletableFuture.supplyAsync(() -> shop.getPrice(product), executor))
.map(future -> future.thenApply(Quote::parse))
.map(future -> future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor)))
.collect(toList());
return priceFutures.stream()
.map(CompletableFuture::join).collect(toList());
}
Everything is Ok and I can understand this code but here is the writer's reason for why he didn't use thenComposeAsync on page 408 which I can't understand:
In general, a method without the Async suffix in its name executes
its task in the same threads the previous task, whereas a method
terminating with Async always submits the succeeding task to the
thread pool, so each of the tasks can be handled by a
different thread. In this case, the result of the second
CompletableFuture depends on the first,so it makes no difference to
the final result or to its broad-brush timing whether you compose the
two CompletableFutures with one or the other variant of this method
In my understanding with the thenCompose( and thenComposeAsync) signatures as below:
public <U> CompletableFuture<U> thenCompose(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(null, fn);
}
public <U> CompletableFuture<U> thenComposeAsync(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(asyncPool, fn);
}
The result of the second CompletableFuture can depends on the previous CompletableFuture in many situations(or rather I can say almost always), should we use thenCompose and not thenComposeAsync in those cases?
What if we have blocking code in the second CompletableFuture?
This is a similar example which was given by person who answered similar question here: Difference between Java8 thenCompose and thenComposeAsync
public CompletableFuture<String> requestData(Quote quote) {
Request request = blockingRequestForQuote(quote);
return CompletableFuture.supplyAsync(() -> sendRequest(request));
}
To my mind in this situation using thenComposeAsync can make our program faster because here blockingRequestForQuote can be run on different thread. But based on the writer's opinion we should not use thenComposeAsync because it depends on the first CompletableFuture result(that is Quote).
My question is:
Is the writer's idea correct when he said :
In this case, the result of the second
CompletableFuture depends on the first,so it makes no difference to
the final result or to its broad-brush timing whether you compose the
two CompletableFutures with one or the other variant of this method
TL;DR It is correct to use thenCompose instead of thenComposeAsync here, but not for the cited reasons. Generally, the code example should not be used as a template for your own code.
This chapter is a recurring topic on Stackoverflow for reasons we can best describe as “insufficient quality”, to stay polite.
In general, a method without the Async suffix in its name executes its task in the same threads the previous task, …
There is no such guaranty about the executing thread in the specification. The documentation says:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
So there’s also the possibility that the task is performed “by any other caller of a completion method”. An intuitive example is
CompletableFuture<X> f = CompletableFuture.supplyAsync(() -> foo())
.thenApply(f -> f.bar());
There are two threads involved. One that invokes supplyAsync and thenApply and the other which will invoke foo(). If the second completes the invocation of foo() before the first thread enters the execution of thenApply, it is possible that the future is already completed.
A future does not remember which thread completed it. Neither does it have some magic ability to tell that thread to perform an action despite it might be busy with something else or even have terminated since then. So it should be obvious that calling thenApply on an already completed future can’t promise to use the thread that completed it. In most cases, it will perform the action immediately in the thread that calls thenApply. This is covered by the specification’s wording “any other caller of a completion method”.
But that’s not the end of the story. As this answer explains, when there are more than two threads involved, the action can also get performed by another thread calling an unrelated completion method on the future at the same time. This may happen rarely, but it’s possible in the reference implementation and permitted by the specification.
We can summarize it as: Methods without Async provides the least control over the thread that will perform the action and may even perform it right in the calling thread, leading to synchronous behavior.
So they are best when the executing thread doesn’t matter and you’re not hoping for background thread execution, i.e. for short, non-blocking operations.
whereas a method terminating with Async always submits the succeeding task to the thread pool, so each of the tasks can be handled by a different thread. In this case, the result of the second CompletableFuture depends on the first, …
When you do
future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor))
there are three futures involved, so it’s not exactly clear, which future is meant by “second”. supplyAsync is submitting an action and returning a future. The submission is contained in a function passed to thenCompose, which will return another future.
If you used thenComposeAsync here, you only mandated that the execution of supplyAsync has to be submitted to the thread pool, instead of performing it directly in the completing thread or “any other caller of a completion method”, e.g. directly in the thread calling thenCompose.
The reasoning about dependencies makes no sense here. “then” always implies a dependency. If you use thenComposeAsync here, you enforced the submission of the action to the thread pool, but this submission still won’t happen before the completion of future. And if future completed exceptionally, the submission won’t happen at all.
So, is using thenCompose reasonable here? Yes it is, but not for the reasons given is the quote. As said, using the non-async method implies giving up control over the executing thread and should only be used when the thread doesn’t matter, most notably for short, non-blocking actions. Calling supplyAsync is a cheap action that will submit the actual action to the thread pool on its own, so it’s ok to perform it in whatever thread is free to do it.
However, it’s an unnecessary complication. You can achieve the same using
future.thenApplyAsync(quote -> Discount.applyDiscount(quote), executor)
which will do exactly the same, submit applyDiscount to executor when future has been completed and produce a new future representing the result. Using a combination of thenCompose and supplyAsync is unnecessary here.
Note that this very example has been discussed in this Q&A already, which also addresses the unnecessary segregation of the future operations over multiple Stream operations as well as the wrong sequence diagram.
What a polite answer from Holger! I am really impressed he could provide such a great explanation and at the same time staying in bounds of not calling the author plain wrong. I want to provide my 0.02$ here too, a little, after reading the same book and having to scratch my head twice.
First of all, there is no "remembering" of which thread executed which stage, neither does the specification make such a statement (as already answered above). The interesting part is even in the cited above documentation:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Even that ...completes the current CompletableFuture part is tricky. What if there are two threads that try to call complete on a CompletableFuture, which thread will run all the dependent actions? The one that has actually completed it? Or any other? I wrote a jcstress test that is very non-intuitive when looking at the results:
#JCStressTest
#State
#Outcome(id = "1, 0", expect = Expect.ACCEPTABLE, desc = "executed in completion thread")
#Outcome(id = "0, 1", expect = Expect.ACCEPTABLE, desc = "executed in the other thread")
#Outcome(id = "0, 0", expect = Expect.FORBIDDEN)
#Outcome(id = "1, 1", expect = Expect.FORBIDDEN)
public class CompletableFutureWhichThread1 {
private final CompletableFuture<String> future = new CompletableFuture<>();
public CompletableFutureWhichThread1() {
future.thenApply(x -> action(Thread.currentThread().getName()));
}
volatile int x = -1; // different default to not mess with the expected result
volatile int y = -1; // different default to not mess with the expected result
volatile int actor1 = 0;
volatile int actor2 = 0;
private String action(String threadName) {
System.out.println(Thread.currentThread().getName());
// same thread that completed future, executed action
if ("actor1".equals(threadName) && actor1 == 1) {
x = 1;
return "action";
}
// same thread that completed future, executed action
if ("actor2".equals(threadName) && actor2 == 1) {
x = 1;
return "action";
}
y = 1;
return "action";
}
#Actor
public void actor1() {
Thread.currentThread().setName("actor1");
boolean completed = future.complete("done-actor1");
if (completed) {
actor1 = 1;
} else {
actor2 = 1;
}
}
#Actor
public void actor2() {
Thread.currentThread().setName("actor2");
boolean completed = future.complete("done-actor2");
if (completed) {
actor2 = 1;
}
}
#Arbiter
public void arbiter(II_Result result) {
if (x == 1) {
result.r1 = 1;
}
if (y == 1) {
result.r2 = 1;
}
}
}
After running this, both 0, 1 and 1, 0 are seen. You do not need to understand very much about the test itself, but it proves a rather interesting point.
You have a CompletableFuture future that has a future.thenApply(x -> action(...)); attached to it. There are two threads (actor1 and actor2) that both, at the same time, compete with each other into completing it (the specification says that only one will be successful). The results show that if actor1 called complete, but does not actually complete the CompletableFuture (actor2 did), it can still do the actual work in action. In other words, a thread that completed a CompletableFuture is not necessarily the thread that executes the dependent actions (those thenApply for example). This was rather interesting for me to find out, though it makes sense.
Your reasonings about speed are a bit off. When you dispatch your work to a different thread, you usually pay a penalty for that. thenCompose vs thenComposeAsync is about being able to predict where exactly is your work going to happen. As you have seen above you can not do that, unless you use the ...Async methods that take a thread pool. Your natural question should be : "Why do I care where it is executed?".
There is an internal class in jdk's HttpClient called SelectorManager. It has (from a high level) a rather simple task: it reads from a socket and gives "responses" back to the threads that wait for a http result. In essence, this is a thread that wakes up all interested parties that wait for some http packets. Now imagine that this particular thread does internally thenCompose. Now also imagine that your chain of calls looks like this:
httpClient.sendAsync(() -> ...)
.thenApply(x -> foo())
where foo is a method that never finishes (or takes a lot of time to finish). Since you have no idea in which thread the actual execution is going to happen, it can, very well, happen in SelectorManager thread. Which would be a disaster. Everyone other http calls would stale, because this thread is busy now. Thus thenComposeAsync: let the configured pool do the work/waiting if needed, while the SelectorManager thread is free to do its work.
So the reasons that the author gives are plain wrong.
I have two completionStages method calls that each call a remote service if a condition is not met. They are both pretty long running processes and we need to decrease latency. I also do not care for the secondFuture's response. It could return CompletionStage<Void> as I only care if the method runs before we exit the main method. An added complexity is that injectedClass2.serviceCall also throws a really important exception (404 StatusRuntimeException) that needs to be surfaced to the client.
How do I ensure that the first and second future run asynchronously (not dependant on each other) meanwhile the second future surfaces its error codes and exceptions for the client.
Main Method below is my best attempt at this. It works, but I am looking to learn a better implementation that takes advantage of completables/streams,etc.
try {
.
.
.
CompletionStage<Response> firstFuture;
CompletionStage<Response> secondFuture = CompletableFuture.completedFuture(Response.default());
if (condition) {
firstFuture = legacyImplThing.resolve(param1, param2);
} else {
firstFuture =
injectedClass1.longRunningOp(param1, param2);
secondFuture = injectedClass2.serviceCall(param1, param2, someOtherData);
}
final CompletionStage<MainMethodResponse> response =
CompletableFutures.combine(firstFuture, secondFuture, (a, b) -> a)
.thenApply(
v -> ServiceResponse.newBuilder().setParam(v.toString()).build());
handleResponse(response, responseObserver);
} catch (Exception e) {
responseObserver.onError(e);
}
Maybe out of scope, how would one test/check that two completionStages were run concurrently?
EDIT: CompletableFutures.combine() is a third-party library method and not part of the java.util.concurrent package.
Chaining other stages does not alter the previous stages. In other words, the parallelism is entirely outside your control, as it has been determined already.
More specifically, when you invoke injectedClass1.longRunningOp(param1, param2), the implementation of the method longRunningOp decides, how the returned future will be completed. Likewise, when you call injectedClass2.serviceCall(param1, param2, someOtherData), the implementation of serviceCall will determine the returned future’s completion. Both methods could use the same executor behind the scenes or entirely different approaches.
The only scenario where you can influence the parallelism, is that both methods perform the actual operation in the caller’s thread, to eventually return an already completed future. In this case, you would have to wrap each call into another asynchronous operation to let them run in parallel. But it would be a strange design to return a future when performing a lengthy operation in the caller’s thread.
Your code
CompletableFutures.combine(firstFuture, secondFuture, (a, b) -> a)
does not match the documented API. A valid call would be
firstFuture.thenCombine(secondFuture, (a, b) -> a)
In this case, you are not influencing the completions of firstFuture or secondFuture. You are only specifying what should happen after both futures have been completed.
There is, by the way, no reason to specify a trivial function like (a, b) -> a in thenCombine, just to chain another thenApply. You can use
firstFuture.thenCombine(secondFuture,
(v, b) -> ServiceResponse.newBuilder().setParam(v.toString()).build())
in the first place.
I have a question about CompletableFuture method:
public <U> CompletableFuture<U> thenApply(Function<? super T, ? extends U> fn)
The thing is the JavaDoc says just this:
Returns a new CompletionStage that, when this stage completes
normally, is executed with this stage's result as the argument to the
supplied function. See the CompletionStage documentation for rules
covering exceptional completion.
What about threading? In which thread is this going to be executed? What if the future is completed by a thread pool?
As #nullpointer points out, the documentation tells you what you need to know. However, the relevant text is surprisingly vague, and some of the comments (and answers) posted here seem to rely on assumptions that aren't supported by the documentation. Thus, I think it's worthwhile to pick it apart. Specifically, we should read this paragraph very carefully:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Sounds straightforward enough, but it's light on details. It seemingly deliberately avoids describing when a dependent completion may be invoked on the completing thread versus during a call to a completion method like thenApply. As written, the paragraph above is practically begging us to fill in the gaps with assumptions. That's dangerous, especially when the topic concerns concurrent and asynchronous programming, where many of the expectations we've developed as programmers get turned on their head. Let's take a careful look at what the documentation doesn't say.
The documentation does not claim that dependent completions registered before a call to complete() will run on the completing thread. Moreover, while it states that a dependent completion might be invoked when calling a completion method like thenApply, it does not state that a completion will be invoked on the thread that registers it (note the words "any other").
These are potentially important points for anyone using CompletableFuture to schedule and compose tasks. Consider this sequence of events:
Thread A registers a dependent completion via f.thenApply(c1).
Some time later, Thread B calls f.complete().
Around the same time, Thread C registers another dependent completion via f.thenApply(c2).
Conceptually, complete() does two things: it publishes the result of the future, and then it attempts to invoke dependent completions. Now, what happens if Thread C runs after the result value is posted, but before Thread B gets around to invoking c1? Depending on the implementation, Thread C may see that f has completed, and it may then invoke c1 and c2. Alternatively, Thread C may invoke c2 while leaving Thread B to invoke c1. The documentation does not rule out either possibility. With that in mind, here are assumptions that are not supported by the documentation:
That a dependent completion c registered on f prior to completion will be invoked during the call to f.complete();
That c will have run to completion by the time f.complete() returns;
That dependent completions will be invoked in any particular order (e.g., order of registration);
That dependent completions registered before f completes will be invoked before completions registered after f completes.
Consider another example:
Thread A calls f.complete();
Some time later, Thread B registers a completion via f.thenApply(c1);
Around the same time, Thread C registers a separate completion via f.thenApply(c2).
If it is known that f has already run to completion, one might be tempted to assume that c1 will be invoked during f.thenApply(c1) and that c2 will be invoked during f.thenApply(c2). One might further assume that c1 will have run to completion by the time f.thenApply(c1) returns. However, the documentation does not support these assumptions. It may be possible that one of the threads calling thenApply ends up invoking both c1 and c2, while the other thread invokes neither.
A careful analysis of the JDK code could determine how the hypothetical scenarios above might play out. But even that is risky, because you may end up relying on an implementation detail that is (1) not portable, or (2) subject to change. Your best bet is not to assume anything that's not spelled out in the javadocs or the original JSR spec.
tldr: Be careful what you assume, and when you write documentation, be as clear and deliberate as possible. While brevity is a wonderful thing, be wary of the human tendency to fill in the gaps.
The policies as specified in the CompletableFuture docs could help you understand better:
Actions supplied for dependent completions of non-async methods may be
performed by the thread that completes the current CompletableFuture,
or by any other caller of a completion method.
All async methods without an explicit Executor argument are performed
using the ForkJoinPool.commonPool() (unless it does not support a
parallelism level of at least two, in which case, a new Thread is
created to run each task). To simplify monitoring, debugging, and
tracking, all generated asynchronous tasks are instances of the marker
interface CompletableFuture.AsynchronousCompletionTask.
Update: I would also advice on reading this answer by #Mike as an interesting analysis further into the details of the documentation.
From the Javadoc:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
More concretely:
fn will run during the call to complete() in the context of whichever thread has called complete().
If complete() has already finished by the time thenApply() is called, fn will be run in the context of the thread calling thenApply().
When it comes to threading the API documentation is lacking. It takes a bit of inference to understand how threading and futures work. Start with one assumption: the non-Async methods of CompletableFuture do not spawn new threads on their own. Work will proceed under existing threads.
thenApply will run in the original CompletableFuture's thread. That's either the thread that calls complete(), or the one that calls thenApply() if the future is already completed. If you want control over the thread—a good idea if fn is a slow operation—then you should use thenApplyAsync.
I know this question is old, but I want to use source code to explain this question.
public CompletableFuture<Void> thenAccept(Consumer<? super T> action) {
return uniAcceptStage(null, action);
}
private CompletableFuture<Void> uniAcceptStage(Executor e,
Consumer<? super T> f) {
if (f == null) throw new NullPointerException();
Object r;
if ((r = result) != null)
return uniAcceptNow(r, e, f);
CompletableFuture<Void> d = newIncompleteFuture();
unipush(new UniAccept<T>(e, d, this, f));
return d;
}
This is the source code from java 16, and we can see, if we trigger thenAccept, we will pass a null executor service reference into our function.
From the 2nd function uniAcceptStage() 2nd if condition. If result is not null, it will trigger uniAcceptNow()
if (e != null) {
e.execute(new UniAccept<T>(null, d, this, f));
} else {
#SuppressWarnings("unchecked") T t = (T) r;
f.accept(t);
d.result = NIL;
}
if executor service is null, we will use lambda function f.accept(t) to execute it. If we are triggering this thenApply/thenAccept from main thread, it will use main thread as executing thread.
But if we cannot get previous result from last completablefuture, we will push our current UniAccept/Apply into stack by using uniPush function. And UniAccept class has tryFire() which will be triggered from our postComplete() function
final void postComplete() {
/*
* On each step, variable f holds current dependents to pop
* and run. It is extended along only one path at a time,
* pushing others to avoid unbounded recursion.
*/
CompletableFuture<?> f = this; Completion h;
while ((h = f.stack) != null ||
(f != this && (h = (f = this).stack) != null)) {
CompletableFuture<?> d; Completion t;
if (STACK.compareAndSet(f, h, t = h.next)) {
if (t != null) {
if (f != this) {
pushStack(h);
continue;
}
NEXT.compareAndSet(h, t, null); // try to detach
}
f = (d = h.tryFire(NESTED)) == null ? this : d;
}
}
}
I started to play around with RxJava and ReactFX, and I became pretty fascinated with it. But as I'm experimenting I have dozens of questions and I'm constantly researching for answers.
One thing I'm observing (no pun intended) is of course lazy execution. With my exploratory code below, I noticed nothing gets executed until the merge.subscribe(pet -> System.out.println(pet)) is called. But what fascinated me is when I subscribed a second subscriber merge.subscribe(pet -> System.out.println("Feed " + pet)), it fired the "iteration" again.
What I'm trying to understand is the behavior of the iteration. It does not seem to behave like a Java 8 stream that can only be used once. Is it literally going through each String one at a time and posting it as the value for that moment? And do any new subscribers following any previously fired subscribers receive those items as if they were new?
public class RxTest {
public static void main(String[] args) {
Observable<String> dogs = Observable.from(ImmutableList.of("Dasher", "Rex"))
.filter(dog -> dog.matches("D.*"));
Observable<String> cats = Observable.from(ImmutableList.of("Tabby", "Grumpy Cat", "Meowmers", "Peanut"));
Observable<String> ferrets = Observable.from(CompletableFuture.supplyAsync(() -> "Harvey"));
Observable<String> merge = dogs.mergeWith(cats).mergeWith(ferrets);
merge.subscribe(pet -> System.out.println(pet));
merge.subscribe(pet -> System.out.println("Feed " + pet));
}
}
Observable<T> represents a monad, a chained operation, not the execution of the operation itself. It is descriptive language, rather than the imperative you're used to. To execute an operation, you .subscribe() to it. Every time you subscribe a new execution stream is created from scratch. Do not confuse streams with threads, as subscription are executed synchronously unless you specify a thread change with .subscribeOn() or .observeOn(). You chain new elements to any existing operation/monad/Observable to add new behaviour, like changing threads, filtering, accumulation, transformation, etc. In case your observable is an expensive operation you don't want to repeat on every subscription, you can prevent recreation by using .cache().
To make any asynchronous/synchronous Observable<T> operation into a synchronous inlined one, use .toBlocking() to change its type to BlockingObservable<T>. Instead of .subscribe() it contains new methods to execute operations on each result with .forEach(), or coerce with .first()
Observables are a good tool because they're mostly* deterministic (same inputs always yield same outputs unless you're doing something wrong), reusable (you can send them around as part of a command/policy pattern) and for the most part ignore concurrence because they should not rely on shared state (a.k.a. doing something wrong). BlockingObservables are good if you're trying to bring an observable-based library into imperative language, or just executing an operation on an Observable that you have 100% confidence it's well managed.
Architecting your application around these principles is a change of paradigm that I can't really cover on this answer.
*There are breaches like Subject and Observable.create() that are needed to integrate with imperative frameworks.