Project Reactor composing Flux.zip() - java

I have been trying to learn Project Reactor 3.0 with this small application. I am struggling to compose a Flux.zip() function for combining variables into a Movie object. In Reactor, zip seems to return a Flux<Tuple5<>>. In RxJava2, zip instead takes a Function5<> combinator.
RxJava2
Single<Movie> movie = Single.zip(getDesc(id), getCategory(id), getName(id), getRating(id),
        (Function4<String, String, String, Double, Object>) (desc, cat, name, rating) ->
                new Movie(id.blockingGet(), name, desc, cat, rating)).cast(Movie.class);
Reactor
Flux<Tuple5<Integer, String, String, String, Double>> tuple =
        Flux.zip(id, getDesc(id), getCategory(id), getName(id), getRating(id));
Instead of returning a Flux<Tuple5<>>, I want to return a Tuple5<> or something else so I can create the Movie just like in RxJava. I do not want to subscribe to the Tuple, since I am trying to return this from Spring Web Reactive. I temporarily solved it by subscribing, but I was wondering if it is possible to do the same as in RxJava.
The example in this video at timestamp 1:07:54 shows it was possible in an older version.
Any solutions or suggestions are welcome!

The RxJava solution doesn't return the Movie directly, but a Single<Movie>. Reactor has a simplified zip that returns a Tuple, but the RxJava signature above is comparable to the Flux<Tuple5> one.
So what you want is a Flux<Movie>. zip has an overload that takes a Function<Object[], V> as the first parameter: it lets you specify into which object V the values from the zipped sources are combined. The function is applied to an array of these values and must return the value to be emitted in the resulting Flux<V>, in your case a Movie.
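A minimal sketch with the same sources as in the question; the combinator receives the zipped values as an Object[] in source order, and the Movie constructor order (id, name, desc, category, rating) is assumed from the question:
Flux<Movie> movies = Flux.zip(
        objects -> new Movie((Integer) objects[0], // id
                (String) objects[3],               // name
                (String) objects[1],               // desc
                (String) objects[2],               // category
                (Double) objects[4]),              // rating
        id, getDesc(id), getCategory(id), getName(id), getRating(id));
This stays fully non-blocking, so the resulting Flux<Movie> can be returned directly from a Spring Web Reactive handler.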

Yes, zip can be used. It waits for each source to emit an element and combines them into Tuples. In the example below, the publishers emit first names, last names, and departments, which are combined into a Flux of User.
Flux<String> fnameFlux = Flux.just("Ramesh", "Amit", "Vijay");
Flux<String> lnameFlux = Flux.just("Sharma", "Kumar", "Lamba");
Flux<String> deptFlux = Flux.just("Admin", "IT", "Acc.");
Flux<User> userFlux = Flux.zip(fnameFlux, lnameFlux, deptFlux)
        .map(tuple -> new User(tuple.getT1(), tuple.getT2(), tuple.getT3()));
userFlux.subscribe(x -> System.out.println(x));

Related

How do I convert this code to reactive using Reactor/Mono?

I am changing a Java app to use reactive programming to allow asynchronous and non-blocking flow, but I'm having trouble understanding the concepts to achieve this. A stream of siteIds is used to invoke third-party APIs, and eventually the response is saved into some storage.
The code I have now is blocking, and I would like to remove that...
generateReport() returns a Mono<BaseResponse> object.
getReportAndSave() retrieves and manipulates the report, saves it, then should return a boolean.
listResult = siteIds.parallel()
        .map(siteId -> generateReport(authToken, requestParams, siteId))
        .map(response -> response.block(Duration.ofMinutes(asyncCallTimeout)))
        .map(resp -> getReportAndSave(authToken, resp.getRequestId()))
        .collect(Collectors.toList());
So far I have this, which should be able to do the same, except I don't know how to get a return value for listResult.
siteIds.forEach(siteId -> generateReport(authToken, requestParams, siteId)
        .subscribe(baseResponse -> getReportAndSave(authToken, baseResponse.getRequestId())));
listResult is a List of Booleans saying whether each siteId has successfully been saved into blob storage.
final Flux<ResultWrapperBean> resultFlux = Flux.fromIterable(siteIds)
        // Since generateReport() returns Mono, here you should use flatMap instead of map.
        .flatMap(siteId -> generateReport(authToken, requestParams, siteId))
        // Use a wrapper bean to save the request id and the request result.
        .map(resp -> new ResultWrapperBean(resp.getRequestId(), getReportAndSave(authToken, resp.getRequestId())));
resultFlux.subscribe(resultBean -> log.info("RequestId: {}, and request result is {}", resultBean.getRequestId(), resultBean.getResult()));
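If you also need the List<Boolean> itself as a return value (the listResult from the question), you can collect instead of subscribing; a sketch assuming the same helper methods:
// Collects the per-site results into a single Mono; avoid block() except at a non-reactive boundary
final Mono<List<Boolean>> listResult = Flux.fromIterable(siteIds)
        .flatMap(siteId -> generateReport(authToken, requestParams, siteId))
        .map(resp -> getReportAndSave(authToken, resp.getRequestId()))
        .collectList();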

Combine two streams and call method

I have a problem with how to stream two collections asynchronously and call a method, e.g.
List<User> users = List.of(user1, user2, user3);
List<Workplace> workplaces = List.of(workplace1, workplace2, workplace3);
It's always the case that users.size() == workplaces.size().
We have a mapping function:
public List<UserWithWorkplace> combineUserWithWorkplaceAndType(List<User> users,
        List<Workplace> workplaces, Type someRandomtype) {
    // Here is the problem: the result should satisfy
    // result.size() == users.size() == workplaces.size()
    return users.stream()
            .flatMap(user -> workplaces.stream()
                    .map(workplace -> mapping(user, workplace, someRandomtype)))
            .toList();
}
private UserWithWorkplace mapping(User user, Workplace workplace, Type someRandomtype) {
    // combining user and workplace into a UserWithWorkplace
}
How to achieve that result?
Assuming you want to create pairs of (user, workplace) from two separate users and workplaces streams, this operation is normally called "zipping".
The Guava library provides the Streams.zip(Stream, Stream, Function) method for this. In your case the code would look like:
Stream<UserWithWorkplace> zipped = Streams.zip(
        users.stream(),
        workplaces.stream(),
        (u, w) -> this.mapping(u, w, someRandomtype));
However, your example code uses List and not Stream to represent the data. I'm not sure you have to use Java streams for this; a simple for loop with an index might be easier, as sketched below.
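A minimal sketch of that loop, assuming the same mapping helper and equal-sized lists:
List<UserWithWorkplace> result = new ArrayList<>();
for (int i = 0; i < users.size(); i++) {
    // Pair elements positionally; relies on users.size() == workplaces.size()
    result.add(mapping(users.get(i), workplaces.get(i), someRandomtype));
}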
What you're describing is a zipping operation.
If using Google Guava, you can do this to combine them:
Streams.zip(users.stream(), workplaces.stream(), (user, workplace) -> mapping(user, workplace, someType))
You can also find some other implementations of this operation described here.

What is the top first use case you think of, when you see the 'flatMap' method in someone else's code?

Sorry for the somewhat theoretical question, but I'd like to find a way to quickly read someone else's functional code by building templates of common method-chain usage.
For example:
Case 1.
When I see use of the .peek method or .wireTap from Spring Integration, I primarily expect logging, triggering monitoring, or just running some transitional external action, for instance:
.peek(params ->
log.info("creating cache configuration {} for key class \"{}\" and value class \"{}\"",
params.getName(), params.getKeyClass(), params.getValueClass()))
or
.peek(p ->
Try.run(() -> cacheService.cacheProfile(p))
.onFailure(ex ->
log.warn("Unable to cache profile: {}", ex.toString())))
or
.wireTap(sf -> sf.handle(msg -> {
    monitoring.profileRequestsReceived();
    log.trace("Client info request(s) received: {}", msg);
}))
Case 2.
When I see use of the .map method or .transform from Spring Integration, I understand that I'm about to get the result of someFunction(input), for instance:
.map(e -> GenerateTokenRs.builder().token(e.getKey()).phoneNum(e.getValue()).build())
or
.transform(Message.class, msg -> {
    ErrorResponse response = (ErrorResponse) msg.getPayload();
    MessageBuilder builder = /* some transforming */;
    return builder.build();
})
Current case.
But I don't have such a common view of the .flatMap method.
Would you give me your opinion on this, please?
Addendum 1:
To Turamarth: I know the difference between the .map and .flatMap methods. I actively use both .map and .flatMap in my code.
But I'm asking the community for their experience and coding templates.
It always helps to study the signature/javadoc of the streamish methods to understand them:
The flatMap() operation has the effect of applying a one-to-many transformation to the elements of the stream, and then flattening the resulting elements into a new stream.
So, typical code I expect, or wrote myself:
return someMap.values().stream().flatMap(Collection::stream);
The values of that map are sets, and I want to pull the entries of all these sets into a single stream for further processing here.
In other words: it is about "pulling out things", and getting them into a stream/collection for further processing.
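A self-contained sketch of that pattern, with made-up data:
Map<String, Set<String>> groups = Map.of(
        "admins", Set.of("alice", "bob"),
        "users", Set.of("carol"));

// Flatten all value sets into a single stream of names
List<String> allMembers = groups.values().stream()
        .flatMap(Collection::stream)
        .toList();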
I've found one more use template for .flatMap.
Let's have a look at the following code:
String s = valuesFromDb
.map(v -> v.get(k))
.getOrElse("0");
where Option<Map<String, String>> valuesFromDb = Option.of(.....).
If there's an entry k=null in the map, then we'll get null as the result of the code above.
But we'd like to have "0" in this case as well.
So let's add .flatMap:
String s = valuesFromDb
.map(v -> v.get(k))
.flatMap(Option::of)
.getOrElse("0");
Regardless of having null as the map's value, we will get "0".

What should be used in Flux/Mono to join a couple of items

In JS promises you can use
Promise.join
But I couldn't find such a solution for Flux/Mono. What is the best practice when you deal with different items and then have to use them together later?
That depends on how you want to combine them.
Sequentially? Use Flux.concat
All in parallel? Use Flux.zip
If you expect only one result, Mono.zipWith might work for you.
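For the single-result case, a minimal zipWith sketch with made-up values:
Mono<String> name = Mono.just("Alice");
Mono<Integer> age = Mono.just(30);

// zipWith waits for both Monos to complete and combines their results
Mono<String> description = name.zipWith(age, (n, a) -> n + " is " + a);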
There is a good number of 'merging operators'.
zip, concat, merge, and combineLatest are the main ones.
Zip allows you to combine streams where the items are grouped in a 1-to-1 relationship within the stream. That is why you lost the last element.
When you are not sure how many elements each stream will produce and how often it will emit, you can use concat (append the other stream to the end of the first one), merge (items are placed on the final stream in order of appearance from both streams), or combineLatest (to combine the two latest events of the streams into something else).
Your case sounds like a merge to me.
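A small illustration of the difference between concat and merge, with made-up delayed sources:
Flux<String> a = Flux.just("a1", "a2").delayElements(Duration.ofMillis(30));
Flux<String> b = Flux.just("b1", "b2").delayElements(Duration.ofMillis(50));

Flux.concat(a, b).subscribe(System.out::println); // a1 a2 b1 b2, strictly sequential
Flux.merge(a, b).subscribe(System.out::println);  // interleaved in order of arrival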
After some changes, my code looks like this
public Mono<Item> createItem(final @NonNull String userName, String description, String[] tags,
        @NonNull Flux<ImageDTO> photos) {
    val item = initItem(userName);
    item.setDescription(description);
    if (null != tags) {
        item.getTags().addAll(Arrays.asList(tags));
    }
    return photos.flatMap(photo -> imageService.storeImage(photo.getStream(), photo.getExt()))
            .reduce(item, (item1, photoIri) -> {
                item1.getPhotos().add(photoIri);
                return item1;
            })
            .flatMap(itemRepository::save)
            .flatMap(createdItem -> {
                val itemHistory = getHistoryForCreatedItem(userName, createdItem);
                return itemHistoryRepository.save(itemHistory).then(Mono.just(createdItem));
            });
}
Currently I don't like these two parts:
.reduce(item, (item1, photoIri) -> ...)
.then(Mono.just(createdItem))

Can Spark Streaming do Anything Other Than Word Count?

I'm trying to get to grips with Spark Streaming, but I'm having difficulty. Despite reading the documentation and analysing the examples, I wish to do something more than a word count on a text file/stream/Kafka queue, which is the only thing we're allowed to understand from the docs.
I wish to listen to an incoming Kafka message stream, group messages by key, and then process them. The code below is a simplified version of the process: get the stream of messages from Kafka, reduce by key to group messages by message key, then process them.
JavaPairDStream<String, byte[]> groupByKeyList = kafkaStream.reduceByKey((bytes, bytes2) -> bytes);
groupByKeyList.foreachRDD(rdd -> {
    List<MyThing> myThingsList = new ArrayList<>();
    MyCalculationCode myCalc = new MyCalculationCode();
    rdd.foreachPartition(partition -> {
        while (partition.hasNext()) {
            Tuple2<String, byte[]> keyAndMessage = partition.next();
            MyThing aSingleMyThing = MyThing.parseFrom(keyAndMessage._2); // parse from protobuf format
            myThingsList.add(aSingleMyThing);
        }
    });
    List<MyResult> results = myCalc.doTheStuff(myThingsList);
    // other code here to write results to file
});
When debugging I see that inside while (partition.hasNext()) the myThingsList has a different memory address than the List<MyThing> myThingsList declared in the outer foreachRDD.
When List<MyResult> results = myCalc.doTheStuff(myThingsList); is called, there are no results, because myThingsList is a different instance of the List.
I'd like a solution to this problem, but would prefer a reference to documentation to help me understand why this is not working (as anticipated) and how I can solve it for myself (I don't mean a link to the single page of Spark documentation, but a section/paragraph, or better still a link to the JavaDoc, that does not provide Scala examples with non-functional commented code).
The reason you're seeing different list addresses is that Spark doesn't execute foreachPartition locally on the driver; it has to serialize the function and send it over to the executor handling the processing of the partition. You have to remember that although working with the code feels like everything runs in a single location, the calculation is actually distributed.
The first problem I see with your code has to do with your reduceByKey, which takes two byte arrays and returns the first. Is that really what you want to do? That means you're effectively dropping parts of the data. Perhaps you're looking for combineByKey, which will allow you to return a JavaPairDStream<String, List<byte[]>>.
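A rough sketch of what that combineByKey step could look like, assuming JavaPairDStream's combineByKey(createCombiner, mergeValue, mergeCombiners, partitioner) overload; the partition count is an arbitrary example:
JavaPairDStream<String, List<byte[]>> grouped = kafkaStream.combineByKey(
        bytes -> {            // createCombiner: start a list for the first value of a key
            List<byte[]> list = new ArrayList<>();
            list.add(bytes);
            return list;
        },
        (list, bytes) -> {    // mergeValue: add another value for the same key
            list.add(bytes);
            return list;
        },
        (left, right) -> {    // mergeCombiners: merge lists built on different partitions
            left.addAll(right);
            return left;
        },
        new HashPartitioner(4));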
Regarding the parsing of your protobuf, it looks to me like you don't want foreachRDD; you need an additional map to parse the data:
kafkaStream
    .combineByKey(/* implement logic */)
    .flatMap(x -> x._2)
    .map(proto -> MyThing.parseFrom(proto))
    .map(myThing -> myCalc.doStuff(myThing))
    .foreachRDD(/* After all the processing, do stuff with result */);
