RxJava - Cache Observable Updates and Emit Largest Values

I currently have an Observable<ProductIDUpdate> emitting an object that represents an update of a product ID. The update indicates either that the ID is a new ADDITION, or that it has expired and requires DELETION.
public class ProductIDUpdate {
    enum UpdateType {
        ADDITION, DELETION;
    }

    private int id;
    private UpdateType type;

    public ProductIDUpdate(int id) {
        this(id, UpdateType.ADDITION);
    }

    public ProductIDUpdate(int id, UpdateType type) {
        this.id = id;
        this.type = type;
    }
}
I want to track the update with the largest ID value, so I want to modify the stream to emit the current highest ID. How would I cache the update items in the stream so that, if the current highest ID is deleted, the next highest available ID is emitted?

I don't know anything about Rx, but here's my understanding:
* you have a bunch of product IDs. It's not clear to me whether you receive them over time as part of messages sent to your class, or whether you know all the IDs from the beginning
* you want to create a stream on top of your source of product IDs that emits the highest available ID at any point in time
If my understanding is correct, how about using a PriorityQueue? You cache the IDs in the queue with a reverse comparator (by default it keeps the smallest element at the top of the heap), and when you want to emit a new value you just pop the top value.
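For illustration, here's a minimal sketch of that idea using scan. It assumes an `updates` Observable<ProductIDUpdate> plus getId()/getType() accessors on the class from the question, and it mutates the queue inside the accumulator, which is pragmatic rather than purely functional:
public static Observable<Integer> highestId(Observable<ProductIDUpdate> updates) {
    // The reversed comparator keeps the largest ID at the head of the heap.
    PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
    return updates
            .scan(heap, (h, update) -> {
                if (update.getType() == ProductIDUpdate.UpdateType.ADDITION) {
                    h.offer(update.getId());
                } else {
                    h.remove(update.getId()); // O(n) removal; fine for modest queue sizes
                }
                return h;
            })
            .filter(h -> !h.isEmpty())
            .map(PriorityQueue::peek)  // current highest ID
            .distinctUntilChanged();   // only emit when the maximum actually changes
}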

Can something like that meet your requirements?
public static void main(String[] args) {
    Observable<ProductIDUpdate> products =
            Observable.just(new ProductIDUpdate(1, ADDITION),
                    new ProductIDUpdate(4, ADDITION),
                    new ProductIDUpdate(2, ADDITION),
                    new ProductIDUpdate(5, ADDITION),
                    new ProductIDUpdate(1, DELETION),
                    new ProductIDUpdate(5, DELETION),
                    new ProductIDUpdate(3, ADDITION),
                    new ProductIDUpdate(6, ADDITION));

    products.distinctUntilChanged((prev, current) -> prev.getId() > current.getId())
            .filter(p -> p.getType().equals(ADDITION))
            .subscribe(System.out::println,
                    Throwable::printStackTrace);

    Observable.timer(1, MINUTES) // just for blocking the main thread
            .toBlocking()
            .subscribe();
}
This prints:
ProductIDUpdate{id=1, type=ADDITION}
ProductIDUpdate{id=4, type=ADDITION}
ProductIDUpdate{id=5, type=ADDITION}
ProductIDUpdate{id=6, type=ADDITION}
If you remove the filter(), this prints:
ProductIDUpdate{id=1, type=ADDITION}
ProductIDUpdate{id=4, type=ADDITION}
ProductIDUpdate{id=5, type=ADDITION}
ProductIDUpdate{id=5, type=DELETION}
ProductIDUpdate{id=6, type=ADDITION}

Related

Which Data Structure would be more suitable for the following task in Java?

Every 5 minutes, within a 20-minute cycle, I need to retrieve the data. Currently I'm using the map data structure.
Is there a better data structure? Every time I read and update the data, I have to write it to a file to prevent data loss on program restart.
For example, if the initial data in the map is:
{-1:"result1",-2:"result2",-3:"result3",-4:"result4"}
I want to get the last -4 period's value which is "result4", and set the new value "result5", so that the updated map will be:
{-1:"result5",-2:"result1",-3:"result2",-4:"result3"}
And again, I want to get the last -4 period's value which is "result3", and set the new value "result6", so the map will be:
{-1:"result6",-2:"result5",-3:"result1",-4:"result2"}
The code:
private static String getAndSaveValue(int a) {
    // read the map from file
    HashMap<Long, String> resultMap = getMapFromFile();
    String value = resultMap.get(-4L);
    // shift each entry down one slot: -4 <- -3, -3 <- -2, -2 <- -1
    for (long i = 4L; i >= 2; i--) {
        resultMap.put(-i, resultMap.get(1 - i));
    }
    resultMap.put(-1L, "result" + a);
    // save the map to file
    saveMapToFile(resultMap);
    return value;
}
Based on your requirement, I think the LinkedList data structure will be suitable:
public class Test {
    public static void main(String[] args) {
        LinkedList<String> ls = new LinkedList<String>();
        ls.push("result4");
        ls.push("result3");
        ls.push("result2");
        ls.push("result1");
        System.out.println(ls);
        ls.push("result5");                                // pushing new value
        System.out.println("Last value:" + ls.pollLast()); // this will return `result4`
        System.out.println(ls);
        ls.push("result6");                                // pushing new value
        System.out.println("Last value:" + ls.pollLast()); // this will give you `result3`
        System.out.println(ls);
    }
}
Output:
[result1, result2, result3, result4]
Last value:result4
[result5, result1, result2, result3]
Last value:result3
[result6, result5, result1, result2]
Judging by your example, you need a FIFO data structure with a bounded size.
There's no bounded general-purpose implementation of the Queue interface in the JDK; only the concurrent implementations can be bounded in size. But if you're not going to use the queue in a multithreaded environment, they're not the best choice, because thread safety doesn't come for free: concurrent collections are slower and can also be confusing for the reader of your code.
To achieve your goal, I suggest using composition by wrapping an ArrayDeque, which is an array-based implementation of Queue that performs far better than LinkedList.
Note that the preferred approach is not to extend ArrayDeque (an IS-A relationship) and override its methods add() and offer(), but to include it in a class as a field (a HAS-A relationship), so that all method calls on an instance of your class are forwarded to the underlying collection. You can find more information on this approach under the item "Favor composition over inheritance" in Effective Java by Joshua Bloch.
public class BoundQueue<T> {
    private Queue<T> queue;
    private int limit;

    public BoundQueue(int limit) {
        this.queue = new ArrayDeque<>(limit);
        this.limit = limit;
    }

    public void offer(T item) {
        if (queue.size() == limit) {
            queue.poll(); // or throw new IllegalStateException() depending on your needs
        }
        queue.add(item);
    }

    public T poll() {
        return queue.poll();
    }

    public boolean isEmpty() {
        return queue.isEmpty();
    }
}
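A quick usage sketch against the rotation from the question (hypothetical values; offer() evicts the oldest entry automatically once the queue holds four results):
BoundQueue<String> queue = new BoundQueue<>(4);
queue.offer("result4");
queue.offer("result3");
queue.offer("result2");
queue.offer("result1");

String last = queue.poll(); // "result4" - the oldest period's value
queue.offer("result5");     // becomes the newest entry; next poll() yields "result3"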

Java Spring / Reactor send / update flux when underlying database changed

I have data for candidate "likes", which I'd like to send to the client every time the "like" count changes. I think this is achievable using a Spring Flux? But I can't find any example for this. Most Flux examples are based on a fixed interval (e.g. every second). This might be wasteful, because there aren't that many transactions, and a candidate might not get any likes for many minutes.
I just want to create dashboard that subscribe to "likes" change, and get updated when certain candidate "likes" number changed.
What is the way to get this?
This is what I did, and it works, but it based on interval (5 seconds), not based on data change.
public Flux<Candidate> subscribeItemChange(String id) {
    return Flux.interval(Duration.ofSeconds(5)).map(t -> candidateService.getCandidateDetail(id));
}
The candidateService.getCandidateDetail basically queries the database for a certain id, so this is more like polling than "update on change".
I think I must put something in candidateService.updateLikes() below, but what should I update?
public class CandidateService {
    public Candidate getCandidateDetail(String id) {
        // query candidate from database
        // select * from candidates where id = :id
        // and return it
    }

    public void updateLikes(String id, int likesCount) {
        // update candidates set likes_count = :likesCount where id = :id
        // ...
        // I think I need to write something here, but what?
    }
}
You could make use of a dynamic sink, adding a field similar to:
private Sinks.Many<Candidate> likesSink = Sinks.many().multicast().onBackpressureBuffer();
...then you can:
* use sink.tryEmitNext in your updateLikes() method to publish to the sink whenever likes are updated for a candidate;
* implement your subscribeItemChange() method using likesSink.asFlux(), which can then be filtered if necessary to return only the stream of "like updates" for a particular candidate.
Based on @Michael Berry's guidance:
public void updateLikes(String id, int likesCount) {
    Candidate c = getCandidateDetail(id);
    c.setLikesCount(likesCount);
    CandidateDummyDatasource.likesSink.tryEmitNext(c);
}
On the subscriber side:
public Flux<Candidate> subscribeItemChange(String id) {
    return CandidateDummyDatasource.likesSink.asFlux()
            .filter(c -> c.getId().equals(id))
            .map(data -> candidateService.getCandidateDetail(id));
}

Using CompletableFuture in a Loop with Two Futures to Merge per Loop Iteration

I have code like the following:
void testMethod(List<String> ids) {
    List<CompletableFuture<ResultThree>> resultThreeList = new ArrayList<>();
    for (String id : ids) {
        CompletableFuture<ResultOne> resultOne = AynchOne(id);
        CompletableFuture<ResultTwo> resultTwo = AynchTwo(id);
        CompletableFuture<ResultThree> resultThree = resultOne.thenCombine(resultTwo,
                (ResultOne a, ResultTwo b) -> computeCombinedResultThree(a, b));
        resultThreeList.add(resultThree);
    }
    // PROCESS RESULTS HERE
}
class ResultOne {
    boolean goodResult;
    String id;

    ResultOne(String promId) {
        this.goodResult = true;
        this.id = promId;
    }
}

class ResultTwo {
    boolean goodResult;
    String id;

    ResultTwo(String promId) {
        this.goodResult = true;
        this.id = promId;
    }
}

class ResultThree {
    boolean goodResult;
    String id;
}

private ResultThree computeCombinedResultThree(ResultOne r1, ResultTwo r2) {
    ResultThree resultThree = new ResultThree();
    resultThree.id = r1.id;
    resultThree.goodResult = r1.goodResult && r2.goodResult;
    return resultThree;
}
I need to be able to AND the results resultOne and resultTwo together, such that for each iteration, on completion of the entire asynchronous execution, I have an array or map that I can subsequently process, where each object in it has the corresponding id and a true or false for that id (representing the AND of the two booleans from the separate objects).
Based on feedback from readers, I have gotten the code to the point where I can merge the two original futures and combine the results from each iteration across the entire loop of futures. At this point I just need to process the results.
I think maybe I need another CompletableFuture? It would maybe be something like this (put above where I have "// PROCESS RESULTS HERE"):
CompletableFuture<Void> future = resultThreeList
        .thenRun(() -> forwardSuccesses(resultThreeList));
future.get();
forwardSuccesses() would iterate through resultThreeList, forwarding the successful ids to another process, but I'm not sure that's how to do it.
Grateful for any ideas. Thanks.
So this is how far you've gotten until now:
List<CompletableFuture<ResultThree>> resultThreeList = new ArrayList<>(ids.size());
for (String id : ids) {
    CompletableFuture<ResultOne> resultOne = aynchOne(id);
    CompletableFuture<ResultTwo> resultTwo = aynchTwo(id);
    CompletableFuture<ResultThree> resultThree = resultOne.thenCombine(resultTwo, this::computeCombinedResultThree);
    resultThreeList.add(resultThree);
}
Now all you need to do is convert this List<CompletableFuture<ResultThree>> to a CompletableFuture<List<ResultThree>> that will get completed once all the results are finished calculating.
CompletableFuture<List<ResultThree>> combinedCompletables =
        CompletableFuture.allOf(resultThreeList.toArray(new CompletableFuture<?>[0]))
                .thenApply(v -> resultThreeList.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
Or with something like
CompletableFuture<List<ResultThree>> combinedCompletables =
        CompletableFuture.supplyAsync(() ->
                resultThreeList.stream().map(this::safeGet).collect(Collectors.toList()));
where safeGet is a method that just calls future.get() and catches the exceptions that may occur - you can't simply call get() in a lambda because of those checked exceptions.
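One way that helper might look (a sketch; wrapping in an unchecked exception is an arbitrary choice):
private <T> T safeGet(CompletableFuture<T> future) {
    try {
        return future.get();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
        throw new IllegalStateException(e);
    } catch (ExecutionException e) {
        throw new IllegalStateException(e.getCause());
    }
}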
Now you can process this list with thenAccept():
try {
    combinedCompletables.thenAccept(this::forwardSuccesses).get(30, TimeUnit.SECONDS);
} catch (InterruptedException | ExecutionException | TimeoutException e) {
    e.printStackTrace();
}
Again, the exceptions being caught are due to the call to get().
Side note: I don't really see why there are three result classes, since all you need - for this part of the code at least - is the id and the result status. I'd introduce an interface (Result?) for that and work only with it, as sketched below.
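That interface might be as small as this (hypothetical sketch):
interface Result {
    String getId();
    boolean isGoodResult();
}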
It seems to me you don't need three different ResultOne, ResultTwo, ResultThree classes, as they define the same type, so I shall replace them with Result.
Assuming you want to forward only successes, I added a short isGoodResult() method to the Result class, to be used as a predicate with the streams:
class Result {
    public boolean goodResult;
    public String id;
    // ...

    public boolean isGoodResult() {
        return this.goodResult;
    }
}
I'd also recommend getting rid of the loop, replacing it with a stream to make your code more fluent.
Should forwardSuccesses be strict, accepting a List<Result>, this is how I'd implement testMethod:
void testMethod(List<String> ids) {
    final List<Result> results = ids.stream()
            .parallel()
            .map(id -> asynchOne(id).thenCombine(
                    asynchTwo(id),
                    (r1, r2) -> computeCombinedResult(r1, r2)))
            .map(CompletableFuture::join)
            .filter(Result::isGoodResult)
            .collect(Collectors.toList());
    // PROCESS RESULTS HERE
    forwardSuccesses(results);
}
Should forwardSuccesses be lazy, accepting a CompletableFuture<List<Result>>:
void testMethod(List<String> ids) {
    final List<CompletableFuture<Result>> futures = ids.stream()
            .parallel()
            .map(id -> asynchOne(id).thenCombine(
                    asynchTwo(id),
                    (r1, r2) -> computeCombinedResult(r1, r2)))
            .collect(Collectors.toList());

    final CompletableFuture<List<Result>> asyncResults =
            CompletableFuture.allOf(futures.stream().toArray(CompletableFuture[]::new))
                    .thenApply(__ -> futures.stream()
                            .map(CompletableFuture::join)
                            .filter(Result::isGoodResult)
                            .collect(Collectors.toList()));
    // PROCESS RESULTS HERE
    forwardSuccessesAsync(asyncResults);
}
Within the for loop you immediately get the CompletableFutures in return. In the background some magic happens, and you want to wait until both are complete.
So after both CompletableFutures have been returned, perform a blocking wait by invoking CompletableFuture.get with a long value and a time unit. If you invoke get without any parameters, you may wait forever.
Choose your timeout and JDK wisely. The timed get(long, TimeUnit) has always been part of the Future contract, but the convenience timeout methods orTimeout and completeOnTimeout only arrived in JDK 9. Also, JDK 8 isn't supported anymore; JDK 11 is the current long-term-support release, and recent compilers no longer offer JDK 8 as a target.
I really urge you to read the dirty details about CompletableFuture and how it differs from Future, especially regarding thread control such as cancellation. Not knowing the underlying provider of the CompletableFutures, I also assume that querying one ID at a time wastes resources and limits throughput, but that is a separate question.
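For instance, a bounded wait on the combined future from the question could look like this (a sketch; orTimeout requires JDK 9+):
// Fail the future if it hasn't completed within 30 seconds,
// and bound the blocking wait itself as well.
CompletableFuture<ResultThree> resultThree =
        resultOne.thenCombine(resultTwo, this::computeCombinedResultThree)
                .orTimeout(30, TimeUnit.SECONDS);
ResultThree value = resultThree.get(30, TimeUnit.SECONDS); // throws TimeoutException if late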

Java Reactive stream how to map an object when the object being mapped is also needed on the next step of the stream

I am using Java 11 and project Reactor (from Spring). I need to make a http call to a rest api (I can only make it once in the whole flow).
With the response I need to compute two things:
* Check if a document exists in the database (MongoDB). If it does not exist, create it and return it; otherwise just return it.
* Compute some logic on the response, and we are done.
In pseudo code it is something like this:
public void computeData(String id) {
    httpClient.getData(id) // Returns a Mono<Data>
            .flatMap(data -> getDocument(data.getDocumentId()))
            // Issue: here we need access to the data object consumed in the previous
            // flatMap, but at the same time we also need the document object it produced
            .flatMap(document -> calculateValue(document, data))
            .subscribe();
}
public Mono<Document> getDocument(String id) {
    // Check if document exists
    // If not create document
    return document;
}

public Mono<Value> calculateValue(Document doc, Data data) {
    // Do something...
    return value;
}
The issue is that calculateValue needs the return value from httpClient.getData, but that value was already consumed in the first flatMap, and we also need the document object produced by that flatMap.
I tried to solve this issue using Mono.zip like below:
public void computeData(String id) {
    final Mono<Data> dataMono = httpClient.getData(id);
    Mono.zip(
            new Mono<Mono<Document>>() {
                @Override
                public void subscribe(CoreSubscriber<? super Mono<Document>> actual) {
                    final Mono<Document> documentMono = dataMono.flatMap(data -> getDocument(data.getDocumentId()));
                    actual.onNext(documentMono);
                }
            },
            new Mono<Mono<Data>>() {
                @Override
                public void subscribe(CoreSubscriber<? super Mono<Data>> actual) {
                    actual.onNext(dataMono);
                }
            }
    )
    .flatMap(objects -> {
        final Mono<Document> documentMono = objects.getT1();
        final Mono<Data> innerDataMono = objects.getT2();
        return Mono.zip(documentMono, innerDataMono, (document, data) -> calculateValue(document, data));
    });
}
But this executes httpClient.getData(id) twice, which goes against my constraint of only calling it once. I understand why it is executed twice (I subscribe to it twice).
Maybe my solution design can be improved somewhere, but I do not see where. To me this sounds like a "normal" issue when designing reactive code, but I could not find a suitable solution to it so far.
My question is: how can I accomplish this flow in a reactive and non-blocking way, while making only one call to the REST API?
PS: I could add all the logic inside one single map, but that would force me to subscribe to one of the Monos inside the map, which is not recommended and which I want to avoid.
EDIT regarding @caco3's comment
I need to subscribe inside the map because both the getDocument and calculateValue methods return a Mono.
So, if I wanted to put all the logic inside one single map it would be something like:
public void computeData(String id) {
    httpClient.getData(id)
            .map(data -> getDocument(data).subscribe(s -> calculateValue(s, data)))
            .subscribe();
}
You do not have to subscribe inside map; just continue building the reactive chain inside the flatMap:
getData(id) // Mono<Data>
        .flatMap(data -> getDocument(data.getDocumentId()) // Mono<Document>
                .switchIfEmpty(createDocument(data.getDocumentId())) // Mono<Document>
                .flatMap(document -> calculateValue(document, data)) // Mono<Value>
        )
        .subscribe();
Boiling it down, your problem is analogous to:
Mono.just(1)
        .flatMap(original -> process(original))
        .flatMap(processed -> {
            // Help - I need access to the original value and the processed value!
            System.out.println(original); // Won't compile - `original` is no longer in scope
        });

private static Mono<String> process(int in) {
    return Mono.just(in + " is an integer").delayElement(Duration.ofSeconds(2));
}
(Silly example, I know.)
The problem is that map() (and by extension, flatMap()) are transformations - you get access to the new value, and the old one goes away. So in your second flatMap() call, you've got access to "1 is an integer", but not the original value (1).
The solution is, instead of mapping to the new value, to map to some kind of merged result that contains both the original and new values. Reactor provides a built-in type for that: a Tuple. So editing our original example, we'd have:
Mono.just(1)
        .flatMap(original -> operation(original))
        .flatMap(processed -> {
            // Both values are now available in the tuple:
            System.out.println(processed.getT1()); // Original
            System.out.println(processed.getT2()); // Processed
            // ...etc.
        });

private static Mono<Tuple2<Integer, String>> operation(int in) {
    return Mono.just(in + " is an integer").delayElement(Duration.ofSeconds(2))
            .map(newValue -> Tuples.of(in, newValue));
}
You can use the same strategy to "hold on" to both document and data - no need for inner subscribes or anything of the sort :-)
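Applied to the question's flow, that strategy might look like this (a sketch using the method names from the question; Tuples comes from reactor.util.function):
httpClient.getData(id)                                       // Mono<Data>
        .flatMap(data -> getDocument(data.getDocumentId())   // Mono<Document>
                .map(document -> Tuples.of(data, document))) // Mono<Tuple2<Data, Document>>
        .flatMap(tuple -> calculateValue(tuple.getT2(), tuple.getT1())) // Mono<Value>
        .subscribe();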

Computing statistics over a stream for a given window

I have a ticker KStream that ticks frequently (think seconds), and I want to compute various statistics over a 24-hour window. For example, the 24-hour change: the difference in price between a given point and the one 24 hours before it.
My desired output for that input is:
t1 -> t1c1
t2 -> t2c2
t3 -> t3c3
Where t1 is the input ticker, and t1c1 is the input ticker with additional statistics computed for the 24-hour window preceding it.
I've considered a few ways of doing this that haven't worked:
* Window my ticker stream by size 24 hours with 1 second hops.
builder.stream(rawPriceTickerTopic, ...)
        .groupByKey()
        .windowedBy(TimeWindows.of(TimeUnit.DAYS.toMillis(1))
                .advanceBy(TimeUnit.SECONDS.toMillis(1)))
        .reduce((value1, value2) ->
                value1.tickerWithStatsFrom(value2), ...)
        .toStream();
However, this generates an immense number of output points, as each input ticker generates an output ticker for each window it is a member of.
* Keep some kind of time-series store up to date, get the value from 24 hours earlier out of the store, and compute my statistics ticker from that; however, this seems to go against the point of streams.
My final solution was to abandon windowing and simply aggregate over my tickers, maintaining my own 24-hour window in the aggregator. This still doesn't feel like the best way, and there's a nagging feeling that I could have solved it with Kafka's built-in windowing concepts.
As said above, I use simple aggregation with my aggregator:
streamBuilder.stream(tickerTopic, Consumed.with(...))
        .groupByKey()
        .aggregate(MyAggregator::new,
                (key, value, aggregate) -> aggregate.addTicker(value),
                Materialized.with(...))
        .toStream();
The result is that for every record in the original ticker stream, I get an aggregated value in my output stream. My aggregator's logic is simple:
* Add the new ticker to the ordered collection.
* Discard any tickers that are more than 24 hours older than this new latest ticker.
* Compute the new 24-hour change.
(This technique could be used for any kind of calculation over a given window, for example a moving average; see the sketch after the aggregator below.)
Sample code for the aggregator:
public class MyAggregator {
    private final long windowMilis;
    private BigDecimal change;
    private TreeSet<Ticker> orderedTickers = new TreeSet<>(MyAggregator::tickerTimeComparator);

    public MyAggregator() {
        this.windowMilis = 86400000; // 24 hours
    }

    public MyAggregator addTicker(Ticker ticker) {
        orderedTickers.add(ticker);
        cleanOldTickers();
        change = getLatest().getAsk().subtract(getEarliest().getAsk());
        return this;
    }

    public BigDecimal getChange() {
        return change;
    }

    public Ticker getEarliest() {
        return orderedTickers.first();
    }

    public Ticker getLatest() {
        return orderedTickers.last();
    }

    private void cleanOldTickers() {
        Date endOfWindow = latestWindow();
        Iterator<Ticker> iterator = orderedTickers.iterator();
        while (iterator.hasNext()) {
            Ticker next = iterator.next();
            if (next.getTimestamp().before(endOfWindow)) {
                iterator.remove();
            } else {
                // The collection is sorted by time, so the first ticker
                // inside the window ends the scan.
                break;
            }
        }
    }

    private Date latestWindow() {
        return new Date(getLatest().getTimestamp().getTime() - windowMilis);
    }

    private static int tickerTimeComparator(Ticker t1, Ticker t2) {
        return t1.getTimestamp().compareTo(t2.getTimestamp());
    }
}
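For the moving average mentioned above, a method along these lines could be added to the aggregator (a sketch; it assumes Ticker.getAsk() returns a BigDecimal, as addTicker() already does):
// Hypothetical addition to MyAggregator: average the ask price over the
// tickers currently retained in the 24-hour window.
public BigDecimal getMovingAverage() {
    BigDecimal sum = BigDecimal.ZERO;
    for (Ticker t : orderedTickers) {
        sum = sum.add(t.getAsk());
    }
    return sum.divide(BigDecimal.valueOf(orderedTickers.size()), 8, RoundingMode.HALF_UP);
}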
