How to calculate the processing time in RxJava

For the following flow, I am wondering how I can calculate the time it takes to process all the data in forEach(...).
Observable
.just(1, 2, 3)
.flatMap(it -> Observable.just(it)) // placeholder for the real flatMap work
.toBlocking()
.forEach(it -> { /* some parsing logic here */ });
EDIT
After reading this tutorial: Leaving the Monad, I feel the simple solution would be to do the following. Let me know if I missed something
List<Integer> items = Observable
.just(1, 2, 3)
.flatMap(it -> Observable.just(it)) // placeholder for the real flatMap work
.toList()
.toBlocking()
.single(); // toList() returns an Observable<List<T>>, so block for the single list
long startTime = System.currentTimeMillis();
for (Integer it : items) {
// some parsing here
}
long processingTime = System.currentTimeMillis() - startTime;
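Alternatively (a minimal sketch that is not in the original post): since toBlocking().forEach(...) runs entirely on the calling thread, the processing step can simply be bracketed with timestamps, assuming RxJava 1.x:
long start = System.currentTimeMillis();
Observable
.just(1, 2, 3)
.flatMap(it -> Observable.just(it)) // placeholder for the real flatMap work
.toBlocking()
.forEach(it -> { /* some parsing logic here */ });
long elapsedMillis = System.currentTimeMillis() - start; // time spent processing all items in forEach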

One option is to create an Observable which will output the timings. You can do this by wrapping your computation with Observable#using:
public class TimerExample {
public static void main(String[] args) {
final PublishSubject<Long> timings = PublishSubject.create();
final Observable<List<Integer>> list = Observable
.just(1, 2, 3)
.flatMap(TimerExample::longRunningComputation)
.toList();
// using() creates the Timer when `timed` is subscribed to and calls Timer::time
// when the sequence terminates, pushing the elapsed time onto `timings`
final Observable<List<Integer>> timed
= Observable.using(() -> new Timer(timings), (t) -> list, Timer::time);
timings.subscribe(time -> System.out.println("Time: " + time + "ms"));
List<Integer> ints = timed.toBlocking().last();
System.out.println("ints: " + Joiner.on(", ").join(ints));
ints = timed.toBlocking().last();
System.out.println("ints: " + Joiner.on(", ").join(ints));
}
private static Observable<Integer> longRunningComputation(Integer i) {
return Observable.timer(1, TimeUnit.SECONDS).map(ignored -> i);
}
public static class Timer {
private final long startTime;
private final Observer<Long> timings;
public Timer(Observer<Long> timings) {
this.startTime = System.currentTimeMillis();
this.timings = timings;
}
public void time() {
timings.onNext(System.currentTimeMillis() - startTime);
}
}
}
The timings are in this case printed to the console, but you can do with them as you please:
Time: 1089ms
ints: 2, 1, 3
Time: 1003ms
ints: 1, 3, 2
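For completeness, these are the imports the example relies on (a sketch; package names per RxJava 1.x and Guava):
import java.util.List;
import java.util.concurrent.TimeUnit;
import rx.Observable;
import rx.Observer;
import rx.subjects.PublishSubject;
import com.google.common.base.Joiner;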

I think this is what you want. From your code I split the production of values, Observable.range (which stands in for the Observable.just in your sample), from the pipeline to measure; in this case I added some fake computation.
The idea is to wrap the pipeline you want to measure inside a single outer flatMap and attach a stopwatch to it.
Observable.range(1, 10_000)
.nest()
.flatMap(
o -> {
Observable<Integer> pipelineToMeasure = o.flatMap(i -> {
Random random = new Random(73);
try {
TimeUnit.MILLISECONDS.sleep(random.nextInt(5));
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
return Observable.just(i);
});
Stopwatch measure = Stopwatch.createUnstarted();
return pipelineToMeasure
.doOnSubscribe(measure::start)
.doOnTerminate(() -> {
measure.stop();
System.out.println(measure);
});
}
)
.toBlocking()
.forEach(System.out::println);
Just to avoid confusion: I used nest() so that I don't have to recreate the Observable myself inside the outer flatMap.
Also, I'm using the Stopwatch class from the Guava library.
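If you haven't used Guava's Stopwatch before, a minimal usage sketch (not part of the original answer):
Stopwatch watch = Stopwatch.createStarted(); // or createUnstarted() followed by start()
// ... work to measure ...
watch.stop();
System.out.println(watch); // human-readable output, e.g. "12.34 ms"
long millis = watch.elapsed(TimeUnit.MILLISECONDS); // numeric value in a chosen unit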
To give more information, here is possible code to measure inside the forEach statement when blocking.
MeasurableAction1<Integer> measuring = MeasurableAction1.measure(System.out::println);
Observable
.just(1, 2, 3)
.flatMap(Observable::just)
.toBlocking()
.forEach(measuring.start());
System.out.println("took " + measuring.stop().elapsed(TimeUnit.SECONDS) + "s");
And the measuring class:
private static class MeasurableAction1<T> implements Action1<T> {
private Stopwatch measure = Stopwatch.createUnstarted();
private Action1<? super T> action;
public MeasurableAction1(Action1<? super T> action) {
this.action = action;
}
@Override
public void call(T t) {
action.call(t);
}
public MeasurableAction1<T> start() {
measure.start();
return this;
}
public MeasurableAction1<T> stop() {
measure.stop();
return this;
}
public long elapsed(TimeUnit desiredUnit) {
return measure.elapsed(desiredUnit);
}
public static <T> MeasurableAction1<T> measure(Action1<? super T> action) {
return new MeasurableAction1<>(action);
}
}
And better, without blocking, using a subscriber; note that .subscribe offers more options than the .forEach alias (whether blocking or not):
Observable
.just(1, 2, 3)
.flatMap(Observable::just)
.subscribe(MeasuringSubscriber.measuringSubscriber(
System.out::println,
System.out::println,
System.out::println
));
And the subscriber:
private static class MeasuringSubscriber<T> extends Subscriber<T> {
private Stopwatch measure = Stopwatch.createUnstarted();
private Action1<? super T> onNext;
private final Action1<Throwable> onError;
private final Action0 onComplete;
public MeasuringSubscriber(Action1<? super T> onNext, Action1<Throwable> onError, Action0 onComplete) {
this.onNext = onNext;
this.onError = onError;
this.onComplete = onComplete;
}
@Override
public void onCompleted() {
try {
onComplete.call();
} finally {
stopAndPrintMeasure();
}
}
@Override
public void onError(Throwable e) {
try {
onError.call(e);
} finally {
stopAndPrintMeasure();
}
}
@Override
public void onNext(T item) {
onNext.call(item);
}
@Override
public void onStart() {
measure.start();
super.onStart();
}
private void stopAndPrintMeasure() {
measure.stop();
System.out.println("took " + measure);
}
private static <T> MeasuringSubscriber<T> measuringSubscriber(final Action1<? super T> onNext, final Action1<Throwable> onError, final Action0 onComplete) {
return new MeasuringSubscriber<>(onNext, onError, onComplete);
}
}
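As a follow-up not in the original answer: the same non-blocking measurement can be sketched without a custom Subscriber, using doOnSubscribe/doOnTerminate (RxJava 1.x and Guava's Stopwatch assumed):
Stopwatch watch = Stopwatch.createUnstarted();
Observable
.just(1, 2, 3)
.flatMap(Observable::just)
.doOnSubscribe(watch::start) // start when the subscription happens
.doOnTerminate(() -> System.out.println("took " + watch.stop())) // fires before onCompleted or onError
.subscribe(System.out::println);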

Related

Flink: getResult() not getting called in AggregateFunction

I am trying to create a TumblingWindow on a stream of continuous data and create aggregates within the window. But for some reason, the getResult() does not get called.
public class MyAggregator implements AggregateFunction<Event, MyMetrics, MyMetrics> {
@Override
public MyMetrics createAccumulator() {
return new MyMetrics(0L, 0L);
}
@Override
public MyMetrics add(Event value, MyMetrics accumulator) {
Instant previousValue = ...;
if (previousValue != null) {
Long myWay = ...;
accumulator.setMyWay(myWay);
}
return accumulator;
}
@Override
public MyMetrics getResult(MyMetrics accumulator) {
System.out.println("Inside getResult()");
return accumulator;
}
@Override
public MyMetrics merge(MyMetrics acc1, MyMetrics acc2) {
return new MyMetrics(
acc1.getMyWay() + acc2.getMyWay());
}
}
Note: event.getClientTime() returns an Instant object.
private static WatermarkStrategy<MyEvent> getWatermarkStrategy() {
return WatermarkStrategy
.<MyEvent>forBoundedOutOfOrderness(Duration.ofMinutes(10))
.withTimestampAssigner(
(event, timestamp) ->
event.getClientTime().toEpochMilli()
);
}
public static void main(String[] args) {
DataStream<MyEvent> watermarkedData = actuals
.assignTimestampsAndWatermarks(
getWatermarkStrategy()
).name("addWatermark");
final OutputTag<MyEvent> lateOutputTag = new OutputTag<MyEvent>("late-data"){};
SingleOutputStreamOperator<OutputModel> output_data = watermarkedData
.keyBy("input_key")
.window(TumblingEventTimeWindows.of(Time.hours(1)))
.sideOutputLateData(lateOutputTag)
.aggregate(new MyAggregator())
.name("AggregationRollUp");
output_data.addSink(new PrintSinkFunction<>());
}
Any pointers as to what I am missing here would be helpful.
First, check the timing of the data to see whether it meets the window trigger conditions.
Second, you could run a test by reducing the window size from 1 hour to 1 minute and the allowed out-of-orderness from 10 minutes to 30 seconds, as sketched below.
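A minimal sketch of that test, reusing the classes from the question:
// shorter out-of-orderness bound so watermarks advance faster during the test
WatermarkStrategy<MyEvent> testStrategy = WatermarkStrategy
.<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(30))
.withTimestampAssigner((event, timestamp) -> event.getClientTime().toEpochMilli());
// smaller window so it fires after roughly one minute of event time
DataStream<MyEvent> testData = actuals.assignTimestampsAndWatermarks(testStrategy);
testData
.keyBy("input_key")
.window(TumblingEventTimeWindows.of(Time.minutes(1)))
.aggregate(new MyAggregator())
.addSink(new PrintSinkFunction<>());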

Use TumblingWindow on join operation but no element is transported to my JoinFunction

public class FlinkWindowTest {
public static long timestamp = 1496301598L;
public static void main(String[] args) throws Exception {
// get the execution environment
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// get input data by connecting to the socket
SourceFunction<String> out = new OutSource();
DataStream<String> text = env.addSource(out);
// parse the data
DataStream<WordWithCount> windowCounts = text
.flatMap(new FlatMapFunction<String, WordWithCount>() {
public void flatMap(String value, Collector<WordWithCount> out) {
for (String word : value.split(" ")) {
out.collect(new WordWithCount(word, 1L));
}
}
});
//assign timestamp
windowCounts = windowCounts.assignTimestampsAndWatermarks(new MyTimestampExtractor(Time.seconds(0)));
windowCounts.keyBy(new MyKeySelector())
.join(windowCounts)
.where(new MyKeySelector()).equalTo(new MyKeySelector())
.window(TumblingEventTimeWindows.of(Time.seconds(10)))
.apply(new JoinFunction<WordWithCount, WordWithCount, Object>() {
public Object join(WordWithCount wordWithCount, WordWithCount wordWithCount2) throws Exception {
System.out.println("start join");
System.out.println(wordWithCount.toString());
System.out.println(wordWithCount2.toString());
WordWithCount wordWithCount3 = new WordWithCount(wordWithCount.word, wordWithCount.count + wordWithCount2.count);
System.out.println(wordWithCount3.toString());
return wordWithCount3;
}
});
env.execute("Window WordCount");
}
public static class MyKeySelector implements KeySelector<WordWithCount, String> {
public String getKey (WordWithCount wordWithCount) throws Exception {
return wordWithCount.word;
}
}
public static class MyTimestampExtractor extends BoundedOutOfOrdernessTimestampExtractor<WordWithCount> {
public MyTimestampExtractor(Time maxOutOfOrderness) {
super(maxOutOfOrderness);
}
public long extractTimestamp(WordWithCount wordWithCount) {
return wordWithCount.getTimeStamp();
}
}
public static class OutSource implements SourceFunction<String> {
private String[] str = {
"aa ff","bb gg","cc hh","dd kk"
};
public void run(SourceContext<String> sourceContext) throws Exception {
int index =0;
while (true) {
if(index == str.length)
index = 0;
sourceContext.collect(str[index]);
index++;
}
}
public void cancel() {
}
}
// Data type for words with count and timestamp
public static class WordWithCount {
public String word;
public long count;
public WordWithCount() {}
public long getTimeStamp() {
return timestamp;
}
public WordWithCount(String word, long count) {
this.word = word;
this.count = count;
++timestamp;
}
@Override
public String toString() {
return word + " : " + count;
}
}
}
This class is a demo. I create a SourceFunction to emit strings, then split them into words. Finally I use a join operation to join the stream with itself. I don't care about the count result.
The problem is that there is no output from my JoinFunction class. I think the output should be
start join
aa : 1
aa : 1
aa : 2
start join
........
but right now there is no output, because the elements stay in the window and are never emitted to the join function.
I have no idea what is wrong here. If anyone has advice, please share it.
:)
You forgot to set the time characteristic to event time:
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
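In the example above that call goes right after the environment is created; a minimal sketch:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// without this, the periodic timestamp extractor never emits watermarks,
// so the event-time window never fires and the JoinFunction is never called
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);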

Java 8 streams groupby and count multiple properties

I have an object Process that has a date and a boolean error indicator. I want to get a count of total processes and a count of processes with errors for each date. So for example Jun 01 will have counts 2, 1; Jun 02 will have 1, 0 and Jun 03 1, 1. The only way I have been able to do this is streaming twice to get the counts. I have tried implementing a custom collector but haven't been successful. Is there an elegant solution instead of my kludgy method?
final SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd");
final List<Process> processes = new ArrayList<>();
processes.add(new Process(sdf.parse("2016-06-01"), false));
processes.add(new Process(sdf.parse("2016-06-01"), true));
processes.add(new Process(sdf.parse("2016-06-02"), false));
processes.add(new Process(sdf.parse("2016-06-03"), true));
System.out.println(processes.stream()
.collect(
Collectors.groupingBy(Process::getDate, Collectors.counting()) ));
System.out.println(processes.stream().filter(order -> order.isHasError())
.collect(
Collectors.groupingBy(Process::getDate, Collectors.counting()) ));
private class Process {
private Date date;
private boolean hasError;
public Process(Date date, boolean hasError) {
this.date = date;
this.hasError = hasError;
}
public Date getDate() {
return date;
}
public boolean isHasError() {
return hasError;
}
}
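As a simpler alternative that is not in the original question, nesting partitioningBy inside groupingBy gives both counts in a single pass (a sketch, using the Process class above):
Map<Date, Map<Boolean, Long>> countsByDateAndError = processes.stream()
.collect(Collectors.groupingBy(Process::getDate,
Collectors.partitioningBy(Process::isHasError, Collectors.counting())));
// countsByDateAndError.get(date).get(true)  -> processes with errors
// countsByDateAndError.get(date).get(false) -> processes without errors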
Code after @glee8e's solution and @Holger's tips
Collector<Process, Result, Result> processCollector = Collector.of(
Result::new,
(r, p) -> {
r.increment(0);
if (p.isHasError()) {
r.increment(1);
}
}, (r1, r2) -> {
r1.add(0, r2.get(0));
r1.add(1, r2.get(1));
return r1;
});
Map<Date, Result> results = processes.stream().collect(groupingBy(Process::getDate, processCollector));
results.entrySet().stream().sorted(Comparator.comparing(Entry::getKey)).forEach(entry -> System.out
.println(String.format("date = %s, %s", sdf.format(entry.getKey()), entry.getValue())));
private class Result {
private AtomicIntegerArray array = new AtomicIntegerArray(2);
public int get(int index) {
return array.get(index);
}
public void increment(int index) {
array.getAndIncrement(index);
}
public void add(int index, int delta) {
array.addAndGet(index, delta);
}
@Override
public String toString() {
return String.format("totalProcesses = %d, totalErrors = %d", array.get(0), array.get(1));
}
}
It is preferable to add a POJO to store the result, otherwise the combiner function may look a bit obscure. I declared the POJO as public, but you can change that if you prefer to hide it.
public class Result {
public int total, error;
}
Main code:
// Add it somewhere in this file.
private static final Set<Collector.Characteristics> CH_ID = Collections.unmodifiableSet(EnumSet.of(Collector.Characteristics.IDENTITY_FINISH));
//...
// This is main processing code
Map<Date, Result> result = processes.stream().collect(groupingBy(Process::getDate, new Collector<Process, Result, Result>() {
@Override
public Supplier<Result> supplier() {
return Result::new;
}
@Override
public BiConsumer<Result, Process> accumulator() {
return (r, p) -> {
r.total++;
if (p.isHasError())
r.error++;
};
}
@Override
public BinaryOperator<Result> combiner() {
return (r1, r2) -> {
r1.total += r2.total;
r1.error += r2.error;
return r1;
};
}
@Override
public Function<Result, Result> finisher() {
return Function.identity();
}
@Override
public Set<Characteristics> characteristics() {
return CH_ID;
}
}));
PS: I assume you have a static import of java.util.stream.Collectors.* (for groupingBy).
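A short usage sketch that is not part of the original answer, printing the grouped counts (the expected numbers follow from the sample data in the question):
result.forEach((date, r) ->
System.out.println(sdf.format(date) + ": total = " + r.total + ", errors = " + r.error));
// with the sample data this prints (in some map order):
// 2016-06-01: total = 2, errors = 1
// 2016-06-02: total = 1, errors = 0
// 2016-06-03: total = 1, errors = 1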

Supplying call latency as an IntStream

I am trying to make use of Java 8 and streams, and one of the things I am trying to replace is a system we have where we:
1. Use an aspect to measure call latency (per configured period of time) to our webservices, and then
2. Feed those results into a Complex Event Processor (Esper) so that
3. We can send out alert notifications.
So, one step at a time. For the first step, I need to produce a stream (I think) that allows me to feed those latency numbers to the existing listeners, understanding that getting the next number in the series might have to wait until there is a call.
How can I do that? Here is the latency aspect with comments.
public class ProfilingAspect {
private ProfilingAction action;
public ProfilingAspect(ProfilingAction action) {
this.action = action;
}
public Object doAroundAdvice(ProceedingJoinPoint jp) throws Throwable{
long startTime = System.currentTimeMillis();
Object retVal = null;
Throwable error = null;
try{
retVal = jp.proceed();
}catch (Throwable t){
error = t;
}
Class withinType = jp.getSourceLocation().getWithinType();
String methodName = jp.getSignature().getName();
long endTime = System.currentTimeMillis();
long runningTime = endTime - startTime;
// Let the IntStream know we have a new latency. Or really, we have an object
// stream with all this extra data
action.perform(withinType, methodName, jp.getArgs(), runningTime, error);
if( error != null ){
throw error;
}
return retVal;
}
}
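One way (a sketch, not from the original post; the class and method names here are hypothetical) to bridge the aspect callback into a stream is to back it with a BlockingQueue, so each element of the stream waits for the next measured call. A LongStream is used since the latencies are millisecond longs:
public class QueueBackedLatencySource {
private final BlockingQueue<Long> latencies = new LinkedBlockingQueue<>();
// called from the aspect's action.perform(...) with the measured running time
public void record(long runningTimeMillis) {
latencies.offer(runningTimeMillis);
}
// infinite stream; each element blocks until the next call has been measured
public LongStream asStream() {
return LongStream.generate(() -> {
try {
return latencies.take();
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
throw new RuntimeException(e);
}
});
}
}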
OK, I have a working example. It doesn't handle the situation where I have to buffer up results if the stream isn't being read fast enough, though. I am open to improvements.
public class LatencySupplier implements Supplier<SomeFancyObject> {
private Random r = new Random();
@Override
public SomeFancyObject get() {
try {
Thread.sleep(100 + r.nextInt(1000));
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
return new SomeFancyObject(10 + r.nextInt(1000));
}
}
public class SomeFancyObject {
private static String[] someGroups = {"Group1","Group2","Group3"};
private final String group;
private int value;
public SomeFancyObject(int value) {
this.value = value;
this.group = WSRandom.selectOne(someGroups);
}
public String getGroup() {
return group;
}
public int getValue() {
return value;
}
@Override
public String toString() {
return value + "";
}
}
My next step is to create a stream bucketed by time so I can compute things like a 5-minute average; a rough sketch of a per-group average follows the example below.
public class Sample {
public static void main(String[] args) throws InterruptedException {
Stream<SomeFancyObject> latencyStream = Stream.generate(new LatencySupplier());
Map<Object,List<SomeFancyObject>> collect = latencyStream.limit(10).collect(Collectors.groupingBy(sfo -> sfo.getGroup()));
System.out.println(collect);
Object o = new Object();
synchronized (o){
o.wait();
}
}
}
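A rough sketch of that aggregation step, not from the original post: averaging per group over a fixed-size batch (a true time window would need extra plumbing around the Supplier):
Map<String, Double> avgByGroup = Stream.generate(new LatencySupplier())
.limit(20) // fixed batch in place of a 5-minute window
.collect(Collectors.groupingBy(SomeFancyObject::getGroup,
Collectors.averagingInt(SomeFancyObject::getValue)));
System.out.println(avgByGroup);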

EssentialFilter play framework java (logging time)

I'm trying to create a time-logging filter to monitor the time my requests take in Play Framework 2 using Java; however, the documentation on the Java side of Filters is weak.
Can anyone point me in the right direction on how to achieve this?
The Scala guide is found at
http://www.playframework.com/documentation/2.1.3/ScalaHttpFilters
So you would start with a Filter like this:
public class TimeLoggingFilter implements EssentialFilter {
public EssentialAction apply(final EssentialAction next) {
return new TimeLoggingAction() {
@Override
public EssentialAction apply() {
return next.apply();
}
@Override
public Iteratee<byte[], SimpleResult> apply(final RequestHeader rh) {
final long startTime = System.currentTimeMillis();
return next.apply(rh).map(new AbstractFunction1<SimpleResult, SimpleResult>() {
@Override
public SimpleResult apply(SimpleResult v1) {
long time = logTime(rh, startTime);
List<Tuple2<String, String>> list = new ArrayList<Tuple2<String, String>>();
Tuple2<String, String> t =
new Tuple2<String, String>("Request-Time",
String.valueOf(time));
list.add(t);
Seq<Tuple2<String, String>> seq = Scala.toSeq(list);
return v1.withHeaders(seq);
}
@Override
public <A> Function1<SimpleResult, A> andThen(Function1<SimpleResult, A> g) {
return g;
}
@Override
public <A> Function1<A, SimpleResult> compose(Function1<A, SimpleResult> g) {
return g;
}
}, Execution.defaultExecutionContext());
}
private long logTime(RequestHeader request, long startTime) {
long endTime = System.currentTimeMillis();
long requestTime = endTime - startTime;
Logger.info(request.uri() + " from " + request.remoteAddress() + " took " + requestTime + " ms");
return requestTime;
}
};
}
public abstract class TimeLoggingAction extends
AbstractFunction1<RequestHeader, Iteratee<byte[], SimpleResult>>
implements EssentialAction {}
}
and then hook it up in your Global.java:
public <T extends EssentialFilter> Class<T>[] filters() {
return new Class[] { TimeLoggingFilter.class };
}
I was looking for a similar example today myself, and didn't find anything - but this appears to work.
