Group by object property in java flux

Given the following data structure Data and a Flux<Data>, what is the idiomatic way to group the elements into a series of lists based on some property?
import org.reactivestreams.Publisher;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;
import java.util.function.Predicate;
class Scratch {
private static class Data {
private Integer key;
private String value;
public Data(Integer key, String value) {
this.key = key;
this.value = value;
}
public Integer getKey() {
return key;
}
public String getValue() {
return value;
}
public static Data of(Integer key, String value) {
return new Data(key, value);
}
@Override
public String toString() {
return value;
}
}
public static void main(String[] args) {
Flux<Data> test = Flux.just(
Data.of(1, "Hello"),
Data.of(1, "world"),
Data.of(2, "How"),
Data.of(2, "are"),
Data.of(2, "you"),
Data.of(3, "Bye"));
test.bufferUntil(new Predicate<Data>() {
Integer prev = null;
@Override
public boolean test(Data next) {
boolean collect = prev != null && !Objects.equals(prev, next.getKey());
prev = next.getKey();
return collect;
}
}, true).subscribe(e -> System.out.println(e.toString()));
}
}
Output:
[Hello, world]
[How, are, you]
[Bye]
I am aware of the groupBy function on Flux, but that gives me a Flux again, not a list. The solution described above works, but it does not feel 100% idiomatic because I had to use an anonymous class instead of a lambda. I could have used a lambda with an AtomicReference declared outside it, but that does not feel 100% right either. Any suggestions?

You can also use collectMultimap, which gives you a Map<K, Collection<T>>. In this case collectMultimap returns a Mono<Map<Integer, Collection<Data>>>:
test.collectMultimap( Data::getKey )
.subscribe( dataByKey -> System.out.println( dataByKey.toString() ) );
Output:
{1=[Hello, world], 2=[How, are, you], 3=[Bye]}

Here is a solution using the groupBy operator. I group the data by the common key. groupBy gives a Flux of GroupedFlux. GroupedFlux is a subclass of Flux, so I apply flatMap and convert each GroupedFlux to a List<Data> with the collectList operator. This yields a Flux<List<Data>>, which I then subscribe to and print, as requested.
test.groupBy(Data::getKey)
.flatMap(Flux::collectList)
.subscribe(listOfStringsHavingDataWithSameKey -> System.out.println(listOfStringsHavingDataWithSameKey.toString()));
Do check out the documentation for Flux and GroupedFlux.

Related

Is there a way to convert a String to a Java type using Jackson and/or one of its associated libraries (csv, json, etc.)

Is there a mechanism to apply a standard set of checks to detect and then transform a String to the detected type, using one of Jackson's standard text related libs (csv, json, or even jackson-core)? I can imagine using it along with a label associated with that value (CSV header, for example) to do something sorta like the following:
JavaTypeAndValue typeAndValue = StringToJavaType.fromValue(Object x, String label);
typeAndValue.type() // FQN of Java type, maybe
typeAndValue.label() // where label might be a column header value, for example
typeAndValue.value() // returns Object of typeAndValue.type()
A set of 'extractors' would be required to apply the transform, and the consumer of the class would have to be aware of the 'ambiguity' of the 'Object' return type, but still capable of consuming and using the information, given its purpose.
The example I'm currently thinking about involves constructing SQL DDL or DML, like a CREATE Table statement using the information from a List derived from evaluating a row from a csv file.
After more digging, hoping to find something out there, I wrote the start of what I had in mind.
Please keep in mind that my intention here isn't to present something 'complete', as I'm sure there are several things missing here, edge cases not addressed, etc.
The parse(List<Map<String, String>> rows, List<String> headers) signature comes from the idea that this could be a sample of rows from a CSV file read in with Jackson, for example.
Again, this isn't complete, so I'm not looking to pick at everything that's wrong with the following. The question isn't 'how would we write this?', it's 'is anyone familiar with something that exists that does something like the following?'.
import gms.labs.cassandra.sandbox.extractors.Extractor;
import gms.labs.cassandra.sandbox.extractors.Extractors;
import lombok.Builder;
import lombok.Getter;
import lombok.Setter;
import lombok.experimental.Accessors;
@Accessors(fluent=true, chain=true)
public class TypeAndValue
{
@Builder
TypeAndValue(Class<?> type, String rawValue){
this.type = type;
this.rawValue = rawValue;
label = "NONE";
}
@Getter
final Class<?> type;
@Getter
final String rawValue;
@Setter
@Getter
String label;
public Object value(){
return Extractors.extractorFor(this).value(rawValue);
}
static final String DEFAULT_LABEL = "NONE";
}
A simple parser, where the parse came from a context where I have a List<Map<String,String>> from a CSVReader.
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;
import java.util.*;
import java.util.function.BiFunction;
public class JavaTypeParser
{
public static final List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers)
{
List<TypeAndValue> typesAndVals = new ArrayList<TypeAndValue>();
for (Map<String, String> row : rows) {
for (String header : headers) {
String val = row.get(header);
TypeAndValue typeAndValue =
// isNull, isBoolean, isNumber
isNull(val).orElse(isBoolean(val).orElse(isNumber(val).orElse(_typeAndValue.apply(String.class, val).get())));
typesAndVals.add(typeAndValue.label(header));
}
}
return typesAndVals;
}
public static Optional<TypeAndValue> isNumber(String val)
{
if (!NumberUtils.isCreatable(val)) {
return Optional.empty();
} else {
return _typeAndValue.apply(NumberUtils.createNumber(val).getClass(), val);
}
}
public static Optional<TypeAndValue> isBoolean(String val)
{
boolean bool = (val.equalsIgnoreCase("true") || val.equalsIgnoreCase("false"));
if (bool) {
return _typeAndValue.apply(Boolean.class, val);
} else {
return Optional.empty();
}
}
public static Optional<TypeAndValue> isNull(String val){
if(Objects.isNull(val) || val.equals("null")){
return _typeAndValue.apply(ObjectUtils.Null.class,val);
}
else{
return Optional.empty();
}
}
static final BiFunction<Class<?>, String, Optional<TypeAndValue>> _typeAndValue = (type, value) -> Optional.of(
TypeAndValue.builder().type(type).rawValue(value).build());
}
Extractors. Just an example of how the 'extractors' for the values (contained in strings) might be registered somewhere for lookup. They could be referenced any number of other ways, too.
import gms.labs.cassandra.sandbox.TypeAndValue;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.math.NumberUtils;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;
import java.util.List;
public class Extractors
{
private static final List<Class> NUMS = Arrays.asList(
BigInteger.class,
BigDecimal.class,
Long.class,
Integer.class,
Double.class,
Float.class);
public static final Extractor<?> extractorFor(TypeAndValue typeAndValue)
{
if (NUMS.contains(typeAndValue.type())) {
return (Extractor<Number>) value -> NumberUtils.createNumber(value);
} else if(typeAndValue.type().equals(Boolean.class)) {
return (Extractor<Boolean>) value -> Boolean.valueOf(value);
} else if(typeAndValue.type().equals(ObjectUtils.Null.class)) {
return (Extractor<ObjectUtils.Null>) value -> null; // should we just return the raw value. some frameworks coerce to null.
} else if(typeAndValue.type().equals(String.class)) {
return (Extractor<String>) value -> typeAndValue.rawValue(); // just return the raw value. some frameworks coerce to null.
}
else{
throw new RuntimeException("unsupported");
}
}
}
I ran this from within the JavaTypeParser class, for reference.
public static void main(String[] args)
{
Optional<TypeAndValue> num = isNumber("-1230980980980980980980980980980988009808989080989809890808098292");
num.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
System.out.println(typeAndVal.value().getClass()); // BigInteger
});
num = isNumber("-123098098097987");
num.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
System.out.println(typeAndVal.value().getClass()); // Long
});
num = isNumber("-123098.098097987"); // Double
num.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
System.out.println(typeAndVal.value().getClass());
});
num = isNumber("-123009809890898.0980979098098908080987"); // BigDecimal
num.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
System.out.println(typeAndVal.value().getClass());
});
Optional<TypeAndValue> bool = isBoolean("FaLse");
bool.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
System.out.println(typeAndVal.value().getClass()); // Boolean
});
Optional<TypeAndValue> nulll = isNull("null");
nulll.ifPresent(typeAndVal -> {
System.out.println(typeAndVal.value());
//System.out.println(typeAndVal.value().getClass()); would throw null pointer exception
System.out.println(typeAndVal.type()); // ObjectUtils.Null (from apache commons lang3)
});
}

I don't know of any library that does this, and I have never seen anything work this way on an open set of possible types.
For a closed set of types (you know all the possible output types), the easiest way would be to write the class FQN into the string (from your description I couldn't tell whether you control the written string).
That could be the complete FQN, or an alias for it.
Otherwise I think there is no way around writing all the checks.
It will also be very delicate once you think about edge cases.
Suppose you use JSON as the serialization format in the string: how would you differentiate between a String value like Hello World and a Date written in some ISO format (e.g. 2020-09-22)? To do it you would need to introduce a priority among the checks (first check whether it is a date using some regex, if not go on to the next check, with the plain String check last).
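The priority idea can be sketched with an ordered list of checks, most specific first and the catch-all String last. This is only a minimal sketch using the JDK: the class name PriorityDetection and the method detect are made up for illustration, LocalDate.parse stands in for the regex mentioned above, and the input is assumed to be non-null.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class PriorityDetection {
    // Hypothetical ordered registry: checks run in insertion order,
    // so more specific types must be registered before the String catch-all.
    private static final Map<String, Function<String, Object>> CHECKS = new LinkedHashMap<>();

    static {
        CHECKS.put("date", PriorityDetection::tryDate);
        CHECKS.put("number", PriorityDetection::tryNumber);
        CHECKS.put("string", s -> s); // catch-all, must stay last
    }

    // Returns a LocalDate for ISO dates such as 2020-09-22, or null if the text is not a date.
    private static Object tryDate(String s) {
        try {
            return LocalDate.parse(s);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    // Returns a BigDecimal for numeric text, or null if the text is not a number.
    private static Object tryNumber(String s) {
        try {
            return new BigDecimal(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /** Returns the first successful conversion, in priority order (raw must be non-null). */
    public static Object detect(String raw) {
        for (Function<String, Object> check : CHECKS.values()) {
            Object converted = check.apply(raw);
            if (converted != null) {
                return converted;
            }
        }
        throw new IllegalStateException("unreachable: the string check always matches");
    }

    public static void main(String[] args) {
        System.out.println(detect("2020-09-22").getClass().getSimpleName());
        System.out.println(detect("42.5").getClass().getSimpleName());
        System.out.println(detect("Hello World").getClass().getSimpleName());
    }
}
```

Swapping the registration order of "date" and "string" would make every value come back as a String, which is exactly the ambiguity described above.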
What if you have two objects:
class Person { // class name assumed; the original header line is missing
String name;
String surname;
}
class Employee {
String name;
String surname;
Integer salary;
}
And you receive a serialization value of the second type, but with a null salary (null or the property missing completely).
How can you tell the difference between a set or a list?
I don't know if what you intended is so dynamic, or you already know all the possible deserializable types, maybe some more details in the question can help.
UPDATE
Just saw the code; now it is clearer.
If you know all the possible outputs, that is the way to go.
The only change I would make is to abstract the extraction process, so that adding new types is easier.
For that, I think a small change is needed, like:
interface Extractor {
Boolean match(String value);
Object extract(String value);
}
Then you can define an extractor per type:
class NumberExtractor implements Extractor {
public Boolean match(String val) {
return NumberUtils.isCreatable(val);
}
public Object extract(String value) {
return NumberUtils.createNumber(value);
}
}
class StringExtractor implements Extractor {
public Boolean match(String s) {
return true; //<-- catch all
}
public Object extract(String value) {
return value;
}
}
And then register and automatize the checks:
public class JavaTypeParser {
    // Order matters: the first matching extractor wins, so the catch-all StringExtractor goes last.
    private static final List<Extractor> EXTRACTORS = List.of(
        new NullExtractor(),
        new BooleanExtractor(),
        new NumberExtractor(),
        new StringExtractor()
    );

    public static List<TypeAndValue> parse(List<Map<String, String>> rows, List<String> headers) {
        List<TypeAndValue> typesAndVals = new ArrayList<>();
        for (Map<String, String> row : rows) {
            for (String header : headers) {
                String val = row.get(header);
                typesAndVals.add(extract(header, val));
            }
        }
        return typesAndVals;
    }

    public static TypeAndValue extract(String header, String value) {
        for (Extractor e : EXTRACTORS) {
            if (e.match(value)) {
                Object v = e.extract(value);
                return TypeAndValue.builder()
                    .label(header)
                    .value(v) //<-- you can put the real value here, and remove the type field
                    .build();
            }
        }
        throw new IllegalStateException("Can't find an extractor for: " + header + " | " + value);
    }
}
To parse CSV I would suggest https://commons.apache.org/proper/commons-csv, as CSV parsing can run into nasty edge cases.

What you are actually trying to do is write a parser. You translate a fragment into a parse tree. The parse tree captures the type as well as the value. For hierarchical types like arrays and objects, each tree node contains child nodes.
One of the most commonly used parser generators (albeit a bit overkill for your use case) is ANTLR. ANTLR brings out-of-the-box support for JSON.
I recommend taking the time to digest all the involved concepts. Even though it might seem like overkill initially, it quickly pays off when you do any kind of extension. Changing a grammar is relatively easy; the generated code is quite complex. Additionally, all parser generators verify your grammars and report logic errors.
Of course, if you are limiting yourself to just parsing CSV or JSON (and not both at the same time), you should rather use the parser of an existing library. For example, Jackson has ObjectMapper.readTree to get the parse tree. You could also use ObjectMapper.readValue(<fragment>, Object.class) to simply get the canonical Java classes.

Try this:
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Iterator;
import java.util.Map;
String j = // json string;
JsonFactory jsonFactory = new JsonFactory();
ObjectMapper jsonMapper = new ObjectMapper(jsonFactory);
JsonNode jsonRootNode = jsonMapper.readTree(j);
Iterator<Map.Entry<String,JsonNode>> jsonIterator = jsonRootNode.fields();
while (jsonIterator.hasNext()) {
Map.Entry<String,JsonNode> jsonField = jsonIterator.next();
String k = jsonField.getKey();
String v = jsonField.getValue().toString();
...
}

Invoking methods in Comparator comparing method

I have an ArrayList<Task> named tasks which I want to print, sorted according to each field of the Task object.
Task class has three private fields { String title, Date date, String project }.
The class has some public get methods that allow other classes to read the fields { getTitle(), getDate(), getProject(), getTaskDetails() }.
I have a simple method that uses a stream to sort and print the ArrayList tasks:
tasks.stream()
.sorted(Comparator.comparing(Task::getProject))
.map(Task::getTaskDetails)
.forEach(System.out::println);
Instead of creating three different methods, one per getter, I wanted to use the Reflection API. But I am having trouble invoking the methods (getTitle(), getDate(), getProject()) inside the Comparator.comparing call:
I imported java.lang.reflect.Method.
I declared Method taskMethod = Task.class.getDeclaredMethod(methodName); where methodName is the String parameter holding the method name ("getTitle", "getDate" or "getProject").
Then I tried something like this, but it didn't work:
tasks.stream()
.sorted(Comparator.comparing(task -> {
try {
taskMethod.invoke(task);
} catch (Exception e) {
e.printStackTrace();
}
}))
.map(Task::getTaskDetails)
.forEach(System.out::println);
Is this possible at all? Or is there any simpler solution?
Only found this question but didn't solve my problem.
Thank you for any feedback.

The answer that you linked to basically contains the core idea, even though it refers to fields ("properties") and not to methods: Create a way of obtaining the desired value from the object, and then simply compare these values for two objects.
One could consider it as a duplicate, but I'm not sure.
In any case:
You should carefully think about whether reflection is the right approach here.
It might be much more elegant (i.e. less hacky) to generate the required comparators without reflection. This could be an enum where you attach the proper Comparator to each enum value:
enum TaskProperty {
TITLE(comparatorForTitle),
DATE(comparatorForDate), ...
}
// Using it:
tasks.stream().sorted(TaskProperty.TITLE.getComparator()).forEach(...);
Or maybe (less type safe, but a tad more flexible), using a map where you can look them up via a string, as in
// Put all comparators into a map
map.put("getDate", compareTaskByDate);
...
// Later, use the string to pick up the comparator from the map:
tasks.stream().sorted(map.get("getDate")).forEach(...);
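The enum variant can be made concrete along the following lines. This is a minimal sketch, not the asker's actual code: Task is a simplified stand-in (its date is an Integer), and the names SortByProperty and titlesSortedBy are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class SortByProperty {
    // Simplified stand-in for the question's Task class (assumption: date is an Integer here).
    public static class Task {
        private final String title;
        private final Integer date;
        private final String project;

        public Task(String title, Integer date, String project) {
            this.title = title;
            this.date = date;
            this.project = project;
        }

        public String getTitle() { return title; }
        public Integer getDate() { return date; }
        public String getProject() { return project; }
    }

    // Each constant carries the comparator for exactly one property.
    public enum TaskProperty {
        TITLE(Comparator.comparing(Task::getTitle)),
        DATE(Comparator.comparing(Task::getDate)),
        PROJECT(Comparator.comparing(Task::getProject));

        private final Comparator<Task> comparator;

        TaskProperty(Comparator<Task> comparator) {
            this.comparator = comparator;
        }

        public Comparator<Task> getComparator() {
            return comparator;
        }
    }

    // Returns the task titles in the order given by the chosen property.
    public static List<String> titlesSortedBy(List<Task> tasks, TaskProperty property) {
        return tasks.stream()
                .sorted(property.getComparator())
                .map(Task::getTitle)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("Write report", 3, "Work"));
        tasks.add(new Task("Buy milk", 1, "Home"));
        tasks.add(new Task("Call plumber", 2, "Home"));

        System.out.println(titlesSortedBy(tasks, TaskProperty.DATE));
        // A user-supplied string can still select the property, without reflection.
        System.out.println(titlesSortedBy(tasks, TaskProperty.valueOf("TITLE")));
    }
}
```

TaskProperty.valueOf throws an IllegalArgumentException for an unknown name, so a misspelled property fails loudly instead of silently returning null the way a map lookup would.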
If you have carefully thought this through, and really want to use the reflection based approach:
You can create a utility method that obtains the return value of a certain method for a given object. Then create a comparator that calls this method for the given objects, and compares the return values. For flexibility's sake, you can pass the return values to a downstream comparator (which can be naturalOrder by default).
An example of how this could be done is shown here. But the list of exceptions that are caught in getOptional should make clear that many things can go wrong here, and you should really consider a different approach:
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
public class SortWithReflection
{
public static void main(String[] args)
{
List<Task> tasks = new ArrayList<Task>();
tasks.add(new Task("AAA", 222));
tasks.add(new Task("BBB", 333));
tasks.add(new Task("CCC", 111));
System.out.println("By getTitle:");
tasks.stream()
.sorted(by("getTitle"))
.forEach(System.out::println);
System.out.println("By getDate:");
tasks.stream()
.sorted(by("getDate"))
.forEach(System.out::println);
System.out.println("By getDate, reversed");
tasks.stream()
.sorted(by("getDate", Comparator.naturalOrder().reversed()))
.forEach(System.out::println);
}
private static <T> Comparator<T> by(String methodName)
{
return by(methodName, Comparator.naturalOrder());
}
private static <T> Comparator<T> by(
String methodName, Comparator<?> downstream)
{
@SuppressWarnings("unchecked")
Comparator<Object> uncheckedDownstream =
(Comparator<Object>) downstream;
return (t0, t1) ->
{
Object r0 = getOptional(t0, methodName);
Object r1 = getOptional(t1, methodName);
return uncheckedDownstream.compare(r0, r1);
};
}
private static <T> T getOptional(
Object instance, String methodName)
{
try
{
Class<?> type = instance.getClass();
Method method = type.getDeclaredMethod(methodName);
Object object = method.invoke(instance);
@SuppressWarnings("unchecked")
T result = (T)object;
return result;
}
catch (NoSuchMethodException
| SecurityException
| IllegalAccessException
| IllegalArgumentException
| InvocationTargetException
| ClassCastException e)
{
e.printStackTrace();
return null;
}
}
static class Task
{
String title;
Integer date;
Task(String title, Integer date)
{
this.title = title;
this.date = date;
}
String getTitle()
{
return title;
}
Integer getDate()
{
return date;
}
@Override
public String toString()
{
return title + ": " + date;
}
}
}

Reflection should be the last option in most cases.
Your problem can be solved by passing a key extractor to your method instead of digging out properties with the Reflection API.
See the code below:
import lombok.Data;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
@Data
public class Task {
String title;
Instant date;
String project;
public Task(String title, Instant date, String project) {
this.title = title;
this.date = date;
this.project = project;
}
@Override
public String toString() {
return "Task{" +
"title='" + title + '\'' +
", date=" + date +
", project='" + project + '\'' +
'}';
}
public static void sort(Collection<Task> tasks, Function<Task, Comparable> keyExtractor) {
tasks.stream()
.sorted(Comparator.comparing(keyExtractor))
.forEach(System.out::println);
}
public static void main(String[] args) {
List<Task> tasks = new ArrayList<>(3);
tasks.add(new Task("title1", Instant.now().minusMillis(3), "project3"));
tasks.add(new Task("title2", Instant.now().minusMillis(1), "project2"));
tasks.add(new Task("title3", Instant.now().minusMillis(2), "project1"));
System.out.println("Sorted by title");
sort(tasks, Task::getTitle);
System.out.println("Sorted by date");
sort(tasks, Task::getDate);
System.out.println("Sorted by project");
sort(tasks, Task::getProject);
}
}
Running main prints:
Sorted by title
Task{title='title1', date=2019-10-09T13:42:04.301Z, project='project3'}
Task{title='title2', date=2019-10-09T13:42:04.303Z, project='project2'}
Task{title='title3', date=2019-10-09T13:42:04.302Z, project='project1'}
Sorted by date
Task{title='title1', date=2019-10-09T13:42:04.301Z, project='project3'}
Task{title='title3', date=2019-10-09T13:42:04.302Z, project='project1'}
Task{title='title2', date=2019-10-09T13:42:04.303Z, project='project2'}
Sorted by project
Task{title='title3', date=2019-10-09T13:42:04.302Z, project='project1'}
Task{title='title2', date=2019-10-09T13:42:04.303Z, project='project2'}
Task{title='title1', date=2019-10-09T13:42:04.301Z, project='project3'}

(Predicate<? super String> s) or (String s)

I have a TreeSet of Strings (hardcoded).
I want to check whether a given String parameter, e.g. "Person", is present in the TreeSet, returning true if it is and false otherwise.
I am confused by this Eclipse message about
(Predicate<? super String> s) vs (String s):
The method anyMatch(Predicate) in the type Stream is not applicable for the arguments (String)
Please advise.
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
public class SystemLabelValidator {
public static boolean ValidateSystemLabel( String s) {
String j = s;
boolean b = false;
Set <String> SystemLabels = new TreeSet<String>();
// Unique Strings
SystemLabels.add("Person");
SystemLabels.add("Player");
SystemLabels.add("Hospital");
SystemLabels.add("Nurse");
SystemLabels.add("Room");
System.out.println("\n==> Loop here.");
for (String temp : SystemLabels) {
System.out.println(temp);
if(SystemLabels.stream().anyMatch(j)) {
System.out.println("I need to return Boolean");
}
return b;
}
return b;
}
}

There is no need to use a Predicate here. To check whether the String is present in your TreeSet, just use:
return systemLabels.contains("Person");
If you still insist on using anyMatch, you can do:
public static boolean validateSystemLabel(String s) {
return systemLabels.stream().anyMatch(i -> i.equals(s));
}
Remember, a predicate expression needs to evaluate to a boolean value but in the code, you are passing in a String hence the compilation error.

The problem in your solution is this line:
SystemLabels.stream().anyMatch(j);
anyMatch() expects a Predicate as input, not a String.
But your problem has a simpler solution:
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
public class SystemLabelValidator {
    private static final Set<String> SYSTEM_LABELS = new TreeSet<>(Arrays.asList("Person", "Player", "Hospital", "Nurse", "Room"));
    public static boolean validateSystemLabel(String value) {
        return SYSTEM_LABELS.contains(value);
    }
}

The signature of anyMatch is
boolean anyMatch(Predicate<? super T> predicate)
In your case, the argument must be a Predicate<? super String>, that is, a function that takes a String and returns a boolean. Any of the following would be accepted:
Predicate<String> (e.g. String::isEmpty)
Predicate<Object> (e.g. Objects::isNull)
Predicate<CharSequence>
Predicate<Comparable<String>>
Predicate<Serializable>
You have attempted to give it a string, which does not match the signature. One way to fix this would be:
if(SystemLabels.stream().anyMatch(j::equals)) {
System.out.println("I need to return Boolean");
}
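To see the wildcard at work, here is a small sketch in which anyMatch on a Stream<String> accepts both a Predicate<String> and a Predicate<CharSequence>. The class name AnyMatchDemo and its two methods are invented for this example, and the candidate passed to containsLabel is assumed to be non-null.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

public class AnyMatchDemo {
    private static final Set<String> LABELS =
            new TreeSet<>(Arrays.asList("Person", "Player", "Hospital", "Nurse", "Room"));

    // candidate must be non-null, otherwise creating the bound method reference throws.
    public static boolean containsLabel(String candidate) {
        // A method reference bound to the candidate is a Predicate<String>.
        Predicate<String> sameAsCandidate = candidate::equals;
        return LABELS.stream().anyMatch(sameAsCandidate);
    }

    public static boolean hasLabelOfLength(int minLength) {
        // A Predicate<CharSequence> is accepted too, because CharSequence is a supertype of String.
        Predicate<CharSequence> longEnough = cs -> cs.length() >= minLength;
        return LABELS.stream().anyMatch(longEnough);
    }

    public static void main(String[] args) {
        System.out.println(containsLabel("Person"));
        System.out.println(containsLabel("Doctor"));
        System.out.println(hasLabelOfLength(8));
    }
}
```

For a plain membership test, LABELS.contains(candidate) remains the simpler choice; the stream version only pays off when the condition is more than equality.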

Is it possible to group elements without closing the stream?

Is it possible to group elements in a Stream, but then continue streaming instead of having to create a new stream from the EntrySet of the returned map?
For example, I can do this:
public static void main(String[] args) {
// map of access date to list of users
// Person is a POJO with first name, last name, etc.
Map<Date, List<Person>> dateMap = new HashMap<>();
// ...
// output, sorted by access date, then person last name
dateMap.entrySet().stream().sorted(Map.Entry.comparingByKey()).forEach(e -> {
Date date = e.getKey();
// group persons by last name and sort
// this part seems clunky
e.getValue().stream().collect(Collectors.groupingBy(Person::getLastName, Collectors.toSet()))
.entrySet().stream().sorted(Map.Entry.comparingByKey()).forEach(e2 -> {
// last name is the key
String lastName = e2.getKey();
Set<Person> personSet = e2.getValue();
float avgAge = calculateAverageAge(personSet);
int numPersons = personSet.size();
// write out row with date, lastName, avgAge, numPersons
});
});
}
Which works just fine, but seems a little clunky, especially the streaming into a map, and then immediately streaming on the entry set of that map.
Is there a way to group objects in a stream, but continue streaming?

You can shorten your code by using Map.forEach, downstream collectors, TreeMap, and IntSummaryStatistics.
By grouping into a TreeMap (instead of leaving it up to the groupingBy collector), you get the names sorted automatically. Instead of immediately getting the grouped map, you add a summarizingInt collector that turns the list of persons with the same name into IntSummaryStatistics of their ages.
public static void main(String[] args) {
Map<Date, List<Person>> dateMap = new HashMap<>();
dateMap.entrySet().stream().sorted(Map.Entry.comparingByKey()).forEach(e -> {
Date date = e.getKey();
e.getValue().stream()
.collect(Collectors.groupingBy(Person::getLastName,
TreeMap::new,
Collectors.summarizingInt(Person::getAge)))
.forEach((name, stats) -> System.out.println(date +" "+
name +" "+
stats.getAverage() +" "+
stats.getCount()));
});
}
If you have control over the type of the initial map, you could use TreeMap there as well, and shorten it further:
public static void main(String[] args) {
Map<Date, List<Person>> dateMap = new TreeMap<>();
dateMap.forEach((date, persons) -> { ...

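The grouping step can be tried in isolation with a sketch like the one below. Person here is a hypothetical minimal class with only a last name and an age (the question's POJO has more fields), and GroupingSketch and ageStatsByLastName are names invented for the example.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupingSketch {
    // Hypothetical minimal Person; only the fields needed for the grouping.
    public static class Person {
        private final String lastName;
        private final int age;

        public Person(String lastName, int age) {
            this.lastName = lastName;
            this.age = age;
        }

        public String getLastName() { return lastName; }
        public int getAge() { return age; }
    }

    // Groups by last name into a TreeMap (sorted keys) and summarizes the ages per group.
    public static Map<String, IntSummaryStatistics> ageStatsByLastName(List<Person> persons) {
        return persons.stream().collect(Collectors.groupingBy(
                Person::getLastName,
                TreeMap::new,
                Collectors.summarizingInt(Person::getAge)));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Smith", 30),
                new Person("Jones", 25),
                new Person("Smith", 40));

        // Map.forEach iterates the TreeMap in key order, so no second stream is needed.
        ageStatsByLastName(persons).forEach((name, stats) ->
                System.out.println(name + " " + stats.getAverage() + " " + stats.getCount()));
    }
}
```

The collector is still a terminal operation, so this does not keep the original stream open; it only removes the need to stream and sort the entry set afterwards.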
There are several different ways to interpret the question, but if we restate the question as, "Is it possible to group elements within a Stream without using a terminal operation and apply stream operations to the resulting groups within the same stream pipeline," then the answer is "Yes." In this restatement of the question, terminal operation is defined in the way that the Java 8 streams API defines it.
Here is an example that demonstrates this.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;
class StreamGrouper {
public static class GroupableObj<K extends Comparable<? super K>, T>
implements Comparable<GroupableObj<K, T>> {
private K key;
private T obj;
private Set<T> setOfObj;
public GroupableObj(K key, T obj) {
if (key == null) {
throw new NullPointerException("Key may not be null");
}
this.key = key;
this.obj = obj;
}
@Override
public int compareTo(GroupableObj<K, T> otherGroupable) {
return key.compareTo(otherGroupable.key);
}
@Override
public boolean equals(Object otherObj) {
if (otherObj == null) {
return false;
}
if (otherObj instanceof GroupableObj) {
GroupableObj<?, ?> otherGroupable =
(GroupableObj<?, ?>)otherObj;
return setOfObj == otherGroupable.setOfObj &&
key.equals(otherGroupable.key);
}
return false;
}
public Set<T> getGroup() {
return setOfObj;
}
public K getKey() {
return key;
}
public T getObject() {
return obj;
}
@Override
public int hashCode() {
return key.hashCode();
}
public void setGroup(Set<T> setOfObj) {
this.setOfObj = setOfObj;
}
}
public static class PeekGrouper<K extends Comparable<? super K>, T>
implements Consumer<GroupableObj<K, T>> {
private Map<K, Set<T>> groupMap;
public PeekGrouper() {
groupMap = new HashMap<>();
}
@Override
public void accept(GroupableObj<K, T> groupable) {
K key = groupable.getKey();
Set<T> group = groupMap.computeIfAbsent(key,
(k) -> new HashSet<T>());
groupable.setGroup(group);
group.add(groupable.getObject());
}
}
public static void main(String[] args) {
Function<Double, Long> myKeyExtractor =
(dblObj) -> Long.valueOf(
(long)(Math.floor(dblObj.doubleValue()*10.0)));
PeekGrouper<Long, Double> myGrouper = new PeekGrouper<>();
Random simpleRand = new Random(20190527L);
simpleRand.doubles(100).boxed().map((dblObj) ->
new GroupableObj<Long, Double>(
myKeyExtractor.apply(dblObj), dblObj)).peek(myGrouper).
distinct().sorted().
map(GroupableObj<Long, Double>::getGroup).
forEachOrdered((grp) -> System.out.println(grp));
}
}
In order to make a program that can be compiled and executed on its own, this example moves away from using the Person objects that are referenced in the question, but the grouping concept is the same, and the code from the question could turn into something like the following.
PeekGrouper<String, Person> myGrouper = new PeekGrouper<>();
e.getValue().stream().map((p) -> new GroupableObj<String, Person>(
p.getLastName(), p)).peek(myGrouper).distinct().sorted().
forEachOrdered(e2 -> {
String lastName = e2.getKey();
Set<Person> personSet = e2.getGroup();
float avgAge = calculateAverageAge(personSet);
int numPersons = personSet.size();
// write out row with date, lastName, avgAge, numPersons
});
Please note that in order for this example to work, it is required that the stream call both the distinct function (which reduces the stream to only a single instance of each group) and the sorted function (which ensures that the entire stream has been processed and the groups have been fully "collected" before processing continues). Also note that as implemented here GroupableObj is not safe to use with parallel streams. If the terminal operation of the stream does not require that the groups be fully "collected" when it processes the objects -- for example, if the terminal operation were something like Collectors.toList() -- then a call to sorted would not be required. The critical point is that any portion of the stream that sees the groups prior to a call to sorted and prior to the end of a terminal operation (including processing during a terminal operation) may see a group that is incomplete.
For the specific example in the question, it may be somewhat less time-efficient to sort the objects before grouping them if many of them are in the same group, but if you are willing to sort the objects before grouping them, you can achieve the same functionality without performing any streaming after doing the grouping. The following is a rewrite of the first example from this answer that demonstrates this.
import java.util.Comparator;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collector;
class PreSortOrderedGrouper {
public static void main(String[] args) {
Function<Double, Long> myKeyExtractor =
(dblObj) -> Long.valueOf(
(long)(Math.floor(dblObj.doubleValue()*10.0)));
Random simpleRand = new Random(20190527L);
Consumer<Set<Double>> groupProcessor =
(grp) -> System.out.println(grp);
simpleRand.doubles(100).boxed().sorted(
Comparator.comparing(myKeyExtractor)).
collect(Collector.of(HashSet<Double>::new,
(set, dblObj) -> {
if (set.isEmpty() || myKeyExtractor.apply(set.iterator().
next()).equals(myKeyExtractor.apply(dblObj))) {
set.add(dblObj);
} else {
groupProcessor.accept(set);
set.clear();
set.add(dblObj);
}
},
(setOne, setTwo) -> {
throw new UnsupportedOperationException();
},
(finalSet) -> {
groupProcessor.accept(finalSet);
return Integer.valueOf(0);
}));
}
}
I can't be sure that either of these examples will feel less "clunky" to you, but if the example in your question is a pattern you use frequently, you could probably adapt one or both of these examples in ways that will suit your purposes and, aside from a few utility classes, result in no more code than you are currently using.
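If the input is already ordered by key, as in the question, a plain loop is another option outside the stream API. The following is only a sketch under that assumption; ConsecutiveGrouper and groupConsecutive are hypothetical names, not part of any library or of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

class ConsecutiveGrouper {
    // Hypothetical helper: splits the input into runs of adjacent elements
    // that share the same key. Elements with equal keys that are not
    // adjacent end up in separate groups.
    static <T, K> List<List<T>> groupConsecutive(Iterable<T> items, Function<T, K> keyOf) {
        List<List<T>> groups = new ArrayList<>();
        List<T> current = null;
        K currentKey = null;
        for (T item : items) {
            K key = keyOf.apply(item);
            // Start a new group on the first element or whenever the key changes.
            if (current == null || !Objects.equals(currentKey, key)) {
                current = new ArrayList<>();
                groups.add(current);
                currentKey = key;
            }
            current.add(item);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
        // prints [[apple, avocado], [banana, blueberry], [cherry]]
        System.out.println(groupConsecutive(words, w -> w.charAt(0)));
    }
}
```

Unlike the collector-based version, this builds all groups in memory before returning them, so it trades streaming behaviour for simplicity.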

method returning either a collection or a single value

I have a class with various properties and I would like to write a wrapper method around them so that I can loop over them more easily.
Some properties return a collection of values, some a single value. And I'm looking for the best approach for this.
My first approach is to let the wrapper method return whatever the property getters return.
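import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Stream;
public class Test {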
public Object getValue(String propName) {
if ("attr1".equals(propName)) return getAttribute1();
else if ("attr2".equals(propName)) return getAttribute2();
else return null;
}
public List<String> getAttribute1() {
return Arrays.asList("Hello","World");
}
public String getAttribute2() {
return "Goodbye";
}
public static void main(String[] args) {
final Test test=new Test();
Stream.of("attr1","attr2")
.forEach(p-> {
Object o=test.getValue(p);
if (o instanceof Collection) {
((Collection<?>) o).forEach(v->System.out.println(v));
}
else {
System.out.println(o);
}
});
}
}
The drawback of this approach is that the caller has to check whether the result is a collection or not.
The other approach, seamless for the caller, is to always return a collection, i.e. the wrapper method wraps single values in a Collection. Here I use a HashSet, but one could imagine an ad hoc, single-element list.
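import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;
public class TestAlt {
public Collection<String> getValue(String propName) {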
if ("attr1".equals(propName))
return getAttribute1();
else if ("attr2".equals(propName)) {
Set<String> s = new HashSet<>();
s.add(getAttribute2());
return s;
}
else
return null;
}
public List<String> getAttribute1() {
return Arrays.asList("Hello", "World");
}
public String getAttribute2() {
return "Goodbye";
}
public static void main(String[] args) {
final TestAlt test = new TestAlt();
Stream.of("attr1", "attr2")
.forEach(p -> {
test.getValue(p).forEach(v -> System.out.println(v));
});
}
}
Performance-wise, design-wise, ... what's your opinion on these approaches? Do you have better ideas?

Well, you could pass the action to be performed on each attribute to the object and let the object decide on how to handle it. E.g.:
In class Test:
public void forEachAttribute(String propName, Handler h) {
if ("attr1".equals(propName)) {
// attr1 is a List<String>: pass each element to the handler
getAttribute1().forEach(o -> h.handle(o));
} else if ("attr2".equals(propName)) {
// attr2 is a single String: pass it to the handler directly
h.handle(getAttribute2());
}
}
And a class Handler with a method handle(String s) that does whatever you want to do with each value.
If you cannot edit Test, you can also move the function outside Test
public void forEachTestAttribute(Test t, String propName, Handler h)...
Performance-wise: This removes an if-clause
Design-wise: This removes a cast, but creates more classes.
Edit: It also maintains type safety, and if there are multiple kinds of attributes (String, int, etc.) you could add more handle methods and still keep that type safety.
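A runnable sketch of this idea follows. It assumes java.util.function.Consumer<String> stands in for the Handler class, and the names HandlerDemo and collect are invented for illustration; the getters mirror the ones in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class HandlerDemo {
    List<String> getAttribute1() {
        return Arrays.asList("Hello", "World");
    }

    String getAttribute2() {
        return "Goodbye";
    }

    // The object decides how to feed each attribute to the handler,
    // so the caller never needs instanceof checks or casts.
    void forEachAttribute(String propName, Consumer<String> handler) {
        if ("attr1".equals(propName)) {
            getAttribute1().forEach(handler);   // collection: one call per element
        } else if ("attr2".equals(propName)) {
            handler.accept(getAttribute2());    // single value: exactly one call
        }
        // Assumption: unknown property names are silently ignored.
    }

    // Hypothetical helper that gathers everything the handler receives.
    static List<String> collect(String... propNames) {
        HandlerDemo demo = new HandlerDemo();
        List<String> seen = new ArrayList<>();
        for (String name : propNames) {
            demo.forEachAttribute(name, seen::add);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints Hello, World and Goodbye on separate lines
        collect("attr1", "attr2").forEach(System.out::println);
    }
}
```

Using Consumer avoids creating an extra Handler type, at the cost of losing a domain-specific name for the callback.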

Regarding the design I would rewrite your code into this:
TestAlt.java
import java.util.*;
import java.util.stream.Stream;
public class TestAlt {
private Map<String, AttributeProcessor> map = AttributeMapFactory.createMap();
public Collection<String> getValue(String propName) {
return Optional
.ofNullable(map.get(propName))
.map(AttributeProcessor::getAttribute)
.orElse(Arrays.asList("default")); //to avoid unexpected NPE's
}
public static void main(String[] args) {
final TestAlt test = new TestAlt();
Stream.of("attr1", "attr2")
.forEach(p -> test.getValue(p).forEach(v -> System.out.println(v)));
}
}
AttributeMapFactory.java
import java.util.HashMap;
import java.util.Map;
public class AttributeMapFactory {
public static Map<String, AttributeProcessor> createMap() {
Map<String, AttributeProcessor> map = new HashMap<>();
map.put("attr1", new HiAttributeProcessor());
map.put("attr2", new ByeAttributeProcessor());
return map;
}
}
AttributeProcessor.java
import java.util.Collection;
public interface AttributeProcessor {
Collection<String> getAttribute();
}
HiAttributeProcessor.java
import java.util.Arrays;
import java.util.Collection;
public class HiAttributeProcessor implements AttributeProcessor{
@Override
public Collection<String> getAttribute() {
return Arrays.asList("Hello", "World");
}
}
ByeAttributeProcessor.java
import java.util.Arrays;
import java.util.Collection;
public class ByeAttributeProcessor implements AttributeProcessor{
@Override
public Collection<String> getAttribute() {
return Arrays.asList("Goodbye");
}
}
The main point is that you get rid of if-else statements using map and dynamic dispatch.
The main advantage of this approach is that your code becomes easier to change later. For a small program like this one it does not really matter and is overkill, but in a large enterprise application it becomes crucial.
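For a small program, the same map-based dispatch can probably be sketched without the extra classes by storing suppliers in the map. This is a variation on the answer, not its code: SupplierMapDemo is a hypothetical name, and returning an empty collection for unknown names (instead of a "default" entry) is an assumption made here.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class SupplierMapDemo {
    // Each property name maps to a supplier that always yields a collection,
    // so single values are wrapped once, at registration time.
    private final Map<String, Supplier<Collection<String>>> attributes = new HashMap<>();

    SupplierMapDemo() {
        attributes.put("attr1", () -> Arrays.asList("Hello", "World"));
        attributes.put("attr2", () -> Collections.singletonList("Goodbye"));
    }

    Collection<String> getValue(String propName) {
        Supplier<Collection<String>> supplier = attributes.get(propName);
        if (supplier == null) {
            // Assumption: unknown names yield an empty collection rather than null.
            return Collections.emptyList();
        }
        return supplier.get();
    }

    public static void main(String[] args) {
        SupplierMapDemo demo = new SupplierMapDemo();
        // prints Hello, World and Goodbye on separate lines
        for (String name : Arrays.asList("attr1", "attr2")) {
            demo.getValue(name).forEach(System.out::println);
        }
    }
}
```

Adding a property then means adding one map entry, which keeps the flexibility of the dispatch table without a class per attribute.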
