Java 8 Stream Map Reduce

I'm totally new to Java 8 streams and I'm trying to obtain the behavior described below:
class myVO {
    Long id;
    BigDecimal value;
    Date date;
    // getters/setters
}

myVO method(Map<Long, myVO> inputMap) {
    return inputMap.values().stream()
            .filter(x -> x.getValue().compareTo(BigDecimal.ZERO) > 0)
            .sorted(); //FIXME
}
I want to obtain a single myVO object that is the SUM of the BigDecimal values of the records that share the same date (the lowest one).
e.g.
xL, 10, 2015/07/07
xL, 15, 2015/07/08
xL, 20, 2015/07/07
xL, 25, 2015/07/09
result
xL, 30, 2015/07/07
N.B. the id (xL) is not an important field.
UPDATE -- adopted solution (although not in a single pass)

if (null != map && !map.isEmpty()) {
    Date closestDate = map.values().stream()
            .filter(t -> t.getDate() != null)
            .map(MyVO::getDate)
            .min(Comparator.naturalOrder()).orElse(null);
    myVO.setDate(closestDate);

    BigDecimal totalValue = map.values().stream()
            .filter(x -> x.getValue() != null && x.getValue().signum() != 0)
            .filter(t -> t.getDate().equals(closestDate))
            .map(MyVO::getValue)
            .reduce(BigDecimal::add).orElse(null);
    myVO.setValue(totalValue != null ? totalValue.setScale(2, BigDecimal.ROUND_HALF_DOWN) : totalValue);
}

Assuming inputMap has at least one entry, it can be done like this:

myVO method(Map<Long, myVO> inputMap) {
    Date minDate = inputMap.values().stream()
            .map(myVO::getDate)
            .min(Comparator.naturalOrder()).get();
    BigDecimal sum = inputMap.values().stream()
            .filter(t -> t.getDate().equals(minDate))
            .map(myVO::getValue)
            .reduce(BigDecimal::add).get();
    myVO myVOObj = new myVO();
    myVOObj.setDate(minDate);
    myVOObj.setValue(sum);
    myVOObj.setId(??);
    return myVOObj;
}

I wrote a custom collector to solve such tasks. It's available in my StreamEx library and is called MoreCollectors.maxAll(downstream). Using it you can solve the task in a single pass after some preparation.
First, your compareTo method is wrong. It never returns 0, and in practice it violates the contract (a.compareTo(a) == -1, which breaks the reflexivity and antisymmetry requirements). It can easily be fixed like this:
@Override
public int compareTo(myVO o) {
    return o.getDate().compareTo(this.getDate());
}
Next, let's add a myVO.merge() method which can merge two myVO objects according to your requirements:
public myVO merge(myVO other) {
myVO result = new myVO();
result.setId(getId());
result.setDate(getDate());
result.setValue(getValue().add(other.getValue()));
return result;
}
Now the result can be found like this:
Optional<myVO> result = inputMap.values().stream()
        .filter(x -> x.getValue().signum() > 0)
        .collect(MoreCollectors.maxAll(Collectors.reducing(myVO::merge)));
The resulting Optional will be empty if the input map is empty.
If you don't like to depend on third-party library, you can just check the source code of this collector and write something similar in your project.
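For example, a rough JDK-only single-pass equivalent can be sketched with Collector.of. This is only a sketch, not MoreCollectors.maxAll itself: it assumes the merge() method shown above and compares the dates directly so that the lowest date wins.

Collector<myVO, List<myVO>, Optional<myVO>> minAllMerging = Collector.of(
        ArrayList::new,
        (acc, vo) -> {
            if (acc.isEmpty()) {
                acc.add(vo);
                return;
            }
            int cmp = vo.getDate().compareTo(acc.get(0).getDate());
            if (cmp < 0) {              // strictly earlier date: start over
                acc.clear();
                acc.add(vo);
            } else if (cmp == 0) {      // same date: merge the values
                acc.set(0, acc.get(0).merge(vo));
            }                           // later date: ignore
        },
        (a, b) -> {
            if (a.isEmpty()) return b;
            if (b.isEmpty()) return a;
            int cmp = a.get(0).getDate().compareTo(b.get(0).getDate());
            if (cmp < 0) return a;
            if (cmp > 0) return b;
            a.set(0, a.get(0).merge(b.get(0)));
            return a;
        },
        acc -> acc.isEmpty() ? Optional.<myVO>empty() : Optional.of(acc.get(0)));

Optional<myVO> result = inputMap.values().stream()
        .filter(x -> x.getValue().signum() > 0)
        .collect(minAllMerging);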

The reduction isn't that complicated if you think about it. You have to specify a reduction function which:
if the dates of two elements differ, returns the object with the lower date
if they match, creates a result object containing the sum of the values
Compared to a two-step approach, it may perform more add operations whose results are dropped whenever a lower date appears later in the stream; on the other hand, only half as many date comparisons are performed.
MyVO method(Map<Long, MyVO> inputMap) {
    return inputMap.values().stream()
        .reduce((a, b) -> {
            int cmp = a.getDate().compareTo(b.getDate());
            if (cmp == 0) {
                MyVO r = new MyVO();
                r.setDate(a.getDate());
                r.setValue(a.getValue().add(b.getValue()));
                return r;
            }
            return cmp < 0 ? a : b;
        }).orElse(null);
}
The main reason it doesn't look concise is that it has to create a new MyVO instance holding the sum in the case of a matching date, as a reduction function must not modify the incoming value objects. And you didn't specify which constructors exist. If there were an appropriate constructor receiving a Date and a BigDecimal, the function could be almost a one-liner.
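For illustration only, assuming a hypothetical MyVO(Date date, BigDecimal value) constructor (not shown in the question), the reduction could shrink to roughly:

MyVO method(Map<Long, MyVO> inputMap) {
    return inputMap.values().stream()
        .reduce((a, b) -> a.getDate().before(b.getDate()) ? a
                        : b.getDate().before(a.getDate()) ? b
                        : new MyVO(a.getDate(), a.getValue().add(b.getValue())))
        .orElse(null);
}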
Note that this method will return an original MyVO object if there is only a single one with the lowest date.
Alternatively you can use a mutable reduction, always creating a new MyVO instance holding the result but only creating one instance per thread and modifying that new instance during the reduction:
MyVO method(Map<Long, MyVO> inputMap) {
    BiConsumer<MyVO, MyVO> c = (a, b) -> {
        Date date = a.getDate();
        int cmp = date == null ? 1 : date.compareTo(b.getDate());
        if (cmp == 0) {
            a.setValue(a.getValue().add(b.getValue()));
        } else if (cmp > 0) {
            a.setValue(b.getValue());
            a.setDate(b.getDate());
        }
    };
    return inputMap.values().stream().collect(() -> {
        MyVO r = new MyVO();
        r.setValue(BigDecimal.ZERO);
        return r;
    }, c, c);
}
Here, the Supplier could be a one-liner if an appropriate constructor exists (or if the initial value is guaranteed to be the non-null BigDecimal.ZERO)…
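For example, with the same assumed MyVO(Date, BigDecimal) constructor mentioned above (a sketch, not an existing constructor), the collect call would become:

return inputMap.values().stream()
    .collect(() -> new MyVO(null, BigDecimal.ZERO), c, c);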

Related

Return first non-null value

I have a number of functions:
String first(){}
String second(){}
...
String default(){}
Each can return a null value, except the default. Each function can take different parameters. For example, first could take no arguments, second could take in a String, third could take three arguments, etc. What I'd like to do is something like:
ObjectUtils.firstNonNull(first(), second(), ..., default());
The problem is that because of the function calls, this does eager evaluation, whereas I'd like to exit early, say after the second function (because the function calls can be expensive; think API calls, etc.). In other languages, you can do something similar to this:
return first() || second() || ... || default()
In Java, I know I can do something like:
String value;
if ((value = first()) == null || (value = second()) == null ...
return value;
That's not very readable IMO because of all the == null checks. ObjectUtils.firstNonNull() creates a collection first and then iterates, which is okay as long as the functions get evaluated lazily.
Suggestions? (besides doing a bunch of ifs)
String s = Stream.<Supplier<String>>of(this::first, this::second /*, ... */)
.map(Supplier::get)
.filter(Objects::nonNull)
.findFirst()
.orElseGet(this::defaultOne);
It stops on the first non-null value, or otherwise falls back to the value returned by defaultOne. As long as you stay sequential, you are safe. Of course this requires Java 8 or later.
The reason why it stops at the first occurrence of a non-null value is due to how the Stream handles each step. map is an intermediate operation, and so is filter. findFirst, on the other hand, is a short-circuiting terminal operation. So it keeps processing elements until one matches the filter. If no element matches, an empty Optional is returned and the orElseGet supplier is called.
this::first, etc. are just method references. If the methods are static, replace them with YourClassName::first, etc.
Here is an example for the case where the signatures of your methods differ:
String s = Stream.<Supplier<String>>of(() -> first("takesOneArgument"),
() -> second("takes", 3, "arguments")
/*, ... */)
.map(Supplier::get)
.filter(Objects::nonNull)
.findFirst()
.orElseGet(this::defaultOne);
Note that the Supplier is only evaluated when you call get on it. That way you get your lazy evaluation behaviour. The method-parameters within your supplier-lambda-expression must be final or effectively final.
This can be done pretty cleanly with a stream of Suppliers.
Optional<String> result = Stream.<Supplier<String>> of(
() -> first(),
() -> second(),
() -> third() )
.map( x -> x.get() )
.filter( s -> s != null)
.findFirst();
The reason this works is that despite appearances, the whole execution is driven by findFirst(), which pulls an item from filter(), which lazily pulls items from map(), which calls get() to handle each pull. findFirst() will stop pulling from the stream when one item has passed the filter, so subsequent suppliers will not have get() called.
Although I personally find the declarative Stream style cleaner and more expressive, you don't have to use Stream to work with Suppliers if you don't like the style:
Optional<String> firstNonNull(List<Supplier<String>> suppliers) {
    for (Supplier<String> supplier : suppliers) {
        String s = supplier.get();
        if (s != null) {
            return Optional.of(s);
        }
    }
    return Optional.empty();
}
It should be obvious how instead of returning Optional you could equally return a String, either returning null (yuk), a default string, or throwing an exception, if you exhaust options from the list.
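For instance, using the Optional-returning helper above (defaultOne() is the fallback method assumed in the earlier answer):

String s = firstNonNull(suppliers).orElseGet(this::defaultOne);             // fall back to a default
String t = firstNonNull(suppliers).orElseThrow(IllegalStateException::new); // or fail if all were null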
It isn't readable because you are dealing with a bunch of separate functions that don't express any kind of connection with each other. When you attempt to put them together, the lack of direction is apparent.
Instead try
public String getFirstValue() {
String value;
value = first();
if (value != null) return value;
value = second();
if (value != null) return value;
value = third();
if (value != null) return value;
...
return value;
}
Will it be long? Probably. But you are applying code on top of an interface that's not friendly toward your approach.
Now, if you could change the interface, you might make the interface more friendly. A possible example would be to have the steps be "ValueProvider" objects.
public interface ValueProvider {
public String getValue();
}
And then you could use it like
public String getFirstValue(List<ValueProvider> providers) {
String value;
for (ValueProvider provider : providers) {
value = provider.getValue();
if (value != null) return value;
}
return null;
}
And there are various other approaches, but they require restructuring the code to be more object-oriented. Remember, just because Java is an object-oriented programming language, that doesn't mean it will always be used in an object-oriented manner. The first()...last() method listing is not very object-oriented, because it doesn't model a List. Even though the method names are expressive, a List has methods on it which permit easy integration with tools like for loops and Iterators.
If you are using Java 8 you can convert these function calls to lambdas.
public static <T> T firstNonNull(Supplier<T> defaultSupplier, Supplier<T>... funcs) {
    return Arrays.stream(funcs).map(Supplier::get).filter(Objects::nonNull).findFirst().orElseGet(defaultSupplier);
}
If you don't want the generic implementation and use it only for Strings, go ahead and just replace T with String:
public static String firstNonNull(Supplier<String> defaultSupplier, Supplier<String>... funcs) {
    return Arrays.stream(funcs).map(Supplier::get).filter(Objects::nonNull).findFirst().orElseGet(defaultSupplier);
}
And then call it like:
firstNonNull(() -> getDefault(), () -> first(arg1, arg2), () -> second(arg3));
P.S. btw default is a reserved keyword, so you cannot use it as a method name :)
EDIT: OK, the best way to do this would be to return Optional; then you don't need to pass the default supplier separately:
@SafeVarargs
public static <T> Optional<T> firstNonNull(Supplier<T>... funcs) {
    return Arrays.stream(funcs).map(Supplier::get).filter(Objects::nonNull).findFirst();
}
If you want to package it up into a utility method, you'll have to wrap each function up into something that defers execution. Perhaps something like this:
public interface Wrapper<T> {
T call();
}
public static <T> T firstNonNull(Wrapper<T> defaultFunction, Wrapper<T>... funcs) {
T val;
for (Wrapper<T> func : funcs) {
if ((val = func.call()) != null) {
return val;
}
}
return defaultFunction.call();
}
You could use java.util.concurrent.Callable instead of defining your own Wrapper class, but then you'd have to deal with the exception that Callable.call() is declared to throw.
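For what it's worth, a Callable-based variant might look roughly like this (a sketch only; the checked exception is simply wrapped here):

public static <T> T firstNonNull(Callable<T> defaultFunction, Callable<T>... funcs) {
    try {
        for (Callable<T> func : funcs) {
            T val = func.call();
            if (val != null) {
                return val;
            }
        }
        return defaultFunction.call();
    } catch (Exception e) {
        throw new RuntimeException(e); // Callable.call() declares a checked Exception
    }
}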
This can then be called with:
String value = firstNonNull(
new Wrapper<String>() { @Override public String call() { return defaultFunc(); } },
new Wrapper<String>() { @Override public String call() { return first(); } },
new Wrapper<String>() { @Override public String call() { return second(); } },
...
);
In Java 8, as @dorukayhan points out, you can dispense with defining your own Wrapper class and just use the Supplier interface. Also, the call can be done much more cleanly with lambdas:
String value = firstNonNull(
() -> defaultFunc(),
() -> first(),
() -> second(),
...
);
You can also (as @Oliver Charlesworth suggests) use method references as shorthand for the lambda expressions:
String value = firstNonNull(
MyClass::defaultFunc,
MyClass::first,
MyClass::second,
...
);
I'm of two minds as to which is more readable.
Alternatively, you can use one of the streaming solutions that many other answers have proposed.
Just make a class with one function like this:
class ValueCollector {
String value;
boolean v(String val) { this.value = val; return val != null; }
}
ValueCollector c = new ValueCollector();
if (c.v(first()) || c.v(second()) ...)
return c.value;
The above examples seemed too long for just choosing between two variables; I'd go with something like this (unless you've got a longer list of variables to choose from):
Optional.ofNullable(first).orElse(Optional.ofNullable(second).orElse(defaultValue));
You can accomplish this via reflection:
public Object getFirstNonNull(Object target, Method... methods) throws ReflectiveOperationException {
    Object value = null;
    for (Method m : methods) {
        if ((value = m.invoke(target)) != null) {
            break;
        }
    }
    return value;
}
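A hypothetical call (the method names are placeholders, and the reflective lookups also throw checked exceptions that must be handled or declared):

Object result = getFirstNonNull(this,
        getClass().getMethod("first"),
        getClass().getMethod("second"));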

Best way to traverse and find an object field from a list

I have a list of custom objects and I want to find an object given an id (a field of the custom object). While coding this I came up with two solutions that differ in how the fields are compared.
1.
private Product getProduct(String productId,List<Product> productList){
for (int i = 0; i < productList.size(); i++) {
if (productId.equals(productList.get(i).getId())) {
return productList.get(i);
}
}
return null;
}
2.
private Product getProduct(String productId,List<Product> productList){
for (int i = 0; i < productList.size(); i++) {
if (productList.get(i).getId().equals(productId)) {
return productList.get(i);
}
}
return null;
}
The difference is in the if condition. I want to know which one is better than the other and why, and when to use the first method versus the second.
Since equals() is required by Java to be symmetric, there is no difference between the two snippets.
Both snippets are sub-optimal, in that they iterate by numeric index, and retrieve productList.get(i) twice before returning it. Iterating by index is especially dangerous, because passing a LinkedList<Product> will slow down your search considerably.
A better approach is to use a for-each form of the loop:
for (Product p : productList) {
if (p.getId().equals(productId)) {
return p;
}
}
return null;
The concern in both of your implementations is the possibility of calling .equals on a null value.
If you can guarantee neither of them are null then they are equivalent.
If you are using Java 8, a stream may be a better choice.
private Product getProduct(String productId, List<Product> productList) {
    return productList.stream()
            .filter(p -> productId.equals(p.getId()))
            .findFirst()
            .orElse(null);
}
When you are sure the product ids are never null, it doesn't really matter.
But in general it's always good to program in a defensive way, so for example prefer using
"SomeString".equals(aString)
instead of
aString.equals("SomeString")
since you know "SomeString" is never null.
Or use
Objects.equals(object1, object2)
when both objects might be null.
The first one invokes equals on the parameter productId, while the second one invokes equals on the current list element from productList. The result is the same because equals is symmetric:
for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
You can also use a stream for this, so you don't have to care about implementation details (furthermore, Objects#equals(Object, Object) is null-safe):
Product p = productList.stream()
        .filter(e -> Objects.equals(e.getId(), productId))
        .findFirst()
        .orElse(null);
Have a look at this question for further information.

Java lambda to return null if empty list otherwise sum of values?

If I want to total a list of accounts' current balances, I can do:
accountOverview.setCurrentBalance(account.stream().
filter(a -> a.getCurrentBalance() != null).
mapToLong(a -> a.getCurrentBalance()).
sum());
But this expression will return 0, even if all the balances are null. I would like it to return null if all the balances are null, 0 if there are non-null 0 balances, and the sum of the balances otherwise.
How can I do this with a lambda expression?
Many thanks
Once you have filtered them from the stream, there's no way to know whether all the balances were null (unless you check what count() returns, but then you won't be able to reuse the stream, since count() is a terminal operation).
Doing two passes over the data is probably the most straightforward solution, and I would probably go with that first:
boolean allNulls = account.stream().map(Account::getBalance).allMatch(Objects::isNull);
Long sum = allNulls ? null : account.stream().map(Account::getBalance).filter(Objects::nonNull).mapToLong(l -> l).sum();
You could get rid of the filtering step with a reduce, although the readability may not be the best:
Long sum = account.stream()
        .map(Account::getBalance)
        .reduce(null, (l1, l2) -> l1 == null ? l2 :
                                  l2 == null ? l1 : Long.valueOf(l1 + l2));
Notice the Long.valueOf call. It is there so that the type of the conditional expression stays Long rather than long; otherwise auto-unboxing would throw a NullPointerException in some edge cases.
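A minimal illustration of that edge case:

Long l1 = null, l2 = null;
Long ok  = l2 == null ? l1 : Long.valueOf(l1 + l2); // both branches are Long, so the result stays null
Long npe = l2 == null ? l1 : l1 + l2;               // mixed Long/long makes the type long, unboxing the null l1 -> NPE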
Another solution would be to use the Optional API. First, create a Stream<Optional<Long>> from the balances' values and reduce them:
Optional<Long> opt = account.stream()
.map(Account::getBalance)
.flatMap(l -> Stream.of(Optional.ofNullable(l)))
.reduce(Optional.empty(),
(o1, o2) -> o1.isPresent() ? o1.map(l -> l + o2.orElse(0L)) : o2);
This will give you an Optional<Long> that will be empty if all the values were null, otherwise it'll give you the sum of the non-null values.
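If a plain nullable Long is needed afterwards (for instance for the setter from the question), the Optional can simply be unwrapped:

Long sum = opt.orElse(null); // back to a nullable Long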
Or you might want to create a custom collector for this:
class SumIntoOptional {
    private boolean allNull = true;
    private long sum = 0L;

    public SumIntoOptional() {}

    public void add(Long value) {
        if (value != null) {
            allNull = false;
            sum += value;
        }
    }

    public void merge(SumIntoOptional other) {
        if (!other.allNull) {
            allNull = false;
            sum += other.sum;
        }
    }

    public OptionalLong getSum() {
        return allNull ? OptionalLong.empty() : OptionalLong.of(sum);
    }
}
and then:
OptionalLong opt = account.stream().map(Account::getBalance).collect(SumIntoOptional::new, SumIntoOptional::add, SumIntoOptional::merge).getSum();
As you can see, there are various ways to achieve this, so my advice would be: choose the most readable first. If performance problems arise with your solution, check whether it can be improved (by either making the stream parallel or using another alternative). But measure, don't guess.
For now, I'm going with this. Thoughts?
accountOverview.setCurrentBalance(account.stream().
filter(a -> a.getCurrentBalance() != null).
map(a -> a.getCurrentBalance()).
reduce(null, (i,j) -> { if (i == null) { return j; } else { return i+j; } }));
Because I've filtered nulls already, I'm guaranteed not to hit any. By making the initial param to reduce 'null', I can ensure that I get null back on an empty list.
Feels a bit hard/confusing to read though. Would like a nicer solution..
EDIT Thanks to pbabcdefp, I've gone with this rather more respectable solution:
List<Account> filtered = account.stream().
filter(a -> a.getCurrentBalance() != null).
collect(Collectors.toList());
accountOverview.setCurrentBalance(filtered.size() == 0?null:
filtered.stream().mapToLong(a -> a.getCurrentBalance()).
sum());
You're trying to do two fundamentally contradicting things: filter out null elements (which is a local operation, based on a single element) and detect when all elements are null (which is a global operation, based on the entire list). Normally you should do these as two separate operations, that makes things a lot more readable.
Apart from the reduce() trick you've already found, you can also resort to underhand tricks. If you know that the balance can never be negative, for example, you can do something like:
long sum = account.stream().
mapToLong(a -> a.getCurrentBalance() == null ? 0 : a.getCurrentBalance()+1).
sum() - account.size();
Long nullableSum = sum < 0 ? null : sum;
But you've got to ask yourself: is what you gain by only iterating across your collection once worth the cost of having written a piece of unreadable and fairly brittle code? In most cases the answer will be: no.

Object as a key in treemap in java 8

CompareObj is a class in Java. It consists of three attributes: String rowKey, Integer hitCount, Long recency.
public CompareObj(String string, Integer i) {
    this.rowKey = string;
    this.hitCount = i % 10;
    this.recency = i * 1000L;
}
Now I created a treeMap
Comparator<CompareObj> comp1 = (e1, e2) -> e1.getHitCount().compareTo(e2.getHitCount());
Comparator<CompareObj> comp2 = (e1, e2) -> e2.getRecency().compareTo(e1.getRecency());
Comparator<CompareObj> result = comp1.thenComparing(comp2);
TreeMap<CompareObj, CompareObj> tM = new TreeMap<CompareObj, CompareObj>(result);

for (int i = 0; i <= 1000; i++) {
    CompareObj cO = new CompareObj("A" + i, i);
    tM.put(cO, cO);
}

for (int i = 0; i <= 1000; i++) {
    CompareObj cO = new CompareObj("A" + i, i);
    CompareObj values = tM.get(cO);
    System.out.println(values.getRowKey()); // Line 28: get Null Pointer Exception
}
I also overrode hashCode and equals. Still I get a NullPointerException.
@Override
public int hashCode() {
    return Objects.hash(getRowKey());
}

@Override
public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof CompareObj)) return false;
    CompareObj compareObj = (CompareObj) obj;
    return Objects.equals(this.getRowKey(), compareObj.getRowKey());
}
Here, when I try to retrieve the value from the TreeMap, I get a NullPointerException at the line mentioned. How do I solve this?
If I want to implement compareTo() of the Comparable interface, how should I implement it when there are multiple sort conditions?
The first thing to understand is the NullPointerException. If you get that exception on the exact line
System.out.println(values.getRowKey());
then either System.out or values is null. Since we can preclude System.out being null, it’s the values variable, which contains the result of get and can be null if the lookup failed.
Since you are initializing the TreeMap with a custom Comparator, that Comparator determines equality. Your Comparator is based on the properties getHitCount() and getRecency(), which must match, which implies that when the lookup fails, the map doesn't contain an object having the same values as reported by these two methods.
You show that you construct objects with the same values, but not the code of these getters. There must be an inconsistency. As Misha pointed out, your posted code can't be the code you ran when getting the exception, therefore we can't help you further (unless you post the real code you ran).
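As to the side question about compareTo() with multiple sort conditions: one common pattern (sketched here with the two properties used above, recency descending as in your comparator) is to delegate to a composed Comparator. Note that this would still be inconsistent with the rowKey-based equals(), which a TreeMap ignores anyway:

@Override
public int compareTo(CompareObj o) {
    return Comparator.comparing(CompareObj::getHitCount)
                     .thenComparing(CompareObj::getRecency, Comparator.reverseOrder())
                     .compare(this, o);
}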

How to implement efficient hash cons with java HashSet

I am trying to implement a hash cons in Java, comparable to what String.intern does for strings. I.e., I want a class that stores all distinct values of a data type T in a set and provides a T intern(T t) method that checks whether t is already in the set. If so, the instance in the set is returned; otherwise t is added to the set and returned. The reason is that the resulting values can be compared using reference equality, since two equal values returned from intern will for sure also be the same instance.
Of course, the most obvious candidate data structure for a hash cons is java.util.HashSet<T>. However, it seems that its interface is flawed and does not allow efficient insertion, because there is no method to retrieve an element that is already in the set or insert one if it is not in there.
An algorithm using HashSet would look like this:
class HashCons<T> {
    HashSet<T> set = new HashSet<>();

    public T intern(T t) {
        if (set.contains(t)) {
            return ???; // <----- PROBLEM
        } else {
            set.add(t); // <--- Inefficient, second hash lookup
            return t;
        }
    }
}
As you see, the problem is twofold:
This solution would be inefficient since I would access the hash table twice, once for contains and once for add. But okay, this may not be too big a performance hit, since the correct bucket will be in the cache after the contains, so add will not trigger a cache miss and will thus be quite fast.
I cannot retrieve an element already in the set (see line flagged PROBLEM). There is just no method to retrieve the element in the set. So it is just not possible to implement this.
Am I missing something here? Or is it really impossible to build a usual hash cons with java.util.HashSet?
I don't think it's possible using HashSet. You could use some kind of Map instead and use your value both as key and as value. java.util.concurrent.ConcurrentMap also happens to possess the quite convenient method
putIfAbsent(K key, V value)
that returns the previously stored value if the key is already present. However, I don't know about the performance of this method (compared to checking "manually" on non-concurrent implementations of Map).
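For instance, a sketch of intern() on top of ConcurrentHashMap (one hash lookup per call) could look like this:

class HashCons<T> {
    private final ConcurrentMap<T, T> map = new ConcurrentHashMap<>();

    public T intern(T t) {
        T existing = map.putIfAbsent(t, t); // returns the previously stored instance, or null
        return existing != null ? existing : t;
    }
}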
Here is how you would do it using a HashMap:
class HashCons<T> {
    Map<T, T> map = new HashMap<T, T>();

    public T intern(T t) {
        if (!map.containsKey(t))
            map.put(t, t);
        return map.get(t);
    }
}
I think the reason why it is not possible with HashSet is quite simple: to the set, if contains(t) is fulfilled, it means that the given t also equals one of the t' in the set. There is no reason for being able to return it (as you already have it).
Well, HashSet is implemented as a HashMap wrapper in OpenJDK, so you won't win in memory usage compared to the solution suggested by aRestless.
A 10-minute sketch:
class HashCons<T> {
    T[] table;
    int size;
    int sizeLimit;

    HashCons(int expectedSize) {
        init(Math.max(Integer.highestOneBit(expectedSize * 2) * 2, 16));
    }

    private void init(int capacity) {
        table = (T[]) new Object[capacity];
        size = 0;
        sizeLimit = (int) (capacity * 2L / 3);
    }

    T cons(@Nonnull T key) {
        int mask = table.length - 1;
        int i = key.hashCode() & mask;
        do {
            if (table[i] == null) break;
            if (key.equals(table[i])) return table[i];
            i = (i + 1) & mask;
        } while (true);
        table[i] = key;
        if (++size > sizeLimit) rehash();
        return key;
    }

    private void rehash() {
        T[] table = this.table;
        if (table.length == (1 << 30))
            throw new IllegalStateException("HashCons is full");
        init(table.length << 1);
        for (T key : table) {
            if (key != null) cons(key);
        }
    }
}
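Hypothetical usage of the sketch above:

HashCons<String> strings = new HashCons<>(1000);
String a = strings.cons(new String("hello"));
String b = strings.cons(new String("hello"));
assert a == b; // both calls yield the same canonical instance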
