This is a question regarding a possibly confusing Streams behavior. I was under the impression that the .map operation (because of its usage in Optional) was always null-safe (I'm aware that these .map's are different implementations, though they share the same name). So I was quite surprised when I got an NPE when I used it that way on a (list) stream. Since then, I have started using Objects::nonNull with streams (both with .map and .flatMap operations).
Q1. Why is it that Optional can handle nulls at any level, whereas Streams can't (at any level), as shown in my test code below? If this is the sensible and desirable behavior, please give an explanation (as to its benefits, or the downsides of a List Stream behaving like Optional).
Q2. As a follow-up, is there an alternative to the excessive null-checks that I perform in the getValues method below (which is what prompted me to wonder why Streams could not behave like Optional)?
In the below test, I'm interested in the innermost class's value field only.
I use Optional in the getValue method.
I use Streams on a list in the getValues method, and I cannot remove a single nonNull check in this case.
import lombok.AllArgsConstructor;
import lombok.Getter;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;
public class NestedObjectsStreamTest {
@Getter @AllArgsConstructor
private static class A {
private B b;
}
@Getter @AllArgsConstructor
private static class B {
private C c;
}
@Getter @AllArgsConstructor
private static class C {
private D d;
}
@Getter @AllArgsConstructor
private static class D {
private String value;
}
public static void main(String[] args) {
A a0 = new A(new B(new C(new D("a0"))));
A a1 = new A(new B(new C(new D("a1"))));
A a2 = new A(new B(new C(new D(null))));
A a3 = new A(new B(new C(null)));
A a5 = new A(new B(null));
A a6 = new A(null);
A a7 = null;
System.out.println("getValue(a0) = " + getValue(a0));
System.out.println("getValue(a1) = " + getValue(a1));
System.out.println("getValue(a2) = " + getValue(a2));
System.out.println("getValue(a3) = " + getValue(a3));
System.out.println("getValue(a5) = " + getValue(a5));
System.out.println("getValue(a6) = " + getValue(a6));
System.out.println("getValue(a7) = " + getValue(a7));
List<A> aList = Arrays.asList(a0, a1, a2, a3, a5, a6, a7);
System.out.println("getValues(aList) " + getValues(aList));
}
private static String getValue(final A a) {
return Optional.ofNullable(a)
.map(A::getB)
.map(B::getC)
.map(C::getD)
.map(D::getValue)
.orElse("default");
}
private static List<String> getValues(final List<A> aList) {
return aList.stream()
.filter(Objects::nonNull)
.map(A::getB)
.filter(Objects::nonNull)
.map(B::getC)
.filter(Objects::nonNull)
.map(C::getD)
.filter(Objects::nonNull)
.map(D::getValue)
.filter(Objects::nonNull)
.collect(Collectors.toList());
}
}
Output
getValue(a0) = a0
getValue(a1) = a1
getValue(a2) = default
getValue(a3) = default
getValue(a5) = default
getValue(a6) = default
getValue(a7) = default
getValues(aList) [a0, a1]
Q1. Why is it that Optional can handle nulls at any level, whereas Streams can't (at any level), as shown in my test code below?
A Stream can "contain" null values. An Optional can't, by contract (the contract is explained in the javadoc): either it's empty and map returns empty, or it's not empty and is then guaranteed to have a non-null value.
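A minimal sketch of that contrast (a standalone snippet, assuming the usual java.util imports):
// Optional.ofNullable(null) is empty, so the mapper is simply skipped:
Optional<Integer> length = Optional.ofNullable((String) null).map(String::length); // Optional.empty
// A Stream happily carries the null element, and the mapper throws when it reaches it:
List<Integer> lengths = Stream.of("a", null, "b").map(String::length).collect(Collectors.toList()); // NullPointerException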
Q2. As a follow up, is there an alternative to the excessive null-checks that I perform in the getValues method below.
Favor designs which avoid using nulls all over the place.
Here is code you can try:
aList.stream()
.map(applyIfNotNull(A::getB))
.map(applyIfNotNull(B::getC))
.map(applyIfNotNull(C::getD))
.map(applyIfNotNullOrDefault(D::getValue, "default"))
.filter(Objects::nonNull)
.forEach(System.out::println);
With the below utility methods:
public static <T, U> Function<T, U> applyIfNotNull(Function<T, U> mapper) {
return t -> t != null ? mapper.apply(t) : null;
}
public static <T, U> Function<T, U> applyIfNotNullOrDefault(Function<T, U> mapper, U defaultValue) {
return t -> t != null ? mapper.apply(t) : defaultValue;
}
public static <T, U> Function<T, U> applyIfNotNullOrElseGet(Function<T, U> mapper, Supplier<U> supplier) {
return t -> t != null ? mapper.apply(t) : supplier.get();
}
Not sure how it looks to you, but I personally don't like map(...).map(...)....
Here is what I like more:
aList.stream()
.map(applyIfNotNull(A::getB, B::getC, C::getD))
.map(applyIfNotNullOrDefault(D::getValue, "default"))
.filter(Objects::nonNull)
.forEach(System.out::println);
With one more utility method:
public static <T1, T2, T3, R> Function<T1, R> applyIfNotNull(Function<T1, T2> mapper1, Function<T2, T3> mapper2,
Function<T3, R> mapper3) {
return t -> {
if (t == null) {
return null;
} else {
T2 t2 = mapper1.apply(t);
if (t2 == null) {
return null;
} else {
T3 t3 = mapper2.apply(t2);
return t3 == null ? null : mapper3.apply(t3);
}
}
};
}
Q1. Why is it that Optional can handle nulls at any level, whereas
Streams can't (at any level), as shown in my test code below?
Optional was created to handle null values on its own, i.e. for cases where the programmer does not want to handle nulls himself. The Optional.map() method wraps the result in an Optional, thus handling nulls and taking that responsibility away from the developer.
Streams, on the other hand, leave the handling of nulls to the developer, because a developer might want to handle nulls in a different way. Look at this link. It lays out the different choices the Stream designers had and their reasoning for each case.
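For illustration (my own sketch, not taken from the linked discussion): leaving the choice to the developer means a stream pipeline can, for example, substitute a default instead of silently dropping nulls, which a per-element Optional.map could not decide for you.
List<String> values = Stream.of("a", null, "b")
        .map(s -> s == null ? "default" : s)   // the developer decides what a null means
        .collect(Collectors.toList());          // [a, default, b]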
Q2. As a follow-up, is there an alternative to the excessive
null-checks that I perform in the getValues method below
In cases like the one you mentioned, where you don't want to handle the null cases, go with Optional. As @JB Nizet said, avoid null scenarios when using streams, or handle them yourself. This argues similarly. If you go through the first link I shared, you will probably see that banning null from a stream would be too harsh, while absorbing null would hamper the truthfulness of size() and other operations.
Your Q1 has already been answered by raviiii1 and JB Nizet.
Regarding your Q2:
is there an alternative to the excessive null-checks that I perform in the getValues method below
You could always combine both Stream and Optional like this:
private static List<String> getValues(final List<A> aList) {
return aList.stream()
.map(Optional::ofNullable)
.map(opta -> opta.map(A::getB))
.map(optb -> optb.map(B::getC))
.map(optc -> optc.map(C::getD))
.map(optd -> optd.map(D::getValue))
.map(optv -> optv.orElse("default"))
.collect(Collectors.toList());
}
Of course, this would be much cleaner:
private static List<String> getValues(final List<A> aList) {
return aList.stream()
.map(NestedObjectsStreamTest::getValue)
.collect(Collectors.toList());
}
A Collector has three generic types:
public interface Collector<T, A, R>
With A being the mutable accumulation type of the reduction operation (often hidden as an implementation detail).
If I want to create my custom collector, I need to create two classes:
one for the custom accumulation type
one for the custom collector itself
Is there any library function/trick that takes the accumulation type and provides a corresponding Collector?
Simple example
This example is extra simple to illustrate the question; I know I could use reduce for this case, but that is not what I am looking for. Here is a more complex example that would make the question too long if shared here, but it is the same idea.
Let's say I want to collect the sum of a stream and return it as a String.
I can implement my accumulator class:
public static class SumCollector {
Integer value;
public SumCollector(Integer value) {
this.value = value;
}
public static SumCollector supply() {
return new SumCollector(0);
}
public void accumulate(Integer next) {
value += next;
}
public SumCollector combine(SumCollector other) {
return new SumCollector(value + other.value);
}
public String finish(){
return Integer.toString(value);
}
}
And then I can create a Collector from this class:
Collector.of(SumCollector::supply, SumCollector::accumulate, SumCollector::combine, SumCollector::finish);
But it seems strange to me that they all refer to the other class; I feel that there should be a more direct way to do this.
What I could do to keep only one class would be to implement Collector<Integer, SumCollector, String>, but then every function would be duplicated (supplier() would return SumCollector::supply, etc.).
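For comparison, here is a rough sketch of what that implements Collector<Integer, SumCollector, String> route might look like, just to make the duplication visible (every method simply points back at the same class; a no-arg constructor replaces supply() here):
public static class SumCollector implements Collector<Integer, SumCollector, String> {
    Integer value = 0;
    @Override public Supplier<SumCollector> supplier() { return SumCollector::new; }
    @Override public BiConsumer<SumCollector, Integer> accumulator() { return (acc, next) -> acc.value += next; }
    @Override public BinaryOperator<SumCollector> combiner() { return (a, b) -> { a.value += b.value; return a; }; }
    @Override public Function<SumCollector, String> finisher() { return acc -> Integer.toString(acc.value); }
    @Override public Set<Characteristics> characteristics() { return Collections.emptySet(); }
}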
There is no requirement for the functions to be implemented as methods of the container class.
This is how such a sum collector would typically be implemented:
public static Collector<Integer, ?, Integer> sum() {
return Collector.of(() -> new int[1],
(a, i) -> a[0] += i,
(a, b) -> { a[0] += b[0]; return a; },
a -> a[0],
Collector.Characteristics.UNORDERED);
}
But, of course, you could also implement it as
public static Collector<Integer, ?, Integer> sum() {
return Collector.of(AtomicInteger::new,
AtomicInteger::addAndGet,
(a, b) -> { a.addAndGet(b.intValue()); return a; },
AtomicInteger::intValue,
Collector.Characteristics.UNORDERED, Collector.Characteristics.CONCURRENT);
}
You first have to find a suitable mutable container type for your collector. If no such type exists, you have to create your own class. The functions can be implemented as a method reference to an existing method or as a lambda expression.
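For instance, a hedged usage of the sum() collector above:
Integer total = Stream.of(1, 2, 3, 4).collect(sum()); // 10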
For the more complex example, I don't know of a suitable existing type dedicated to holding an int and a List, but you may get away with an existing entry type holding the list and a boxed Integer, like this
final Map<String, Integer> map = …
List<String> keys = map.entrySet().stream().collect(keysToMaximum());
public static <K> Collector<Map.Entry<K,Integer>, ?, List<K>> keysToMaximum() {
return Collector.of(
() -> new AbstractMap.SimpleEntry<>(new ArrayList<K>(), Integer.MIN_VALUE),
(current, next) -> {
int max = current.getValue(), value = next.getValue();
if(value >= max) {
if(value > max) {
current.setValue(value);
current.getKey().clear();
}
current.getKey().add(next.getKey());
}
}, (a, b) -> {
int maxA = a.getValue(), maxB = b.getValue();
if(maxA < maxB) return b;
if(maxA == maxB) a.getKey().addAll(b.getKey());
return a;
},
Map.Entry::getKey
);
}
But you may also create a new dedicated container class as an ad-hoc type, not visible outside the particular collector
public static <K> Collector<Map.Entry<K,Integer>, ?, List<K>> keysToMaximum() {
return Collector.of(() -> new Object() {
int max = Integer.MIN_VALUE;
final List<K> keys = new ArrayList<>();
}, (current, next) -> {
int value = next.getValue();
if(value >= current.max) {
if(value > current.max) {
current.max = value;
current.keys.clear();
}
current.keys.add(next.getKey());
}
}, (a, b) -> {
if(a.max < b.max) return b;
if(a.max == b.max) a.keys.addAll(b.keys);
return a;
},
a -> a.keys);
}
The takeaway is, you don’t need to create a new, named class to create a Collector.
I want to focus the wording of one point of your question, because I feel like it could be the crux of the underlying confusion.
If I want to create my custom collector, I need to create two classes:
one for the custom accumulation type
one for the custom collector itself
No, you need to create only one class, that of your custom accumulator. You should use the appropriate factory method to instantiate your custom Collector, as you demonstrate yourself in the question.
Perhaps you meant to say that you need to create two instances. And that is also incorrect; you need to create a Collector instance, but to support the general case, many instances of the accumulator can be created (e.g., groupingBy()). Thus, you can't simply instantiate the accumulator yourself, you need to provide its Supplier to the Collector, and delegate to the Collector the ability to instantiate as many instances as required.
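To make that concrete, here is a small sketch with the standard collectors: groupingBy invokes the downstream collector's supplier once per distinct key, which is exactly why a Collector hands over a Supplier rather than a single pre-built container.
// two distinct key lengths -> counting()'s supplier is invoked twice, once per group
Map<Integer, Long> countsByLength = Stream.of("a", "bb", "cc", "d")
        .collect(Collectors.groupingBy(String::length, Collectors.counting())); // {1=2, 2=2}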
Now, think about the overloaded Collector.of() method you feel is missing, the "more direct way to do this." Clearly, such a method would still require a Supplier, one that would create instances of your custom accumulator. But Stream.collect() needs to interact with your custom accumulator instances, to perform accumulate and combine operations. So the Supplier would have to instantiate something like this Accumulator interface:
public interface Accumulator<T, A extends Accumulator<T, A, R>, R> {
/**
* @param t a value to be folded into this mutable result container
*/
void accumulate(T t);
/**
* @param that another partial result to be merged with this container
* @return the combined results, which may be {@code this}, {@code that}, or a new container
*/
A combine(A that);
/**
* @return the final result of transforming this intermediate accumulator
*/
R finish();
}
With that, it's then straightforward to create Collector instances from a Supplier<Accumulator>:
static <T, A extends Accumulator<T, A, R>, R>
Collector<T, ?, R> of(Supplier<A> supplier, Collector.Characteristics ... characteristics) {
return Collector.of(supplier,
Accumulator::accumulate,
Accumulator::combine,
Accumulator::finish,
characteristics);
}
Then, you'd be able to define your custom Accumulator:
final class Sum implements Accumulator<Integer, Sum, String> {
private int value;
@Override
public void accumulate(Integer next) {
value += next;
}
@Override
public Sum combine(Sum that) {
value += that.value;
return this;
}
@Override
public String finish(){
return Integer.toString(value);
}
}
And use it:
String sum = ints.stream().collect(Accumulator.of(Sum::new, Collector.Characteristics.UNORDERED));
Now… it works, and there's nothing too horrible about it, but is all the Accumulator<A extends Accumulator<A>> mumbo-jumbo "more direct" than this?
final class Sum {
private int value;
private void accumulate(Integer next) {
value += next;
}
private Sum combine(Sum that) {
value += that.value;
return this;
}
@Override
public String toString() {
return Integer.toString(value);
}
static Collector<Integer, ?, String> collector() {
return Collector.of(Sum::new, Sum::accumulate, Sum::combine, Sum::toString, Collector.Characteristics.UNORDERED);
}
}
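For completeness, a hedged usage of this variant:
String total = Stream.of(1, 2, 3, 4).collect(Sum.collector()); // "10"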
And really, why have an Accumulator dedicated to collecting to a String? Wouldn't reduction to a custom type be more interesting? Something along the lines of IntSummaryStatistics, with other useful methods like average() alongside toString()? This approach is a lot more powerful, requires only one (mutable) class (the result type) and can encapsulate all of its mutators as private methods rather than implementing a public interface.
So, you're welcome to use something like Accumulator, but it doesn't really fill a real gap in the core Collector repertoire.
It sounds like you want to supply only the reduction function itself, not all of the other things that come with a generic Collector. Perhaps you're looking for Collectors.reducing.
public static <T> Collector<T,?,T> reducing(T identity, BinaryOperator<T> op)
Then, to sum values, you would write
Collectors.reducing(0, (x, y) -> x + y);
or, in context,
Integer[] myList = new Integer[] { 1, 2, 3, 4 };
var collector = Collectors.reducing(0, (x, y) -> x + y);
System.out.println(Stream.of(myList).collect(collector)); // Prints 10
I'm open to using a lib. I just want something simple to diff two collections on a different criterion than the normal equals function.
Right now I use something like :
collection1.stream()
.filter(element -> !collection2.stream()
.anyMatch(element2 -> element2.equalsWithoutSomeField(element)))
.collect(Collectors.toSet());
and I would like something like :
Collections.diff(collection1, collection2, Foo::equalsWithoutSomeField);
(edit) More context:
Should have mentioned that I'm looking for something that exists already and not to code it myself. I might code a small utils from your ideas if nothing exists.
Also, real duplicates aren't possible in my case: the collections are Sets. However, duplicates according to the custom equals are possible and should not be removed by this operation. It seems to be a limitation of a lot of possible solutions.
We use similar methods in our project to shorten repetitive collection filtering. We started with some basic building blocks:
static <T> boolean anyMatch(Collection<T> set, Predicate<T> match) {
for (T object : set)
if (match.test(object))
return true;
return false;
}
Based on this, we can easily implement methods like noneMatch and more complicated ones like isSubset or your diff:
static <E> Collection<E> disjunctiveUnion(Collection<E> c1, Collection<E> c2, BiPredicate<E, E> match)
{
ArrayList<E> diff = new ArrayList<>();
diff.addAll(c1);
diff.addAll(c2);
diff.removeIf(e -> anyMatch(c1, e1 -> match.test(e, e1))
&& anyMatch(c2, e2 -> match.test(e, e2)));
return diff;
}
Note that there are certainly some possibilities for performance tuning. But keeping it separated into small methods helps in understanding and using them with ease. Used in code, they read quite nicely.
You would then use it as you already said:
CollectionUtils.disjunctiveUnion(collection1, collection2, Foo::equalsWithoutSomeField);
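Since the question ultimately asks for a one-sided diff (elements of collection1 with no counterpart in collection2), a hedged sketch built on the same anyMatch building block could look like this:
static <E> Collection<E> diff(Collection<E> c1, Collection<E> c2, BiPredicate<E, E> match) {
    ArrayList<E> result = new ArrayList<>(c1);
    // drop every element of c1 that has a counterpart in c2 under the custom match
    result.removeIf(e1 -> anyMatch(c2, e2 -> match.test(e1, e2)));
    return result;
}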
Taking Jose Da Silva's suggestion into account, you could even use Comparator to build your criteria on the fly:
Comparator<Foo> special = Comparator.comparing(Foo::thisField)
                                    .thenComparing(Foo::thatField);
BiPredicate<Foo, Foo> specialMatch = (e1, e2) -> special.compare(e1, e2) == 0;
You can use UnifiedSetWithHashingStrategy from Eclipse Collections. UnifiedSetWithHashingStrategy allows you to create a Set with a custom HashingStrategy.
HashingStrategy allows the user to use a custom hashCode() and equals(). The Object's hashCode() and equals() is not used.
Edit based on requirement from OP via comment:
You can use reject() or removeIf() depending on your requirement.
Code Example:
// Common code
Person person1 = new Person("A", "A");
Person person2 = new Person("B", "B");
Person person3 = new Person("C", "A");
Person person4 = new Person("A", "D");
Person person5 = new Person("E", "E");
MutableSet<Person> personSet1 = Sets.mutable.with(person1, person2, person3);
MutableSet<Person> personSet2 = Sets.mutable.with(person2, person4, person5);
HashingStrategy<Person> hashingStrategy =
HashingStrategies.fromFunction(Person::getLastName);
1) Using reject(): Creates a new Set which contains all the elements which do not satisfy the Predicate.
@Test
public void reject()
{
MutableSet<Person> personHashingStrategySet = HashingStrategySets.mutable.withAll(
hashingStrategy, personSet2);
// reject creates a new copy
MutableSet<Person> rejectSet = personSet1.reject(personHashingStrategySet::contains);
Assert.assertEquals(Sets.mutable.with(person1, person3), rejectSet);
}
2) Using removeIf(): Mutates the original Set by removing the elements which satisfy the Predicate.
@Test
public void removeIfTest()
{
MutableSet<Person> personHashingStrategySet = HashingStrategySets.mutable.withAll(
hashingStrategy, personSet2);
// removeIf mutates the personSet1
personSet1.removeIf(personHashingStrategySet::contains);
Assert.assertEquals(Sets.mutable.with(person1, person3), personSet1);
}
Answer before requirement from OP via comment: Kept for reference if others might find it useful.
3) Using Sets.differenceInto() API available in Eclipse Collections:
In the code below, set1 and set2 are two sets which use Person's equals() and hashCode(). The differenceSet is a UnifiedSetWithHashingStrategy, so it uses the lastName hashing strategy to define uniqueness. Hence, even though set2 does not contain person3, person3 has the same lastName as person1, so the differenceSet contains only person1.
@Test
public void differenceTest()
{
MutableSet<Person> differenceSet = Sets.differenceInto(
HashingStrategySets.mutable.with(hashingStrategy),
set1,
set2);
Assert.assertEquals(Sets.mutable.with(person1), differenceSet);
}
Person class common to both code blocks:
public class Person
{
private final String firstName;
private final String lastName;
public Person(String firstName, String lastName)
{
this.firstName = firstName;
this.lastName = lastName;
}
public String getFirstName()
{
return firstName;
}
public String getLastName()
{
return lastName;
}
@Override
public boolean equals(Object o)
{
if (this == o)
{
return true;
}
if (o == null || getClass() != o.getClass())
{
return false;
}
Person person = (Person) o;
return Objects.equals(firstName, person.firstName) &&
Objects.equals(lastName, person.lastName);
}
@Override
public int hashCode()
{
return Objects.hash(firstName, lastName);
}
}
Javadocs: MutableSet, UnifiedSet, UnifiedSetWithHashingStrategy, HashingStrategy, Sets, reject, removeIf
Note: I am a committer on Eclipse Collections
static <T> Collection<T> diff(Collection<T> minuend, Collection<T> subtrahend, BiPredicate<T, T> equals) {
Set<Wrapper<T>> w1 = minuend.stream().map(item -> new Wrapper<>(item, equals)).collect(Collectors.toSet());
Set<Wrapper<T>> w2 = subtrahend.stream().map(item -> new Wrapper<>(item, equals)).collect(Collectors.toSet());
w1.removeAll(w2);
return w1.stream().map(w -> w.item).collect(Collectors.toList());
}
static class Wrapper<T> {
T item;
BiPredicate<T, T> equals;
Wrapper(T item, BiPredicate<T, T> equals) {
this.item = item;
this.equals = equals;
}
@Override
public int hashCode() {
// all items have same hash code, check equals
return 1;
}
@Override
public boolean equals(Object that) {
return equals.test(this.item, ((Wrapper<T>) that).item);
}
}
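For context, a hedged usage of this helper with the names from the question (note that collecting the wrappers into a Set also de-duplicates elements that are equal under the custom predicate):
Collection<Foo> onlyInFirst = diff(collection1, collection2, Foo::equalsWithoutSomeField);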
Comparing
You can achieve this without the use of any library, just using Java's Comparator
For instance, with the following object
public class AA {
private String a;
private Double b;
private String c;
private int d;
// getters and setters
}
You can use a comparator like
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB)
.thenComparingInt(AA::getD);
This compares the fields a, b and the int d, skipping c.
The only problem here is that this won't work with null values.
Comparing nulls
One possible solution for a fine-grained configuration, that is, allowing specific fields to be null, is using a Comparator class similar to:
// Comparator for properties only, only written to be used with Comparator#comparing
public final class PropertyNullComparator<T extends Comparable<? super T>>
implements Comparator<Object> {
private PropertyNullComparator() { }
public static <T extends Comparable<? super T>> PropertyNullComparator<T> of() {
return new PropertyNullComparator<>();
}
@Override
public int compare(Object o1, Object o2) {
if (o1 != null && o2 != null) {
if (o1 instanceof Comparable) {
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o1;
return comparable.compareTo(o2);
} else {
// this will throw a ClassCastException when the object is not comparable
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o2;
return comparable.compareTo(o1) * -1; // * -1 to keep order
}
} else {
return o1 == o2 ? 0 : (o1 == null ? -1 : 1); // nulls first
}
}
}
This way you can use a comparator specifying the allowed null fields.
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, PropertyNullComparator.of())
.thenComparingInt(AA::getD);
If you don't want to define a custom comparator you can use something like:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, Comparator.nullsFirst(Comparator.naturalOrder()))
.thenComparingInt(AA::getD);
Difference method
The difference (A - B) method could be implemented using two TreeSets.
static <T> TreeSet<T> difference(Collection<T> c1,
Collection<T> c2,
Comparator<T> comparator) {
TreeSet<T> treeSet1 = new TreeSet<>(comparator); treeSet1.addAll(c1);
if (treeSet1.size() > c2.size()) {
treeSet1.removeAll(c2);
} else {
TreeSet<T> treeSet2 = new TreeSet<>(comparator); treeSet2.addAll(c2);
treeSet1.removeAll(treeSet2);
}
return treeSet1;
}
Note: a TreeSet makes sense here since we are talking about uniqueness with a specific comparator. It could also perform better: the contains method of TreeSet is O(log(n)), compared to a common ArrayList, which is O(n).
Why is only one TreeSet used when treeSet1.size() > c2.size()? Because when that condition is not met, TreeSet#removeAll uses the contains method of the second collection; this second collection could be any Java collection, and its contains method is not guaranteed to work the same way as the contains of the first TreeSet (with the custom comparator).
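For illustration, a hedged call to this difference method, reusing the comparator built in the section above (the two input collections are placeholders):
Set<AA> onlyInFirst = difference(aaCollection1, aaCollection2, comparator);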
Edit (Given the more context of the question)
Since collection1 is a set that could contain repeated elements according to the custom equals (not the equals of the object), the solution already provided in the question could be used, since it does exactly that, without modifying any of the input collections and creating a new output set.
So you can create your own static function (at least I am not aware of a library that provides a similar method) and use a Comparator or a BiPredicate.
static <T> Set<T> difference(Collection<T> collection1,
Collection<T> collection2,
Comparator<T> comparator) {
return collection1.stream()
.filter(element1 -> !collection2.stream()
.anyMatch(element2 -> comparator.compare(element1, element2) == 0))
.collect(Collectors.toSet());
}
Edit (To Eugene)
"Why would you want to implement a null safe comparator yourself"
At least to my knowledge, there isn't a comparator to compare fields when they are plain, common nulls. The closest that I know of (to replace my suggested PropertyNullComparator.of() [a clearer/shorter/better name can be used]) is:
Comparator.nullsFirst(Comparator.naturalOrder())
So you would have to write that line for every field that you want to compare. Is this doable? Of course it is. Is it practical? I think not.
Easy solution: create a helper method.
static class ComparatorUtils {
public static <T extends Comparable<? super T>> Comparator<T> shnp() { // super short null comparator
return Comparator.nullsFirst(Comparator.<T>naturalOrder());
}
}
Does this work? Yes, it works. Is it practical? It looks like it. Is it a great solution? Well, that depends; many people consider the exaggerated (and/or unnecessary) use of helper methods an anti-pattern (a good old article by Nick Malik). There are some reasons listed there, but to make things short: this is an OO language, so OO solutions are normally preferred over static helper methods.
"As stated in the documentation : Note that the ordering maintained by a set (whether or not an explicit comparator is provided must be consistent with equals if it is to correctly implement the Set interface. Further, the same problem would arise in the other case, when size() > c.size() because ultimately this would still call equals in the remove method. So they both have to implement Comparator and equals consistently for this to work correctly"
The javadoc says of TreeSet the following, but with a clear if:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface
Then says this:
See Comparable or Comparator for a precise definition of consistent with equals
If you go to the Comparable javadoc says:
It is strongly recommended (though not required) that natural orderings be consistent with equals
If we continue to read the javadoc (even in the same paragraph), it says the following:
This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare ) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
From this last quote, and with a very simple debugging session or even a reading of the code, you can see the use of an internal TreeMap, and that all its derived methods are based on the comparator, not the equals method.
"Why is this so implemented? because there is a difference when removing many elements from a little set and the other way around, as a matter of fact same stands for addAll"
If you go to the definition of removeAll, you can see that its implementation is in AbstractSet; it is not overridden. And this implementation uses the argument collection's contains when that collection is larger; the behavior of this contains is uncertain, since it isn't necessary (nor probable) that the received collection (e.g. list, queue, etc.) has/can define the same comparator.
Update 1:
This JDK bug is being discussed (and considered for a fix) here: https://bugs.openjdk.java.net/browse/JDK-6394757
pom.xml:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.4</version>
</dependency>
code/test:
package com.my;
import lombok.Builder;
import lombok.Getter;
import lombok.ToString;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.collections4.Equator;
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
public class Diff {
public static class FieldEquator<T> implements Equator<T> {
private final Function<T, Object>[] functions;
@SafeVarargs
public FieldEquator(Function<T, Object>... functions) {
if (Objects.isNull(functions) || functions.length < 1) {
throw new UnsupportedOperationException();
}
this.functions = functions;
}
@Override
public boolean equate(T o1, T o2) {
if (Objects.isNull(o1) && Objects.isNull(o2)) {
return true;
}
if (Objects.isNull(o1) || Objects.isNull(o2)) {
return false;
}
for (Function<T, ?> function : functions) {
if (!Objects.equals(function.apply(o1), function.apply(o2))) {
return false;
}
}
return true;
}
@Override
public int hash(T o) {
if (Objects.isNull(o)) {
return -1;
}
int i = 0;
Object[] vals = new Object[functions.length];
for (Function<T, Object> function : functions) {
vals[i] = function.apply(o);
i++;
}
return Objects.hash(vals);
}
}
@SafeVarargs
private static <T> Set<T> difference(Collection<T> a, Collection<T> b, Function<T, Object>... functions) {
if ((Objects.isNull(a) || a.isEmpty()) && Objects.nonNull(b) && !b.isEmpty()) {
return new HashSet<>(b);
} else if ((Objects.isNull(b) || b.isEmpty()) && Objects.nonNull(a) && !a.isEmpty()) {
return new HashSet<>(a);
}
Equator<T> eq = new FieldEquator<>(functions);
Collection<T> res = CollectionUtils.removeAll(a, b, eq);
res.addAll(CollectionUtils.removeAll(b, a, eq));
return new HashSet<>(res);
}
/**
* Test
*/
@Builder
@Getter
@ToString
public static class A {
String a;
String b;
String c;
}
public static void main(String[] args) {
Set<A> as1 = new HashSet<>();
Set<A> as2 = new HashSet<>();
A a1 = A.builder().a("1").b("1").c("1").build();
A a2 = A.builder().a("1").b("1").c("2").build();
A a3 = A.builder().a("2").b("1").c("1").build();
A a4 = A.builder().a("1").b("3").c("1").build();
A a5 = A.builder().a("1").b("1").c("1").build();
A a6 = A.builder().a("1").b("1").c("2").build();
A a7 = A.builder().a("1").b("1").c("6").build();
as1.add(a1);
as1.add(a2);
as1.add(a3);
as2.add(a4);
as2.add(a5);
as2.add(a6);
as2.add(a7);
System.out.println("Set1: " + as1);
System.out.println("Set2: " + as2);
// Check A::getA, A::getB ignore A::getC
Collection<A> difference = difference(as1, as2, A::getA, A::getB);
System.out.println("Diff: " + difference);
}
}
result:
Set1: [Diff.A(a=2, b=1, c=1), Diff.A(a=1, b=1, c=1), Diff.A(a=1, b=1, c=2)]
Set2: [Diff.A(a=1, b=1, c=6), Diff.A(a=1, b=1, c=2), Diff.A(a=1, b=3, c=1), Diff.A(a=1, b=1, c=1)]
Diff: [Diff.A(a=1, b=3, c=1), Diff.A(a=2, b=1, c=1)]
Is there any easier way to write the code below, without using toStream()?
import io.vavr.collection.List;
import io.vavr.control.Option;
import lombok.Value;
public class VavrDemo {
public static void main(String[] args) {
Foo bar = new Foo(List.of(new Bar(1), new Bar(2)));
Number value = Option.some(bar)
.toStream() // <- WTF?!?
.flatMap(Foo::getBars)
.map(Bar::getValue)
.sum();
System.out.println(value);
}
@Value
static class Foo {
private List<Bar> bars;
}
@Value
static class Bar {
private int value;
}
}
Option is a so-called Monad. This just tells us that the flatMap function follows specific laws, namely
Let
A, B, C be types
unit: A -> Monad<A> a constructor
f: A -> Monad<B>, g: B -> Monad<C> functions
a be an object of type A
m be an object of type Monad<A>
Then all instances of the Monad interface should obey the Functor laws (omitted here) and the three control laws:
Left identity: unit(a).flatMap(f) ≡ f a
Right identity: m.flatMap(unit) ≡ m
Associativity: m.flatMap(f).flatMap(g) ≡ m.flatMap(x -> f.apply(x).flatMap(g))
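As a hedged, concrete reading of the left identity law with Vavr's Option (unit = Option::some; the function f here is made up for the example):
Function<Integer, Option<Integer>> f = x -> Option.some(x + 1);
Option<Integer> left  = Option.some(41).flatMap(f); // Some(42)
Option<Integer> right = f.apply(41);                // Some(42), i.e. unit(a).flatMap(f) ≡ f.apply(a)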
Currently Vavr has (simplified):
interface Option<T> {
<U> Option<U> flatMap(Function<T, Option<U>> mapper) {
return isEmpty() ? none() : mapper.apply(get());
}
}
This version obeys the Monad laws.
It is not possible to define an Option.flatMap the way you want that still obeys the Monad laws. For example imagine a flatMap version that accepts a function with an Iterable as result. All Vavr collections have such a flatMap method but for Option it does not make sense:
interface Option<T> {
<U> Option<U> flatMap(Function<T, Iterable<U>> mapper) {
if (isEmpty()) {
return none();
} else {
Iterable<U> iterable = mapper.apply(get());
if (isEmpty(iterable)) {
return none();
} else {
U resultValue = whatToDoWith(iterable); // ???
return some(resultValue);
}
}
}
}
You see? The best thing we can do is take just one element of the iterable in case it is not empty. Besides not giving us the result you may have expected (in the VavrDemo above), we can prove that this 'fantasy' version of flatMap breaks the Monad laws.
If you are stuck in such a situation, consider changing your calls slightly. For example, the VavrDemo can be expressed like this:
Number value = Option.some(bar)
.map(b -> b.getBars().map(Bar::getValue).sum())
.getOrElse(0);
I hope this helps and the Monad section above does not completely scare you away. In fact, developers do not need to know anything about Monads in order to take advantage of Vavr.
Disclaimer: I'm the creator of Vavr (formerly: Javaslang)
How about using .fold() or .getOrElse()?
Option.some(bar)
.fold(List::<Bar>empty, Foo::getBars)
.map(Bar::getValue)
.sum();
Option.some(bar)
.map(Foo::getBars)
.getOrElse(List::empty)
.map(Bar::getValue)
.sum();
Given the following code:
stream.filter(o1 -> Objects.equals(o1.getSome().getSomeOther(),
o2.getSome().getSomeOther()))
How could that possibly be simplified?
Is there some equals-utility that lets you first extract a key just like there is Comparator.comparing which accepts a key extractor function?
Note that the code itself (getSome().getSomeOther()) is actually generated from a schema.
EDIT: (after discussing with a colleague and after revisiting: Is there a convenience method to create a Predicate that tests if a field equals a given value?)
We now have come to the following reusable functional interface:
@FunctionalInterface
public interface Property<T, P> {
P extract(T object);
default Predicate<T> like(T example) {
Predicate<P> equality = Predicate.isEqual(extract(example));
return (value) -> equality.test(extract(value));
}
}
and the following static convenience method:
static <T, P> Property<T, P> property(Property<T, P> property) {
return property;
}
The filtering now looks like:
stream.filter(property(t -> t.getSome().getSomeOther()).like(o2))
What I like about this solution compared to the previous one: it clearly separates the extraction of the property from the creation of the Predicate itself, and it states more clearly what is going on.
Previous solution:
<T, U> Predicate<T> isEqual(T other, Function<T, U> keyExtractFunction) {
U otherKey = keyExtractFunction.apply(other);
return t -> Objects.equals(keyExtractFunction.apply(t), otherKey);
}
which results in the following usage:
stream.filter(isEqual(o2, t -> t.getSome().getSomeOther()))
but I am more than happy if anyone has a better solution.
I think that your question's approach is more readable than your answer's. And I also think that using inline lambdas is fine, as long as the lambda is simple and short.
However, for maintenance, readability, debugging and testability reasons, I always move the logic I'd use in a lambda (either a predicate or function) to one or more methods. In your case, I would do:
class YourObject {
private Some some;
public boolean matchesSomeOther(YourObject o2) {
return this.getSome().matchesSomeOther(o2.getSome());
}
}
class Some {
private SomeOther someOther;
public boolean matchesSomeOther(Some some2) {
return Objects.equals(this.getSomeOther(), some2.getSomeOther());
}
}
With these methods in place, your predicate now becomes trivial:
YourObject o2 = ...;
stream.filter(o2::matchesSomeOther)
In Java 8, you can use a method reference to filter a stream, for example:
Stream<String> s = ...;
long emptyStrings = s.filter(String::isEmpty).count();
Is there a way to create a method reference that is the negation of an existing one, i.e. something like:
long nonEmptyStrings = s.filter(not(String::isEmpty)).count();
I could create the not method like below but I was wondering if the JDK offered something similar.
static <T> Predicate<T> not(Predicate<T> p) { return o -> !p.test(o); }
Predicate.not( … )
java-11 offers a new method Predicate#not
So you can negate the method reference:
Stream<String> s = ...;
long nonEmptyStrings = s.filter(Predicate.not(String::isEmpty)).count();
I'm planning to static import the following to allow for the method reference to be used inline:
public static <T> Predicate<T> not(Predicate<T> t) {
return t.negate();
}
e.g.
Stream<String> s = ...;
long nonEmptyStrings = s.filter(not(String::isEmpty)).count();
Update: Starting from Java-11, the JDK offers a similar solution built-in as well.
There is a way to compose a method reference that is the opposite of a current method reference. See @vlasec's answer below that shows how by explicitly casting the method reference to a Predicate and then converting it using the negate function. That is one way among a few other not too troublesome ways to do it.
The opposite of this:
Stream<String> s = ...;
long emptyStrings = s.filter(String::isEmpty).count();
is this:
Stream<String> s = ...;
long notEmptyStrings = s.filter(((Predicate<String>) String::isEmpty).negate()).count();
or this:
Stream<String> s = ...;
long notEmptyStrings = s.filter(it -> !it.isEmpty()).count();
Personally, I prefer the latter technique because I find it clearer to read it -> !it.isEmpty() than a long verbose explicit cast and then negate.
One could also make a predicate and reuse it:
Predicate<String> notEmpty = (String it) -> !it.isEmpty();
Stream<String> s = ...;
long notEmptyStrings = s.filter(notEmpty).count();
Or, if you have a collection or array, just use a for-loop, which is simple, has less overhead, and *might be **faster:
int notEmpty = 0;
for(String s : list) if(!s.isEmpty()) notEmpty++;
*If you want to know what is faster, then use JMH http://openjdk.java.net/projects/code-tools/jmh, and avoid hand-written benchmark code unless it avoids all JVM optimizations — see Java 8: performance of Streams vs Collections
**I am getting flak for suggesting that the for-loop technique is faster. It eliminates a stream creation, it eliminates using another method call (the negate function for the predicate), and it eliminates a temporary accumulator list/counter. So a few things are saved by the last construct, which might make it faster.
I do think it is simpler and nicer though, even if not faster. If the job calls for a hammer and a nail, don't bring in a chainsaw and glue! I know some of you take issue with that.
wish-list: I would like to see Java Stream functions evolve a bit now that Java users are more familiar with them. For example, the 'count' method in Stream could accept a Predicate so that this can be done directly like this:
Stream<String> s = ...;
int notEmptyStrings = s.count(it -> !it.isEmpty());
or
List<String> list = ...;
int notEmptyStrings = list.count(it -> !it.isEmpty());
Predicate has methods and, or and negate.
However, String::isEmpty is not a Predicate, it's just a String -> Boolean lambda and it could still become anything, e.g. Function<String, Boolean>. Type inference is what needs to happen first. The filter method infers the type implicitly. But if you negate it before passing it as an argument, that no longer happens. As @axtavt mentioned, an explicit cast can be used as an ugly workaround:
s.filter(((Predicate<String>) String::isEmpty).negate()).count()
There are other ways advised in other answers, with a static not method and a lambda most likely being the best ideas. This concludes the tl;dr section.
However, if you want some deeper understanding of lambda type inference, I'd like to explain it a bit more in depth, using examples. Look at these and try to figure out what happens:
Object obj1 = String::isEmpty;
Predicate<String> p1 = s -> s.isEmpty();
Function<String, Boolean> f1 = String::isEmpty;
Object obj2 = p1;
Function<String, Boolean> f2 = (Function<String, Boolean>) obj2;
Function<String, Boolean> f3 = p1::test;
Predicate<Integer> p2 = s -> s.isEmpty();
Predicate<Integer> p3 = String::isEmpty;
obj1 doesn't compile - lambdas need to infer a functional interface (= with one abstract method)
p1 and f1 work just fine, each inferring a different type
obj2 casts a Predicate to Object - silly but valid
f2 fails at runtime - you cannot cast Predicate to Function, it's no longer about inference
f3 works - you call the predicate's method test that is defined by its lambda
p2 doesn't compile - Integer doesn't have isEmpty method
p3 doesn't compile either - there is no String::isEmpty static method with Integer argument
Building on others' answers and personal experience:
Predicate<String> blank = String::isEmpty;
content.stream()
.filter(blank.negate())
Another option is to utilize lambda casting in non-ambiguous contexts, gathered into one utility class:
public static class Lambdas {
public static <T> Predicate<T> as(Predicate<T> predicate){
return predicate;
}
public static <T> Consumer<T> as(Consumer<T> consumer){
return consumer;
}
public static <T> Supplier<T> as(Supplier<T> supplier){
return supplier;
}
public static <T, R> Function<T, R> as(Function<T, R> function){
return function;
}
}
... and then static import the utility class:
stream.filter(as(String::isEmpty).negate())
Shouldn't Predicate#negate be what you are looking for?
In this case you could use org.apache.commons.lang3.StringUtils and do
long nonEmptyStrings = s.filter(StringUtils::isNotEmpty).count();
I have written a complete utility class (inspired by Askar's proposal) that can take Java 8 lambda expressions and turn them (if applicable) into any typed standard Java 8 functional interface defined in the package java.util.function. You can for example do:
asPredicate(String::isEmpty).negate()
asBiPredicate(String::equals).negate()
Because there would be numerous ambiguities if all the static methods were named just as(), I opted to call the method "as" followed by the returned type. This gives us full control of the lambda interpretation. Below is the first part of the (somewhat large) utility class revealing the pattern used.
Have a look at the complete class here (at gist).
public class FunctionCastUtil {
public static <T, U> BiConsumer<T, U> asBiConsumer(BiConsumer<T, U> biConsumer) {
return biConsumer;
}
public static <T, U, R> BiFunction<T, U, R> asBiFunction(BiFunction<T, U, R> biFunction) {
return biFunction;
}
public static <T> BinaryOperator<T> asBinaryOperator(BinaryOperator<T> binaryOperator) {
return binaryOperator;
}
... and so on...
}
You can use Predicates from Eclipse Collections
MutableList<String> strings = Lists.mutable.empty();
int nonEmptyStrings = strings.count(Predicates.not(String::isEmpty));
If you can't change the strings from List:
List<String> strings = new ArrayList<>();
int nonEmptyStrings = ListAdapter.adapt(strings).count(Predicates.not(String::isEmpty));
If you only need a negation of String.isEmpty() you can also use StringPredicates.notEmpty().
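For example, with the MutableList from above (a hedged sketch using the Eclipse Collections factory):
int nonEmptyStrings = strings.count(StringPredicates.notEmpty());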
Note: I am a contributor to Eclipse Collections.
You can accomplish this with long nonEmptyStrings = s.filter(str -> !str.isEmpty()).count();
Tip: to negate a collection.stream().anyMatch(...), one can use collection.stream().noneMatch(...)
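For instance (a minimal sketch, assuming strings is a List<String>):
boolean allNonEmpty = strings.stream().noneMatch(String::isEmpty); // true if no string is empty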
If you're using Spring Boot (2.0.0+) you can use:
import org.springframework.util.StringUtils;
...
.filter(StringUtils::hasLength)
...
Which does:
return (str != null && !str.isEmpty());
So it will have the required negation effect for isEmpty.