Java Stream.builder() in Collector null problem

The aim is to use a Stream to iterate over an array, filtering/extending values as required, and to collect the result in a new Stream.
Using Stream.builder(), as in the following three examples, I always get a Stream with the expected Strings, but with lots of trailing nulls. In addition, I can't process null elements this way.
I suspect the internal fixed buffer in Stream.builder() is the problem.
Is there a way to prevent the 'trailing' nulls with this approach without losing the ability to use null values as Stream elements?
String[] whitespaces = new String[] { " ", "\n", "\r", "\t" };
int len = whitespaces.length;
boolean addNulls = false;
int flexBoundary = addNulls ? len : len - 1;
Stream<String> whitespaceNullStringStream = IntStream.rangeClosed(0, flexBoundary)
    .mapToObj(idx ->
        addNulls && idx == flexBoundary
            ? null
            : whitespaces[idx])
    // #1
    .collect(Stream::<String>builder, Builder::add, (b1, b2) -> Stream.concat(b1.build(), b2.build())).build();
    // #2
    // .collect(Stream::<String>builder, Builder::add, (b1, b2) -> Stream.builder().add(b1).add(b2)).build();
    // #3
    // .collect(
    //     Collector.of(
    //         Stream::<String>builder,
    //         Builder::add,
    //         (b1, b2) -> b1.add(b2.build().reduce(String::concat).get()),
    //         Builder::build
    //     )
    // );
If I instead use the following, it'll work as expected, except null values are converted to Strings, of course, which is not desirable here:
    .collect(
        Collector.of(
            StringBuilder::new,
            StringBuilder::append,
            StringBuilder::append,
            (sb) -> Stream.of(sb.toString())
        )
    )
To overcome this, I've used the following approach:
Stream<String> stream = IntStream.rangeClosed(0, flexBoundary)
    .mapToObj(idx -> addNulls && idx == flexBoundary ? null : whitespaces[idx])
    .collect(Collector.of(
        ArrayList<String>::new,
        List::add,
        (l1, l2) -> { l1.addAll(l2); return l1; },
        (list) -> list.stream()
    )
);
But, as described above, I'd like to use the Stream.builder() approach inside a Collector, which works the same.

Most of the stream API will fail fast when null is involved. It just isn't designed for it.
There are different ideas about what null actually means. Reductively, null in java means one of 3 things:
The field was not initialized (private String hello; starts out as null)
The array slot was never written to (new String[10] starts with 10 null values)
Somebody explicitly used null, the keyword.
But that's not too useful. Let's talk semantics. What does it mean when an API returns null for something, or when you use null in some code?
There are different semantic takes on it too:
It means: Not initialized, not applicable, unexpected, no result.
In this case, exceptions are good, and anything else you could possibly want here would be wrong. You can't ask "concatenate this unknown thing to this string". The correct answer is not to silently skip it. The correct answer is to crash: you can't concatenate an unknown. This is what SQL does with null quite consistently, and it is a usage of null in Java that I strongly recommend. It turns null's downsides into an upside: you use null when you want that exception to occur if any code attempts to interact with the thing the pointer is pointing at (because the idea is: there is no value, and the code flow should therefore not even check. If it does, there is a bug, and I would like an exception to occur exactly at the moment the bug is written, please!).
In light of your code, if that's your interpretation of null, then your code is acting correctly, by throwing an exception.
It's a sentinel value for something else
This is also common: null is being returned and this has an explicit semantic meaning, hopefully described in the documentation. If you've ever written this statement:
if (x == null || x.isEmpty())
it is exceedingly likely you're using this semantic meaning of null. After all, that code says: "At least for the purposes of this if, there is no difference at all between an empty string and a null pointer."
I strongly recommend you never do this. It's not necessary (just return an empty string instead!), and it leads to friction: if you have a method in a Person class named getTitle() that returns null when there is no title, and the project also states that title-less persons should just act as if the title is the empty string (which seems logical), then this is just wrong. Don't return null. Return "". After all, if I call person.getTitle().length(), then there is an indisputably correct answer to the question posed for someone with no title, and that is 0.
Sometimes, some system defines specific behaviour that strongly leans towards 'undefined/unknown/unset' for a given field. For example, let's say the rules are: if the person's .getStudentId() call returns a blank string, that just means they don't have an ID yet. Then you should never use null there either. If a value can represent a thing, then it should represent that thing in only one way. Use null if you want exceptions whenever any code tries to ask anything at all about the nature of this value, use an existing value if one exists that does everything you want, and make a sentinel object that throws on certain calls but returns default values for others if you need really fine-grained control.
So yes, if you ever write if (x == null || x.isEmpty()), that's a code smell: code that is highly indicative of suboptimal design. (Exception: boundary code. If you're receiving objects from a system or from code that isn't under your direct control, then you roll with the punches. But if their APIs are bad, you should write an isolating in-between layer that explicitly takes their badly designed stuff and translates it to well-designed stuff. That translation layer is allowed to write if (x == null || x.isEmpty()).)
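The isolating layer mentioned above could look roughly like the following minimal sketch. ExternalPerson, Person and fromExternal are hypothetical names invented for illustration; only getTitle() and the rule "no title means the empty string" come from the text.

```java
class BoundarySketch {

    // Hypothetical stand-in for an object from code you don't control;
    // its getTitle() may return null.
    static final class ExternalPerson {
        private final String title;

        ExternalPerson(String title) {
            this.title = title;
        }

        String getTitle() {
            return title;
        }
    }

    // Hypothetical well-designed counterpart: getTitle() never returns null.
    static final class Person {
        private final String title;

        Person(String title) {
            this.title = title;
        }

        String getTitle() {
            return title;
        }
    }

    // The one place that is allowed to treat null and "" as the same thing.
    static Person fromExternal(ExternalPerson external) {
        String title = external.getTitle();
        return new Person(title == null || title.isEmpty() ? "" : title);
    }
}
```

After the translation, person.getTitle().length() can be called without a null check anywhere else in the code base.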
It sounds like this is the null you want: It sounds like you want the act of appending null to the stringbuilder to mean: "Just append nothing, then".
Thus, you want where you now have null to act as follows when you append that to a stringbuilder: To do nothing at all to that stringbuilder.
There is already an object that does what you want: It's "".
Thus:
mapToObj(idx ->
addNulls && idx == flexBoundary
? ""
: whitespaces[idx])
You might want to rename your addNulls variable to something else :)
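If the goal is still a Collector backed by Stream.builder(), a minimal sketch might look like the following. Note that in the three-argument collect(...) the combiner is a BiConsumer, so any value it returns is discarded, whereas Collector.of takes a BinaryOperator whose result is used. The names toStream and collectViaBuilder are hypothetical, and the sketch assumes the JDK's Stream.Builder implementation accepts null elements, which is not something its documentation promises.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BuilderCollectorSketch {

    // Hypothetical helper: accumulates elements in a Stream.Builder
    // and finishes by building the Stream.
    static <T> Collector<T, Stream.Builder<T>, Stream<T>> toStream() {
        return Collector.<T, Stream.Builder<T>, Stream<T>>of(
                Stream::builder,
                Stream.Builder::accept,
                (left, right) -> {
                    // Drain the right builder into the left one, keeping order.
                    right.build().forEachOrdered(left);
                    return left;
                },
                Stream.Builder::build);
    }

    // Collects the values through the builder-based collector and
    // materializes the resulting Stream as a List for inspection.
    static List<String> collectViaBuilder(boolean parallel, String... values) {
        Stream<String> source = Arrays.stream(values);
        if (parallel) {
            source = source.parallel();
        }
        return source
                .collect(BuilderCollectorSketch.<String>toStream())
                .collect(Collectors.toList());
    }
}
```

The combiner only matters for parallel streams; in a sequential pipeline it is never called.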

Related

Why does Stream.reduce(BinaryOperator) throw NullPointer when the result is null?

I have a stream of enum values that I want to reduce. If the stream is empty or contains different values, I want null. If it only contains (multiple instances of) a single value, I want that value.
[] null
[A, B, A] null
[A] A
[A, A, A] A
I tried to do it with a reduce:
return <lots of filtering, mapping, and other stream stuff>
.reduce((enum1, enum2) -> enum1 == enum2 ? enum1 : null)
.orElse(null);
Unfortunately, this does not work, because this reduce method throws a NullPointerException when the result is null. Does anyone know why that happens? Why is null not a valid result?
For now, I solved this like this:
MyEnum[] array = <lots of filtering, mapping, and other stream stuff>
.distinct()
.toArray(MyEnum[]::new);
return array.length == 1 ? array[0] : null;
While this works, I am not satisfied with this "detour". I liked the reduce because it seemed to be the right fit and put everything into one stream.
Can anyone think of an alternative to the reduce (that ideally is not too much code)?

Generally, all Stream methods returning an Optional don’t allow null values as it would be impossible to tell the null result and “no result” (empty stream) apart.
You can work around this with a placeholder value, which unfortunately requires suspending type safety (as there is no type-compatible value outside the enum set):
return <lots of filtering, mapping, and other stream stuff>
.reduce((enum1, enum2) -> enum1 == enum2? enum1: "")
.map(r -> r==""? null: (MyEnum)r)
.orElse(null);
Optional.map will return an empty optional if the mapping function returns null, so after that step, an empty stream and a null result can’t be distinguished anymore and orElse(null) will return null in both cases.
But maybe the array detour only feels so unsatisfying because the array isn’t the best choice for the intermediate result? How about
EnumSet<MyEnum> set = <lots of filtering, mapping, and other stream stuff>
.collect(Collectors.toCollection(() -> EnumSet.noneOf(MyEnum.class)));
return set.size()==1? set.iterator().next(): null;
The EnumSet is only a bitset, a single long value if the enum type has no more than 64 constants. That’s much cheaper than an array, and since Sets are naturally distinct, there is no need for a distinct() operation on the stream, which would create a HashSet under the hood.
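As a self-contained sketch of the EnumSet variant, checked against the four cases from the question: MyEnum here is a stand-in enum with made-up constants, and singleValueOrNull is a hypothetical method name.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SingleValueSketch {

    // Stand-in for the enum in the question; the constants are assumptions.
    enum MyEnum { A, B, C }

    // Returns the single distinct value of the stream, or null if the stream
    // is empty or contains more than one distinct value.
    static MyEnum singleValueOrNull(Stream<MyEnum> values) {
        EnumSet<MyEnum> set = values.collect(
                Collectors.toCollection(() -> EnumSet.noneOf(MyEnum.class)));
        return set.size() == 1 ? set.iterator().next() : null;
    }

    // Convenience overload so the cases can be tried without building a Stream.
    static MyEnum singleValueOrNull(MyEnum... values) {
        return singleValueOrNull(Arrays.stream(values));
    }
}
```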

Most higher-order stream functions don't allow null, either as a parameter or as a function's return value. This is to prevent yet another billion-dollar mistake. The behaviour is documented here:
Optional<T> reduce(BinaryOperator<T> accumulator)
.....
Throws:
NullPointerException - if the result of the reduction is null
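The documented behaviour can be observed with a small sketch like the one below. MyEnum and reduceThrows are hypothetical names; the accumulator is the one from the question.

```java
import java.util.Arrays;

class ReduceNullSketch {

    // Stand-in enum; the constants are assumptions.
    enum MyEnum { A, B }

    // Returns true if reduce(...) throws a NullPointerException for the values,
    // i.e. if the final result of the reduction is null.
    static boolean reduceThrows(MyEnum... values) {
        try {
            Arrays.stream(values).reduce((e1, e2) -> e1 == e2 ? e1 : null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

An empty stream does not throw, because reduce then returns an empty Optional without ever producing a null result.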

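How about a really mathematical approach (hard to maintain, I agree)? Be aware that it compares the average ordinal with the first element's ordinal, so it can give a false positive when different values happen to average out, e.g. for [B, A, C]: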
Arrays.stream(array).map(e -> e.ordinal() + 1).reduce(Integer::sum)
.map(i -> (double) i / array.length == array[0].ordinal() + 1 ? array[0] : null)
.orElse(null)

What is the best way to check null in Java [duplicate]

This question already has answers here:
object==null or null==object?
(11 answers)
Closed 5 years ago.
When checking for nulls I use this:
String str;
if(str == null){
//...
}
but I've seen this as well:
if(null == str){
//...
}
Is there any advantage of using one over the other? Or is it just to improve readability?

The second version ( null == str ) is called a yoda condition.
They both result in the same behavior, but the second one has one advantage: it prevents you from accidentally assigning to the variable when you forget one =. In that case the compiler reports an error on that line, and you're not left with some weird behavior of your code and the resulting debugging.

The null == x convention is usually found in code written by people familiar with C, where an assignment can also be an expression. Some C programmers write code like this so that if they miss an = in
if (NULL == ptr)...
the code will not compile, since NULL = ptr is not a valid assignment. This prevents a rather sneaky error from being introduced in the code base, although modern C compilers make such conventions obsolete, as long as one takes care to enable and read the generated warnings...
This coding style has never had any usefulness in Java, where reference assignments cannot be used as boolean expressions. It could even be considered counter-intuitive; in their natural language most people will say "if X is null...", or "if X is equal to 17...", rather than "if null is equal to X...".

There's no difference between the two other than readability. Use whichever makes more sense to you.

As you stated readability is the most important reason. Reading it out loud, the (null == str) does not read well. It's almost like reading right to left. (str == null) reads much better.
In addition, I think the following needs to be taken into consideration:
if (str != null)
if (str == null)
vs.
if (null != str)
if (null == str)
I would expect the positive (str == null) and the negative to be written in the same manner, which is another reason I would favor the top set.

if (null == str) {
}
is a programming idiom from C/C++, where the result of the assignment operator = can be used as a true/false condition. For example, in C/C++, if you want to check whether you can open a stream, you can write
if (myStream = openStream())
which opens and assigns in one line. However, this means that people often type = when they mean ==, and it would be valid syntax in C: for example, if (x = 5) will always resolve to true, when they really mean if (x == 5). So people write if (5 == x), so that if you leave out a =, your code won't compile.
This doesn't apply to java.

There is no real difference. However, the second is considered less error prone: with the first form, a mistyped = instead of == can slip through (in Java only if the variable is a Boolean; for a String, as below, the compiler rejects it):
String str;
if(str = null){
}
which is something you usually don't do in conditionals.
Also, you get to think about the actual condition first, which is a good practice.

if(a==b) {} is the same as if(b==a) {} , and the same is true if b was null. It's just a style/order difference as far as functionality, at least in java.

Some developers argue that var == null is more error-prone than null == var. Their argument is that you might accidentally assign the variable instead of doing a null-check.
But only when the variable you test against null is a Boolean can you accidentally use = instead of == and still have it compile.
Boolean checked = Boolean.TRUE;
if(checked = null){ // accidentally assigned null and compiles
}
Only in this case does the assignment compile, because the condition of an if statement must evaluate to a boolean value (see JLS 14.9). Since the assignment expression has type Boolean, which can be unboxed to boolean, it compiles. But you will get a NullPointerException at runtime, because Java will try to unbox the checked variable, which is null.
If you use any type other than Boolean, you will get a compiler error. E.g.
String str = "hello";
if(str = null){ // compiler error, because str = null doesn't evaluate to a boolean
}
My conclusion is that error situations are extremely rare and you can easily write unit tests that detect such errors.
So write the if-statement in the way that is more readable.
I think "if name is null" makes more sense than "if null is name".

Several string inputs for one variable

Is there any other way to shorten this condition?
if (oper.equals("add") || oper.equals("Add") || oper.equals("addition") ||
oper.equals("Addition") || oper.equals("+"))
I was just wondering if there's something I can do to 'shortcut' this. The user will type a string when prompted what kind of operation is to be performed in my simple calculator program. Our professor said our program should accept whether the user enters "add", or "Add", you know, in lowercase letters or not... Or is the only way I should do it?

You can use String#equalsIgnoreCase(String) for the first four strings:
if (oper.equalsIgnoreCase("add") ||
oper.equalsIgnoreCase("addition") ||
oper.equals("+"))
If the number of strings increases, you would be better off with a List and its contains method. But for just these inputs, this approach is sufficient.
Another way to approach this is the String#matches(String) method, which takes a regex (the (?i) flag makes the match case-insensitive):
if (oper.matches("(?i)add|addition|[+]"))
But you don't really need a regex for this, and it can become ugly for a larger set of inputs. It's just one option for this case, so you can choose either of them; the first one is clearer at first glance.
Alternatively, you can use an enum to store the operators and pass its instance everywhere, rather than a string. It would be easier to work with. The enum would look like this:
public enum Operator {
ADD,
SUB,
MUL,
DIV;
}
You can enhance it to your appropriate need. Note that, since you are getting user input, you would first need to identify the appropriate enum instance based on it, and from there-on you can work on that enum instance, rather than String.
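Identifying the enum instance from the user's input could be sketched as follows. The parse method and the use of Optional are assumptions made for illustration; only the spellings for ADD come from the question, so the other operators would need their own cases.

```java
import java.util.Locale;
import java.util.Optional;

class OperatorParsingSketch {

    enum Operator { ADD, SUB, MUL, DIV }

    // Hypothetical helper: maps raw user input to an Operator, ignoring case
    // and surrounding whitespace. Unknown or null input yields an empty Optional.
    static Optional<Operator> parse(String input) {
        if (input == null) {
            return Optional.empty();
        }
        switch (input.trim().toLowerCase(Locale.ROOT)) {
            case "add":
            case "addition":
            case "+":
                return Optional.of(Operator.ADD);
            default:
                // SUB, MUL and DIV are left out: their accepted spellings
                // are not given in the question.
                return Optional.empty();
        }
    }
}
```

From that point on, the rest of the program can switch on the Operator value instead of comparing strings.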

In addition to @Rohit's answer, I would like to add this.
When comparing strings, a NullPointerException is thrown if oper is null. So it's always better to write
"addition".equalsIgnoreCase(oper)
instead of
oper.equalsIgnoreCase("addition")

If aDD is considered invalid input, you can consider the following approach:
ArrayList<String> possibleInputs = new ArrayList<String>();
possibleInputs.add("Add");
possibleInputs.add("add");
possibleInputs.add("Addition");
possibleInputs.add("addition");
possibleInputs.add("+");
if(possibleInputs.contains(oper))
{
// ...
}

Throw that whole bit of code inside a method called isOperationAddition(String s){...} that returns a boolean.
So this:
if (oper.equals("add") || oper.equals("Add") || oper.equals("addition") ||
oper.equals("Addition") || oper.equals("+")){...}
Changes to this
if (isOperationAddition(oper)){...}
Then inside that method, don't use Strings as branch material for your if statements. Have a variable that defines which kind of operation it is and "Keep the barbarians (confusion/ambiguous users) out at the moat". You should not be always iterating against a list to remember what operation we are dealing with.
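A minimal sketch of such a method, assuming a Set lookup on the lower-cased input is acceptable (the ADDITION_INPUTS constant and the class name are hypothetical; the accepted spellings are the ones from the question):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class AdditionCheckSketch {

    // Accepted spellings, stored in lower case.
    private static final Set<String> ADDITION_INPUTS =
            new HashSet<>(Arrays.asList("add", "addition", "+"));

    // Returns true if the input names the addition operation, ignoring case.
    // A null input is treated as "not addition" instead of throwing.
    static boolean isOperationAddition(String s) {
        return s != null && ADDITION_INPUTS.contains(s.toLowerCase(Locale.ROOT));
    }
}
```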

You can take the input, convert it to lower case, and then compare. Strings are immutable, so assign the result of toLowerCase():
str = str.toLowerCase();
then pass it to your if() statement
if(str.equals("add") || str.equals("addition") || str.equals("+"))

java : Handling Null check through a Combination

I am able to handle the null check on a String with the piece of code below
if (acct != null && !acct.isEmpty()|| !acct.equals(""))
What I mean by the above code is: if
Accountid is not equal to null, and
Accountid length is greater than 0
(these two form one combined check),
or
Accountid is not equal to ""
Does my code satisfy the combination I described above, or do I need to add brackets to group the first two checks?
Thanks

Yes it does: && is always evaluated before ||, i.e. your code is the same as
if ((acct != null && !acct.isEmpty()) || !acct.equals(""))
However, logically it does not make sense to me. Do you really need the last part? Isn't "acct.isEmpty()" the same as "acct.equals("")" in this specific instance?

isEmpty() and .equals("") are exactly the same condition. And your test will throw a NullPointerException if acct is null.
I don't understand exactly which test you want to make, but this one is wrong. Think about it once again, and implement a unit test to test all the cases:
null string,
empty string,
not empty string.
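The three cases can be checked against a small helper such as the sketch below. hasAccountId is a hypothetical name, and the sketch assumes the intended test is simply "not null and not empty"; originalCondition reproduces the expression from the question for comparison.

```java
class AccountCheckSketch {

    // Assumed intent of the question: true only for a non-null, non-empty id.
    static boolean hasAccountId(String acct) {
        return acct != null && !acct.isEmpty();
    }

    // The condition as written in the question. For a null acct the left side
    // of || is false, so acct.equals("") is evaluated and throws a
    // NullPointerException.
    static boolean originalCondition(String acct) {
        return acct != null && !acct.isEmpty() || !acct.equals("");
    }
}
```

For the empty string and for a non-empty string both methods agree; they differ only in how they react to null.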

As per your question, it should have brackets as below
if ((acct != null && !acct.isEmpty()) || !("".equals(acct) ))
After the || operator the code is changed so that it avoids a NullPointerException when acct is null. Note that the whole condition then evaluates to true for a null acct, because "".equals(null) is false.
This SO answer explains more about using "".equals().
https://stackoverflow.com/a/3321548/713414
