I have a class that computes the average number of words per sentence using lambdas in Java. The problem is that if corp is null or empty, I need to return 0. Currently I am getting NaN if corp is either null or empty. The rest of my code does what it should, but I cannot figure this part out.
public class AverageNumberOfWordsPerSentence extends TextMetric<Double> {
    @Override
    public Double apply(final Corpus corp) {
        Sentences sentences = new Sentences();
        List<String> sentenceList = sentences.apply(corp);
        LongSummaryStatistics lss = corp.texts().stream()
            .map(blob -> blob.text())
            .flatMap(string -> stream(string.split("\\W+")))
            .filter(string -> !string.isEmpty())
            .mapToLong(String::length)
            .summaryStatistics();
        return (double) lss.getCount() / sentenceList.size();
    }
}
Change the return statement to:
return sentenceList.isEmpty() ? 0.0 : (double)lss.getCount() / sentenceList.size();
And then hope that whoever told you “not to use control structures” will accept it. Strictly speaking, the ?: operator is a control structure, but it doesn’t have a keyword like if or while.
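To make the NaN behaviour concrete, here is a minimal sketch of the guarded return. The class name AverageGuardDemo and the helper average are made up for illustration; wordCount stands in for lss.getCount() and sentences for sentenceList. Note that the guard covers the empty case only, not a null corp.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AverageGuardDemo {
    // Hypothetical helper: wordCount plays the role of lss.getCount(),
    // sentences plays the role of sentenceList.
    static double average(long wordCount, List<String> sentences) {
        return sentences.isEmpty() ? 0.0 : (double) wordCount / sentences.size();
    }

    public static void main(String[] args) {
        // Floating-point division of zero by zero does not throw; it yields NaN.
        System.out.println((double) 0L / 0);                            // NaN
        System.out.println(average(0, Collections.emptyList()));        // 0.0
        System.out.println(average(6, Arrays.asList("One.", "Two.")));  // 3.0
    }
}
```

The cast to double happens before the division, so the unguarded expression is evaluated as 0.0 / 0, which is why the original code returns NaN rather than throwing an ArithmeticException.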
If I've got you right, then you need to use java.util.Optional:
class AverageNumberOfWordsPerSentence {
    public Double apply(final Corpus corp) {
        // ofNullable (not of), so a null corpus gives an empty Optional instead of an NPE
        return Optional.ofNullable(corp).map(c -> {
            Sentences sentences = new Sentences();
            List<String> sentenceList = sentences.apply(c);
            LongSummaryStatistics lss = c.texts().stream()
                .map(blob -> blob.text())
                .flatMap(string -> stream(string.split("\\W+")))
                .filter(string -> !string.isEmpty())
                .mapToLong(String::length)
                .summaryStatistics();
            return (double) lss.getCount() / sentenceList.size();
        })
        .filter(avg -> !avg.isNaN()) // an empty corpus gives 0.0 / 0 = NaN; drop it so orElse applies
        .orElse(0.0);                // 0.0, not 0: the Optional holds a Double
    }
}
From the OP's comment,
Corpus corpus = new Corpus("King", text); So if the string where king is is empty or null then I have to return 0.
it appears that there needs to be some conditional logic that bypasses the stream if a member of Corpus is null or empty. The OP didn't say what the name of the property that holds "King" is, so I'll assume it is getKing() for now.
Like what @nikelin posted, Optional will help you here. Using Optional.filter() you can branch without using control structures. For example, you could do this to test whether the "king" value is present: if it is null or an empty string, return 0; otherwise, get the text metrics:
return Optional.ofNullable(corp)
    .filter(c -> c.getKing() != null && !c.getKing().isEmpty()) // skip to the orElse() if it is null or empty
    .map(c -> c.texts()) // or .map(Corpus::texts)
    .map(t -> t.stream()...blah blah get the word count...)
    .map(count -> (double) count / sentences)
    .orElse(0.0);
Any sequence of successive .map() operations can be combined into one, your choice.
If the initial Optional.filter finds that your "king" property is neither null nor empty, the stream operation proceeds, getting the texts and calculating the word count as you specified already. It then maps the word count to wordCount/sentenceCount and returns that. If your king property is null or empty, the filter leaves the Optional empty, the map operations are skipped, and the value in orElse(0.0) is returned instead.
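As a runnable sketch of this idea, the following uses a made-up Corpus class with getKing() and getText() accessors (both names are assumptions, as is the naive sentence splitting on ., ! and ?). It is only meant to show how filter and orElse replace the if/else, not how the OP's real Corpus or Sentences classes work.

```java
import java.util.Arrays;
import java.util.Optional;

class OptionalFilterDemo {
    // Hypothetical stand-in for the OP's Corpus; both accessor names are assumptions.
    static class Corpus {
        private final String king;
        private final String text;
        Corpus(String king, String text) { this.king = king; this.text = text; }
        String getKing() { return king; }
        String getText() { return text; }
    }

    // Average words per sentence, or 0.0 when anything needed is null or empty.
    static double averageWordsPerSentence(Corpus corp) {
        return Optional.ofNullable(corp)
                .filter(c -> c.getKing() != null && !c.getKing().isEmpty())
                .map(Corpus::getText)   // a null text also empties the Optional
                .map(text -> {
                    long words = Arrays.stream(text.split("\\W+"))
                            .filter(w -> !w.isEmpty()).count();
                    // Naive sentence split, assumed here only for illustration.
                    long sentences = Arrays.stream(text.split("[.!?]+"))
                            .filter(s -> !s.trim().isEmpty()).count();
                    return (double) words / sentences;
                })
                // 0.0 / 0 is NaN and n / 0 is infinite; treat both as "no result".
                .filter(avg -> !avg.isNaN() && !avg.isInfinite())
                .orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(averageWordsPerSentence(new Corpus("King", "One two three. Four five."))); // 2.5
        System.out.println(averageWordsPerSentence(new Corpus("", "One two three.")));                // 0.0
        System.out.println(averageWordsPerSentence(null));                                            // 0.0
    }
}
```

The second filter is there because an empty text would otherwise produce NaN, which is the same problem the question started with.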
I searched the site and didn't find anything similar. I'm new to Java streams, but I understand that they can replace a loop. However, I would like to know if there is a way to filter a CSV file using a stream, as shown below, where only the repeated records are included in the result and grouped by the Center field.
Initial CSV file
Final result
In addition, the same pair cannot appear in the final result inversely, as shown in the table below:
This shouldn't happen
Is there a way to do it using stream and grouping at the same time, since theoretically, two loops would be needed to perform the task?
Thanks in advance.
You can do it in one pass as a stream with O(n) efficiency:
class PersonKey {
// have a field for every column that is used to detect duplicates
String center, name, mother, birthdate;
public PersonKey(String line) {
// implement String constructor
}
// implement equals and hashCode using all fields
}
List<String> lines; // the input
Set<PersonKey> seen = new HashSet<>();
List<String> duplicates = lines.stream()
    .filter(p -> !seen.add(new PersonKey(p))) // keep only lines whose key was already seen
    .distinct()
    .collect(toList());
The trick here is that a HashSet has constant time operations and its add() method returns false if the value being added is already in the set, true otherwise.
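A small self-contained sketch of that trick, using plain strings instead of the PersonKey class (the class name SeenSetDemo and the method repeats are made up for illustration):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SeenSetDemo {
    // Returns every element that already occurred earlier in the list.
    static List<String> repeats(List<String> items) {
        Set<String> seen = new HashSet<>();
        return items.stream()
                // add() returns false on the second and later occurrences
                .filter(item -> !seen.add(item))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(repeats(Arrays.asList("a", "b", "a", "c", "b", "a"))); // [a, b, a]
    }
}
```

Two caveats apply. The first occurrence of each duplicate is not part of the result, only the later ones. And the lambda mutates shared state, so this should only be used on a sequential stream.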
What I understood from your examples is that you consider an entry a duplicate if all of its attributes have the same value except the ID. You can use anyMatch for this:
list.stream().filter(x ->
    list.stream().anyMatch(y -> isDuplicate(x, y))).collect(Collectors.toList());
So what does the isDuplicate(x,y) do?
This returns a boolean. You can check whether all the entries have same value except the id in this method:
private boolean isDuplicate(CsvEntry x, CsvEntry y) {
return !x.getId().equals(y.getId())
&& x.getName().equals(y.getName())
&& x.getMother().equals(y.getMother())
&& x.getBirth().equals(y.getBirth());
}
I've assumed all the fields are Strings; change the checks according to your types. This will give you the duplicate entries with their corresponding IDs.
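The anyMatch approach compares every entry with every other entry, so it is quadratic. As an alternative sketch (a different technique, not what this answer proposes), the entries can be grouped by everything except the ID in one pass and only the groups with more than one member kept. The CsvEntry class below is a hypothetical stand-in with the same field names as the getters above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupDuplicatesDemo {
    // Hypothetical row type; field names follow the getters used in the answer.
    static class CsvEntry {
        private final String id, name, mother, birth;
        CsvEntry(String id, String name, String mother, String birth) {
            this.id = id; this.name = name; this.mother = mother; this.birth = birth;
        }
        String getId() { return id; }
        // Everything except the ID; a List supplies equals() and hashCode().
        List<String> key() { return Arrays.asList(name, mother, birth); }
    }

    // Returns the IDs of each set of duplicates, one inner list per set.
    static List<List<String>> duplicateIdGroups(List<CsvEntry> entries) {
        Map<List<String>, List<String>> idsByKey = entries.stream()
                .collect(Collectors.groupingBy(CsvEntry::key, LinkedHashMap::new,
                        Collectors.mapping(CsvEntry::getId, Collectors.toList())));
        return idsByKey.values().stream()
                .filter(ids -> ids.size() > 1)   // keep only keys seen more than once
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<CsvEntry> entries = Arrays.asList(
                new CsvEntry("1", "Ann", "Mary", "2000-01-01"),
                new CsvEntry("2", "Bob", "Sue", "2001-02-02"),
                new CsvEntry("3", "Ann", "Mary", "2000-01-01"));
        System.out.println(duplicateIdGroups(entries)); // [[1, 3]]
    }
}
```

Because each set of duplicates ends up in a single group, the same pair cannot appear twice in reversed order.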
I was going through a tutorial of Optional class here - https://www.geeksforgeeks.org/java-8-optional-class/ which has the following
String[] words = new String[10];
Optional<String> checkNull = Optional.ofNullable(words[5]);
if (checkNull.isPresent()) {
String word = words[5].toLowerCase();
System.out.print(word);
} else{
System.out.println("word is null");
}
I am trying to shorten it using the ifPresent check of Optional:
Optional.ofNullable(words[5]).ifPresent(a -> System.out.println(a.toLowerCase()))
but I am not able to add the else part:
Optional.ofNullable(words[5]).ifPresent(a -> System.out.println(a.toLowerCase())).orElse(); // doesn't work
Is there a way to do it?
Java-9
Java-9 introduced ifPresentOrElse for something similar in implementation. You could use it as :
Optional.ofNullable(words[5])
.map(String::toLowerCase) // mapped here itself
.ifPresentOrElse(System.out::println,
() -> System.out.println("word is null"));
Java-8
With Java-8, you can introduce an intermediate Optional or String and use it as:
Optional<String> optional = Optional.ofNullable(words[5])
.map(String::toLowerCase);
System.out.println(optional.isPresent() ? optional.get() : "word is null");
which can also be written as :
String value = Optional.ofNullable(words[5])
.map(String::toLowerCase)
.orElse("word is null");
System.out.println(value);
or if you don't want to store the value in a variable at all, use:
System.out.println(Optional.ofNullable(words[5])
.map(String::toLowerCase)
.orElse("word is null"));
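Put together as a runnable sketch (the class name OptionalElseDemo and the helper lowerOrDefault are made up for illustration), the Java 8 variant looks like this:

```java
import java.util.Optional;

class OptionalElseDemo {
    // Lower-cases the word, or falls back to a message when it is null.
    static String lowerOrDefault(String word) {
        return Optional.ofNullable(word)
                .map(String::toLowerCase)
                .orElse("word is null");
    }

    public static void main(String[] args) {
        String[] words = new String[10];               // every element starts out as null
        System.out.println(lowerOrDefault(words[5]));  // word is null
        words[5] = "HELLO";
        System.out.println(lowerOrDefault(words[5]));  // hello
    }
}
```

The "else" branch lives in orElse, so the printing happens once, outside the Optional chain.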
To be a bit clearer: ifPresent takes a Consumer as its argument and its return type is void, so you cannot chain any further calls onto it.
public void ifPresent(Consumer<? super T> consumer)
If a value is present, invoke the specified consumer with the value, otherwise do nothing.
Parameters:
consumer - block to be executed if a value is present
Throws:
NullPointerException - if value is present and consumer is null
So instead of ifPresent(), use map():
String result = Optional.ofNullable(words[5]).map(String::toLowerCase).orElse(null);
Or, just to print:
System.out.println(Optional.ofNullable(words[5]).map(String::toLowerCase).orElse(null));
If you are using Java 9, you can use the ifPresentOrElse() method:
https://docs.oracle.com/javase/9/docs/api/java/util/Optional.html#ifPresentOrElse-java.util.function.Consumer-java.lang.Runnable-
Optional.ofNullable(words[5]).ifPresentOrElse(
    value -> System.out.println(value.toLowerCase()),
    () -> System.out.println("null")
);
If you are on Java 8, take a look at this great cheat sheet:
http://www.nurkiewicz.com/2013/08/optional-in-java-8-cheat-sheet.html
Suppose I have some file having some data in comma separated format as below
TIMESTAMP,COUNTRYCODE,RESPONSETIME,FLAG
1544190995,US,500,Y
1723922044,GB,370,N
1711557214,US,750,Y
My requirement is, I want to read this file and filter data based on columns(TIMESTAMP and RESPONSETIME) and check whether the data is numeric or not.
I have tried the following, but it did not work. Can someone help me with this?
BufferedReader br = new BufferedReader(new FileReader(file));
rows = br.lines().map(line -> Arrays.asList(line.split(DELIMITER))).filter(a -> a.equals("TIMESTAMP")).collect(Collectors.toList());
Currently, after the map operation, you have a Stream<List<String>> and you're trying to compare that with a String, hence will never yield the expected outcome.
Now, to the solution; from what I can gather it seems that you want to retain the entire line if the TIMESTAMP and RESPONSETIME are valid integers.
One way to go about this is:
List<String> rows = br.lines()
.skip(1) // skip headers
.map(s -> new AbstractMap.SimpleEntry<>(s, s.split(DELIMITER)))
.filter(a -> isInteger(a.getValue()[0]) && isInteger(a.getValue()[2]))
.map(AbstractMap.SimpleEntry::getKey)
.collect(Collectors.toList());
and the isInteger function being defined as follows:
public static boolean isInteger(String input)
{
if(input == null || input.trim().isEmpty()) return false;
for (char c : input.toCharArray())
if (!Character.isDigit(c)) return false;
return true;
}
Alternatively, if you want to retrieve a List<String[]> where each array holds the individual fields of one line, you can do:
List<String[]> rows = br.lines()
.skip(1) // skip headers
.map(s -> s.split(DELIMITER))
.filter(a -> isInteger(a[0]) && isInteger(a[2]))
.collect(Collectors.toList());
Note, if the file being read only contains the data with no headers then there is no need to perform the skip operation.
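For reference, here is a self-contained sketch of the first variant that can be run without a file. It reads the sample data from a StringReader; the class name NumericRowFilterDemo, the method validRows, the malformed sample rows, and the column-count check are additions for illustration, not part of the answer above.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

class NumericRowFilterDemo {
    static final String DELIMITER = ",";

    // True only for non-empty strings made up entirely of digits.
    static boolean isInteger(String input) {
        return input != null && !input.isEmpty()
                && input.chars().allMatch(Character::isDigit);
    }

    // Keeps the whole line when TIMESTAMP (column 0) and RESPONSETIME (column 2) are numeric.
    static List<String> validRows(BufferedReader br) {
        return br.lines()
                .skip(1) // skip the header row
                .filter(line -> {
                    String[] cols = line.split(DELIMITER);
                    return cols.length > 2 && isInteger(cols[0]) && isInteger(cols[2]);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String data = "TIMESTAMP,COUNTRYCODE,RESPONSETIME,FLAG\n"
                + "1544190995,US,500,Y\n"
                + "17x3922044,GB,370,N\n"    // made-up bad timestamp
                + "1711557214,US,abc,Y\n";   // made-up bad response time
        System.out.println(validRows(new BufferedReader(new StringReader(data))));
        // [1544190995,US,500,Y]
    }
}
```

Splitting inside the filter avoids the intermediate SimpleEntry, at the cost of splitting again later if the individual fields are needed.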
Even if your filter matched, you would just get a list full of "TIMESTAMP", which isn't useful.
If the file format is always the same (same order and number of headers), you can just skip the first line, then read each data line and access only the columns you want to validate. It may also be better to use a for or while loop so you can terminate early.
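boolean allNumericData = true;
br.readLine(); // skip the header row
String line;
// stop at the first non-numeric value or at the end of the file
while (allNumericData && (line = br.readLine()) != null) {
    String[] row = line.split(DELIMITER);
    if (!isNumeric(row[0]) || !isNumeric(row[2])) {
        allNumericData = false;
    }
}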
If the headers can be different, open the file, read the first row to determine the indexes of the columns to validate, and then do the same as above with the indexes you found.
Also, treat the loop above as a sketch: you still need to implement isNumeric and handle rows that have fewer columns than expected.
Presently, this is what you are doing:
You read all the lines, which gives a Stream<String>.
You split each line, which gives a String[], and convert it to a List<String> (that is correct), but since you are in a map, the result is a Stream<List<String>>.
You filter, and a is a List<String>. You are trying to compare a List with a String.
Bip bip bop bop, problem!
As @YCF_L said, try it without lambdas.
You can as well use flatMap and later on filter only string containing digits:
List<String> timeAndResponse = br.lines()
.flatMap(s -> Arrays.stream(s.split(",")))
.filter(s -> s.chars().allMatch(Character::isDigit))
.collect(Collectors.toList());
In this case you are working with streams only:
.flatMap(s -> Arrays.stream(s.split(","))) takes an individual line from the file, splits it by ",", turns the resulting array into a stream, and flattens it. This gives us a Stream<String> whose elements are the individual fields of an original line such as 1544190995,US,500,Y. After that, filter keeps only the numeric strings. Finally, we collect everything into a List, which will contain the following values:
[1544190995, 500, 1723922044, 370, 1711557214, 750]
I hope this helps.
I currently have a multiple layer structure data that is like this:
Industry class has a private field Set<Company> that can be null.
Company class has a private field Set<Division> that can be null.
Division class has a private field Set<Group> that can be null.
Group class has a private field groupName that can be null and is
retrievable with a getter (getGroupName()).
I am trying to stream an instance of Industry all way down to the Group layer and concatenate all the groupName's into one String with "/" in between.
If this instance of Industry doesn't contain any groupName, return the string "null".
Based on my limited knowledge of Java 8, I am thinking of coding like this:
industry.stream()
.flatmap(industry -> industry.getCompanies().stream())
.filter(Objects::nonNull)
.flatmap(company -> company.getDivisions().stream())
.filter(Objects::nonNull)
.flatmap(division -> division.getGroups().stream())
.map(group -> group.getGroupName)
.collect(Collectors.joining("/")));
This code seems to be flawed in some way. Also, I am not sure where to add the logic that, if no groupName can be retrieved from the Industry, simply returns the string "null" instead of concatenating the groupNames into one string.
What is the proper way to use Java 8 stream in my situation?
Thanks.
Collectors.joining(…) is based on the class StringJoiner. It offers its delimiter, prefix, and suffix features, but unfortunately not the ability to provide the empty value.
To add that feature, we’ll have to re-implement Collectors.joining, which thankfully is not so hard when using StringJoiner.
Change the last line of your stream operation
.collect(Collectors.joining("/"));
to
.filter(Objects::nonNull) // elide all null elements
.collect(()->new StringJoiner("/", "", "").setEmptyValue("null"), // use "null" when empty
StringJoiner::add, StringJoiner::merge).toString();
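A minimal runnable sketch of this collector on a plain Stream<String> (the class name JoinOrNullDemo and the method joinOrNull are made up for illustration):

```java
import java.util.Objects;
import java.util.StringJoiner;
import java.util.stream.Stream;

class JoinOrNullDemo {
    // Joins the non-null names with "/", or yields "null" when nothing was added.
    static String joinOrNull(Stream<String> names) {
        return names
                .filter(Objects::nonNull)
                .collect(() -> new StringJoiner("/").setEmptyValue("null"),
                         StringJoiner::add, StringJoiner::merge)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(joinOrNull(Stream.of("a", null, "b")));   // a/b
        System.out.println(joinOrNull(Stream.<String>empty()));      // null
    }
}
```

The empty value is used only when add was never called, so a stream containing a single empty string still produces "" rather than "null".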
I understood your question as pretty much anything can be null. In this case you could create your own function to deal with this. I made one as such:
/**
 * Creates a stream function for the provided collection function which ignores all null values.
 * Will filter out null values passed into the collection function and null values from the resulting stream.
 * @param collectionFn
 * @param <T>
 * @param <R>
 * @return
 */
public static <T, R> Function<T, Stream<R>> nullSafeMapper(Function<T, Collection<R>> collectionFn) {
return (v) -> Optional.ofNullable(v)
.map(collectionFn)
.map(Collection::stream)
.orElse(Stream.empty())
.filter(Objects::nonNull);
}
Basically, it's completely null safe, filtering out anything that is null in the input and output, and could be used as such:
industries.stream()
.flatMap(SO46101593.nullSafeMapper(Industry::getCompanies))
.flatMap(SO46101593.nullSafeMapper(Company::getDivisions))
.flatMap(SO46101593.nullSafeMapper(Division::getGroups))
.map(group -> group.getGroupName())
.filter(Objects::nonNull) // filter out null group names
.collect(Collectors.joining("/"));
You could also take that logic and push it down directly into your expression but since it has to be repeated 3 times it gets a bit... verbose and repetitive
Here is an example with null checks:
String s = industries.stream()
.filter( i -> i.getCompanies() != null ).flatMap( i -> i.getCompanies().stream() )
.filter( c -> c != null && c.getDivisions() != null ).flatMap( c -> c.getDivisions().stream() )
.filter( d -> d != null && d.getGroups() != null ).flatMap( d -> d.getGroups().stream() )
.filter( g -> g != null && g.getGroupName() != null ).map( g -> g.getGroupName() )
.collect( Collectors.joining("/") );
You can replace Collectors.joining("/") with Holger's example.
This should do it:
Stream.of(industry)
.map(Industry::getCompanies).filter(Objects::nonNull)
.flatMap(Set::stream)
.map(Company::getDivisions).filter(Objects::nonNull)
.flatMap(Set::stream)
.map(Division::getGroups).filter(Objects::nonNull)
.flatMap(Set::stream)
.map(Group::getGroupName).filter(Objects::nonNull)
.collect(Collectors.collectingAndThen(Collectors.joining("/"),
names -> names.isEmpty() ? "null" : names));
I'm assuming industry.stream() is incorrect, since you say you're working from "an instance of Industry". Instead, I make a Stream<Industry> with one element.
You need to do the null checks on the sets before you try to call stream on them. You're checking whether stream returns null, which is too late.
The final transform from empty result to "null" falls under the concept of the "finisher" function in Collector. Collectors.joining doesn't let you specify a finisher directly, but you can use Collectors.collectingAndThen to add a finisher to any existing Collector.
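To try the pipeline above end to end, here is a sketch with minimal model classes. Only the getters come from the question; the constructors, the setOf helper, and the class name GroupNamesDemo are assumptions made for the demo.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupNamesDemo {
    // Hypothetical minimal model; any of the sets (and the group name) may be null.
    static class Group {
        private final String groupName;
        Group(String groupName) { this.groupName = groupName; }
        String getGroupName() { return groupName; }
    }
    static class Division {
        private final Set<Group> groups;
        Division(Set<Group> groups) { this.groups = groups; }
        Set<Group> getGroups() { return groups; }
    }
    static class Company {
        private final Set<Division> divisions;
        Company(Set<Division> divisions) { this.divisions = divisions; }
        Set<Division> getDivisions() { return divisions; }
    }
    static class Industry {
        private final Set<Company> companies;
        Industry(Set<Company> companies) { this.companies = companies; }
        Set<Company> getCompanies() { return companies; }
    }

    // Demo helper: an insertion-ordered set, so the joined output is predictable.
    @SafeVarargs
    static <T> Set<T> setOf(T... items) {
        return new LinkedHashSet<>(Arrays.asList(items));
    }

    static String groupNames(Industry industry) {
        return Stream.of(industry)
                .map(Industry::getCompanies).filter(Objects::nonNull)
                .flatMap(Set::stream)
                .map(Company::getDivisions).filter(Objects::nonNull)
                .flatMap(Set::stream)
                .map(Division::getGroups).filter(Objects::nonNull)
                .flatMap(Set::stream)
                .map(Group::getGroupName).filter(Objects::nonNull)
                // the finisher turns an empty join result into "null"
                .collect(Collectors.collectingAndThen(Collectors.joining("/"),
                        names -> names.isEmpty() ? "null" : names));
    }

    public static void main(String[] args) {
        Industry filled = new Industry(setOf(
                new Company(setOf(new Division(setOf(new Group("A"), new Group(null))))),
                new Company(null),
                new Company(setOf(new Division(setOf(new Group("B")))))));
        System.out.println(groupNames(filled));              // A/B
        System.out.println(groupNames(new Industry(null)));  // null
    }
}
```

Note that this sketch, like the pipeline it wraps, guards against null sets and null group names but not against null elements inside a set.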
I wasn't sure how exactly to frame this question, so bear with me...
1) Is there a better (aka more "proper") way to instantiate a Stream of optional elements, other than adding null and subsequently filtering out null's?
Stream.of( ... ,
person.likesRed() ? Color.RED : null)
.filter(Objects::nonNull)
...
2) Secondly, is there a way to "inline" the following orElseGet function into the parent Stream/map?
.map(p -> ofNullable(p.getFavouriteColours()).orElseGet(fallbackToDefaultFavouriteColours))
The full (contrived) example:
import static java.util.Optional.ofNullable;
public Response getFavouriteColours(final String personId) {
Person person = personService.findById(personId);
Supplier<List<String>> fallbackToDefaultFavouriteColours = () ->
Stream.of(
Color.BLUE,
Color.GREEN,
person.likesRed() ? Color.RED : null)
.filter(Objects::nonNull)
.map(Color::getName)
.collect(Collectors.toList());
return ofNullable(person)
.map(p -> ofNullable(p.getFavouriteColours()).orElseGet(fallbackToDefaultFavouriteColours))
.map(Response::createSuccess)
.orElseGet(Response::createNotFound);
}
A cleaner expression would be
Stream.concat(Stream.of(Color.BLUE, Color.GREEN),
person.likesRed()? Stream.of(Color.RED): Stream.empty())
This isn’t simpler than your original expression, but it doesn’t create the bad feeling of inserting something just to filter it out afterwards or, more abstract, of discarding an already known information that has to be reconstructed afterwards.
There is even a technical difference. The expression above creates a Stream that has a known size, which can be used to optimize certain operations. In contrast, the variant using filter only has an estimated size, which will be the number of elements before filtering, but not a known exact size.
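One way to observe this difference is to ask each stream's spliterator for its exact size. The sketch below uses plain strings instead of the Color type, and the class and method names are made up; the values in the comments are what the OpenJDK implementation reports, where -1 means "not known".

```java
import java.util.Objects;
import java.util.stream.Stream;

class KnownSizeDemo {
    // Size reported by the concat variant.
    static long concatSize(boolean likesRed) {
        return Stream.concat(Stream.of("BLUE", "GREEN"),
                             likesRed ? Stream.of("RED") : Stream.<String>empty())
                .spliterator().getExactSizeIfKnown();
    }

    // Size reported by the "add null, then filter" variant.
    static long filterSize(boolean likesRed) {
        return Stream.of("BLUE", "GREEN", likesRed ? "RED" : null)
                .filter(Objects::nonNull)
                .spliterator().getExactSizeIfKnown();
    }

    public static void main(String[] args) {
        System.out.println(concatSize(true));   // 3
        System.out.println(concatSize(false));  // 2
        System.out.println(filterSize(true));   // -1, the exact size is lost after filter
    }
}
```

A known size lets operations such as toArray allocate the right capacity up front instead of growing a buffer.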
The surrounding code can be greatly simplified by not overusing Optional:
public Response getFavouriteColours(final String personId) {
Person person = personService.findById(personId);
if(person == null) return Response.createNotFound();
List<String> favouriteColours = person.getFavouriteColours();
if(favouriteColours == null)
favouriteColours = Stream.concat(
Stream.of(Color.BLUE, Color.GREEN),
person.likesRed()? Stream.of(Color.RED): Stream.empty())
.map(Color::getName)
.collect(Collectors.toList());
return Response.createSuccess(favouriteColours);
}
Even the Stream operation itself is not simpler than conventional imperative code here:
public Response getFavouriteColours(final String personId) {
Person person = personService.findById(personId);
if(person==null) return Response.createNotFound();
List<String> favouriteColours = person.getFavouriteColours();
if(favouriteColours==null) {
favouriteColours=new ArrayList<>();
Collections.addAll(favouriteColours, Color.BLUE.getName(), Color.GREEN.getName());
if(person.likesRed()) favouriteColours.add(Color.RED.getName());
}
return Response.createSuccess(favouriteColours);
}
though it’s likely that a more complex example would benefit from the Stream API use, whereas the use of Optional is unlikely to get better with more complex operations. A chain of Optional operations can simplify the code if all absent values or filter mismatches within the chain are supposed to be handled the same way at the end of the chain. If, however, like in your example (and most real life scenarios) every absent value should get a different treatment or be reported individually, using Optional, especially the nested use of Optionals, does not improve the code.