Replace for loop with parallelStream - Java

I have the following method that calls itself recursively:
public ArrayList<SpecTreeNode> getLeavesBelow()
{
ArrayList<SpecTreeNode> result = new ArrayList<>();
if (isLeaf())
{
result.add(this);
}
for (SpecTreeNode stn : chList)
{
result.addAll(stn.getLeavesBelow());
}
return result;
}
I'd like to convert the for loop to use parallelStream. I think I'm partly there but not sure how to implement .collect() to 'addAll' to result:
chList.parallelStream()
.map(SpecTreeNode::getLeavesBelow)
.collect();
Some assistance would be much appreciated.

Just like this, right? Am I missing something?
result.addAll(
chList.parallelStream()
.map(SpecTreeNode::getLeavesBelow)
.flatMap(Collection::stream)
.collect(Collectors.toList())
);
Unrelated to your question but because you're seeking performance improvements: you may see some gains by specifying an initial size for your ArrayList to avoid reallocating multiple times.
A LinkedList may be a preferable data structure if you can't anticipate the size, as all you're doing here is continually appending to the end of the list. However, if you need to randomly access elements of this list later then it might not be.

I would do it by making the recursive method return a Stream of nodes instead of a List, then filter to keep only the leaves and finally collect to a list:
public List<SpecTreeNode> getLeavesBelow() {
return nodesBelow(this)
.parallel()
.filter(SpecTreeNode::isLeaf)
.collect(Collectors.toList());
}
private Stream<SpecTreeNode> nodesBelow(SpecTreeNode node) {
return Stream.concat(
Stream.of(node),
node.chList.stream()
.flatMap(this::nodesBelow));
}
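For anyone who wants to try the stream-returning approach end to end, here is a minimal self-contained sketch. The TreeNode class, its children field, and the addChild helper are stand-ins invented for this example (the real SpecTreeNode is not shown in the question); only the shape of the recursion follows the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical stand-in for SpecTreeNode; all names here are assumptions.
class TreeNode {
    final String name;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) {
        this.name = name;
    }

    // Adds a child and returns this node so calls can be chained.
    TreeNode addChild(TreeNode child) {
        children.add(child);
        return this;
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    // Lazily streams this node followed by all of its descendants (pre-order).
    Stream<TreeNode> nodesBelow() {
        return Stream.concat(
                Stream.of(this),
                children.stream().flatMap(TreeNode::nodesBelow));
    }

    // Keeps only the leaves; collecting an ordered stream to a list
    // preserves encounter order even when the stream is parallel.
    List<TreeNode> getLeavesBelow() {
        return nodesBelow()
                .parallel()
                .filter(TreeNode::isLeaf)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("root")
                .addChild(new TreeNode("a")
                        .addChild(new TreeNode("a1"))
                        .addChild(new TreeNode("a2")))
                .addChild(new TreeNode("b"));
        List<String> names = root.getLeavesBelow().stream()
                .map(n -> n.name)
                .collect(Collectors.toList());
        System.out.println(names); // [a1, a2, b]
    }
}
```

Whether .parallel() actually helps depends on the size and shape of the tree, so it is worth measuring before keeping it.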

Related

filtering a stream against items in another list

I'm trying to filter a stream against data within a different list.
It works, but I use a for loop in the middle of the stream. I cannot find any information on how to convert the for loop to a stream.
I could just .stream() the selection.getItems(), then .forEach() and have a new .stream() of DATA.accounts, but that is poor code as it would have to restream on every .forEach.
y=1;
DATA.accounts.stream()
.flatMap(estimate -> estimate.getElements().stream())
.filter( ele-> {
// different list;
for (Element element:selection.getItems()){
if (element.getId()==ele.getId()){
return true;
}
}
return false;
})
.forEach(element -> {
element.setDateSchedualed(selectedDate);
element.setOrder(y);
y++;
});
I think what you really need is:
list1.removeAll(list2);
No streams involved though.
You can express the filter as
.filter(ele -> selection.getItems().stream()
.anyMatch(element -> element.getId()==ele.getId()))
The fact that this “would have to restream” shouldn’t bother you more than the fact that the original code will loop for every element. You have created an operation with O(n×m) time complexity in either case. This is acceptable if you can surely predict that one of these lists will always be very small.
Otherwise, there is no way around preparing this operation by storing the id values in a structure with a fast (O(1) in the best case) lookup. I.e.
Set<IdType> id = selection.getItems().stream()
.map(element -> element.getId())
.collect(Collectors.toSet());
…
.filter(ele -> id.contains(ele.getId()))
Besides that, your forEach approach incrementing the y variable clearly is an anti-pattern and it doesn’t even compile, when y is a local variable. And if y is a field, it would make this code even worse. Here, it’s much cleaner to accept a temporary storage into a List:
Set<IdType> id = selection.getItems().stream()
.map(element -> element.getId())
.collect(Collectors.toSet());
List<ElementType> list = DATA.accounts.stream()
.flatMap(estimate -> estimate.getElements().stream())
.filter(ele -> id.contains(ele.getId()))
.collect(Collectors.toList());
IntStream.range(0, list.size())
.forEach(ix -> {
ElementType element = list.get(ix);
element.setDateSchedualed(selectedDate);
element.setOrder(ix+1);
});
Put the other list's IDs in a Set selectedIds, then filter based on ele-> selectedIds.contains(ele.getId()).
That will give you (amortized) linear time complexity.
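A minimal sketch of the Set-based filter described above. The Element class here is a hypothetical stand-in with only an id, and filterBySelection is a name made up for this example; the real types from the question are not shown.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SetFilterDemo {
    // Hypothetical element type; stands in for the question's Element.
    static class Element {
        final int id;

        Element(int id) {
            this.id = id;
        }

        int getId() {
            return id;
        }
    }

    // Keeps the elements of 'all' whose id also appears in 'selected'.
    static List<Element> filterBySelection(List<Element> all, List<Element> selected) {
        // One pass over 'selected' builds the lookup set.
        Set<Integer> selectedIds = selected.stream()
                .map(Element::getId)
                .collect(Collectors.toSet());
        // One pass over 'all'; each contains() call is a hash lookup.
        return all.stream()
                .filter(e -> selectedIds.contains(e.getId()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Element> all = Arrays.asList(
                new Element(1), new Element(2), new Element(3), new Element(4));
        List<Element> selected = Arrays.asList(new Element(4), new Element(2));
        List<Integer> keptIds = filterBySelection(all, selected).stream()
                .map(Element::getId)
                .collect(Collectors.toList());
        System.out.println(keptIds); // [2, 4]
    }
}
```

The result follows the order of the streamed list, not the order of the selection.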
Since you need to check presence among all elements in selected for each item in the stream, I don't expect there will be any straightforward method using only streams (because you cannot really stream the selected collection for this task).
I think there is actually nothing wrong with using a for-each loop if you want to search for the id in linear time, because for example if your items list was an ArrayList and you used its contains method for filtering, it would actually also just loop over the elements. You could write a general contains function like:
public static <E1, E2> boolean contains(Collection<E1> collection, E2 e2, BiPredicate<E1, E2> predicate){
for (E1 e1 : collection){
if (predicate.test(e1, e2)){
return true;
}
}
return false;
}
and replace your for-each loop with it:
ele -> contains(selection.getItems(), ele, (e1, e2) -> e1.getId() == e2.getId())

Java stream - verify at least one element in a list contains in another

I have a Map. Let's say
Map<Long, Train>
Each Train has a list
List<Integer> parts = train.getTrainParts()
I have another list
List<Integer> blueParts;
I want to iterate the map and collect all trains that have at least one blue part.
This is a naive usage of Streams :
trainMap().values().stream().filter(train -> {
boolean found = false;
for (Integer part : train.getTrainParts()) {
if (blueParts.contains(part)) {
found = true;
}
}
return found;
}).collect(Collectors.toList());
What are better options ?
Stream or not stream?
e.g.
tagDataContainer.getDeliveryGroupMap().values().stream().filter(dg -> {
Sets.SetView<Long> intersection = Sets.intersection(Sets.newHashSet(dg.getPlacements()), Sets.newHashSet(placementsToChangeStatusToPublish));
return intersection.size()>0;
}
);
You can simplify the filter :
List<Train> blueTrains =
trainMap().values()
.stream()
.filter(t-> t.getTrainParts().stream().anyMatch(p->blueParts.contains(p)))
.collect(Collectors.toList());
And if you can change blueParts to be a HashSet instead of a List, your code would run faster, since blueParts.contains() would require constant time instead of linear time.
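As a rough sketch of that suggestion, the blue parts can be copied into a HashSet once and the filter left as it is. The Train class below is a minimal stand-in (an assumption, since the question does not show it), and trainsWithBluePart is a name invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class BlueTrainsDemo {
    // Minimal stand-in for the question's Train class.
    static class Train {
        final String name;
        final List<Integer> parts;

        Train(String name, Integer... parts) {
            this.name = name;
            this.parts = Arrays.asList(parts);
        }

        List<Integer> getTrainParts() {
            return parts;
        }
    }

    // Returns the trains that contain at least one of the blue parts.
    static List<Train> trainsWithBluePart(Map<Long, Train> trainMap, List<Integer> blueParts) {
        // Copy once so that each contains() is a hash lookup instead of a list scan.
        Set<Integer> blue = new HashSet<>(blueParts);
        return trainMap.values().stream()
                .filter(t -> t.getTrainParts().stream().anyMatch(blue::contains))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, which makes the printed order predictable.
        Map<Long, Train> trainMap = new LinkedHashMap<>();
        trainMap.put(1L, new Train("t1", 1, 2));
        trainMap.put(2L, new Train("t2", 3));
        trainMap.put(3L, new Train("t3", 4, 5));
        List<String> names = trainsWithBluePart(trainMap, Arrays.asList(2, 5)).stream()
                .map(t -> t.name)
                .collect(Collectors.toList());
        System.out.println(names); // [t1, t3]
    }
}
```

anyMatch stops at the first matching part, so a train is not scanned further once one blue part is found.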
Based on your question ("I want to iterate the map and collect all trains that have at least one blue part"), I think you are looking for trains that have at least one element in the blue-parts list. Assuming Train exposes a getBlueParts() method (the question does not show one), you could do something like this:
List<Train> trainList = trainMap.values().stream().filter(t -> t.getBlueParts().size()>0).collect(Collectors.toList());

Modifying Objects within stream in Java8 while iterating

In Java8 streams, am I allowed to modify/update objects within?
For eg. List<User> users:
users.stream().forEach(u -> u.setProperty("value"))
Yes, you can modify the state of objects inside your stream, but most often you should avoid modifying the state of the stream's source. From the non-interference section of the stream package documentation we can read that:
For most data sources, preventing interference means ensuring that the data source is not modified at all during the execution of the stream pipeline. The notable exception to this are streams whose sources are concurrent collections, which are specifically designed to handle concurrent modification. Concurrent stream sources are those whose Spliterator reports the CONCURRENT characteristic.
So this is OK
List<User> users = getUsers();
users.stream().forEach(u -> u.setProperty(value));
//                     ^    ^^^^^^^^^^^^^
//                      \__/
but this in most cases is not
users.stream().forEach(u -> users.remove(u));
//^^^^^                     ^^^^^^^^^^^^
//     \_____________________/
and may throw ConcurrentModificationException or even other unexpected exceptions like NPE:
List<Integer> list = IntStream.range(0, 10).boxed().collect(Collectors.toList());
list.stream()
.filter(i -> i > 5)
.forEach(i -> list.remove(i)); //throws NullPointerException
The functional way would imho be:
import static java.util.stream.Collectors.toList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
public class PredicateTestRun {
public static void main(String[] args) {
List<String> lines = Arrays.asList("a", "b", "c");
System.out.println(lines); // [a, b, c]
Predicate<? super String> predicate = value -> "b".equals(value);
lines = lines.stream().filter(predicate.negate()).collect(toList());
System.out.println(lines); // [a, c]
}
}
In this solution the original list is not modified; the expected result ends up in a new list that is assigned to the same variable as the old one.
To do structural modification on the source of the stream, as Pshemo mentioned in his answer, one solution is to create a new instance of a Collection like ArrayList with the items inside your primary list; iterate over the new list, and do the operations on the primary list.
new ArrayList<>(users).stream().forEach(u -> users.remove(u));
You can make use of the removeIf to remove data from a list conditionally.
Eg:- If you want to remove all even numbers from a list, you can do it as follows.
final List<Integer> list = IntStream.range(1,100).boxed().collect(Collectors.toList());
list.removeIf(number -> number % 2 == 0);
To get rid of the ConcurrentModificationException, use a CopyOnWriteArrayList.
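A small sketch of the CopyOnWriteArrayList idea, reusing the 0..9 list from the earlier answer. The class and method names are invented for this example. Streaming a CopyOnWriteArrayList works on a snapshot of its backing array, so removing from the list inside forEach does not disturb the traversal.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CopyOnWriteDemo {
    // Builds 0..9 in a CopyOnWriteArrayList and removes every value above 5
    // while streaming over the same list.
    static List<Integer> removeGreaterThanFive() {
        List<Integer> list = IntStream.range(0, 10).boxed()
                .collect(Collectors.toCollection(CopyOnWriteArrayList::new));
        list.stream()
                .filter(i -> i > 5)
                // The cast selects remove(Object) rather than remove(int index).
                .forEach(i -> list.remove((Object) i));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeGreaterThanFive()); // [0, 1, 2, 3, 4, 5]
    }
}
```

Every write to a CopyOnWriteArrayList copies the backing array, so for bulk removal from an ordinary list removeIf is usually the cheaper choice.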
Instead of creating strange things, you can just filter() and then map() your result.
This is much more readable and sure. Streams will make it in only one loop.
As was mentioned before, you can't modify the original list, but you can stream, modify, and collect the items into a new list. Here is a simple example of how to modify a string element.
public class StreamTest {
@Test
public void replaceInsideStream() {
List<String> list = Arrays.asList("test1", "test2_attr", "test3");
List<String> output = list.stream().map(value -> value.replace("_attr", "")).collect(Collectors.toList());
System.out.println("Output: " + output); // Output: [test1, test2, test3]
}
}
.peek() is the answer.
users.stream().peek(u -> u.setProperty("value")).forEach(i->{
...
...
});
for new list
users.stream().peek(u -> u.setProperty("value")).collect(Collectors.toList());
This might be a little late. But here is one of the usage. This to get the count of the number of files.
Create a holder object (a new obj in this case) and modify a property of that object. A lambda can only capture local variables that are final or effectively final, so if you declare count as a plain local variable and try to increment it within the stream, the code will not compile in the first place.
Path path = Paths.get("/Users/XXXX/static/test.txt");
Count c = new Count();
c.setCount(0);
Files.lines(path).forEach(item -> {
c.setCount(c.getCount()+1);
System.out.println(item);});
System.out.println("line count,"+c);
public static class Count{
private int count;
public int getCount() {
return count;
}
public void setCount(int count) {
this.count = count;
}
@Override
public String toString() {
return "Count [count=" + count + "]";
}
}
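The same holder idea can be sketched with the JDK's own AtomicInteger instead of a hand-written Count class. The example below uses an in-memory list of lines rather than a file so that it runs anywhere; the class and method names are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class LineCountDemo {
    // Counts lines by mutating a holder object from inside the lambda.
    // The reference 'counter' is effectively final; only its contents change.
    static int countWithHolder(List<String> lines) {
        AtomicInteger counter = new AtomicInteger();
        lines.forEach(line -> counter.incrementAndGet());
        return counter.get();
    }

    // When only the count is needed, the stream can report it directly.
    static long countWithStream(List<String> lines) {
        return lines.stream().count();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("first", "second", "third");
        System.out.println(countWithHolder(lines)); // 3
        System.out.println(countWithStream(lines)); // 3
    }
}
```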
Yes, you can modify or update the values of objects in the list in your case likewise:
users.stream().forEach(u -> u.setProperty("some_value"))
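However, the statement above updates the source objects themselves, which may not be acceptable in every case.
If the originals must stay untouched, map each element to a modified copy instead (this assumes User has a copy constructor, which the question does not show):
List<User> updatedUsers = users.stream().map(u -> { User copy = new User(u); copy.setProperty("some_value"); return copy; }).collect(Collectors.toList());
This returns a new list of updated copies and leaves the objects in the old list unchanged.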

Get size of an Iterable in Java

I need to figure out the number of elements in an Iterable in Java.
I know I can do this:
Iterable values = ...
it = values.iterator();
while (it.hasNext()) {
it.next();
sum++;
}
I could also do something like this, because I do not need the objects in the Iterable any further:
it = values.iterator();
while (it.hasNext()) {
it.remove();
sum++;
}
A small scale benchmark did not show much performance difference, any comments or other ideas for this problem?
TL;DR: Use the utility method Iterables.size(Iterable) of the great Guava library.
Of your two code snippets, you should use the first one, because the second one will remove all elements from values, so it is empty afterwards. Changing a data structure for a simple query like its size is very unexpected.
For performance, this depends on your data structure. If it is for example in fact an ArrayList, removing elements from the beginning (what your second method is doing) is very slow (calculating the size becomes O(n*n) instead of O(n) as it should be).
In general, if there is a chance that values is actually a Collection and not only an Iterable, check for this and call size() in that case:
if (values instanceof Collection<?>) {
return ((Collection<?>)values).size();
}
// use Iterator here...
The call to size() will usually be much faster than counting the number of elements, and this trick is exactly what Iterables.size(Iterable) of Guava does for you.
If you are working with Java 8 you may use:
Iterable values = ...
long size = values.spliterator().getExactSizeIfKnown();
It will only work if the iterable source has a determined size; otherwise getExactSizeIfKnown() returns -1. Most Spliterators for Collections will report a size, but you may have issues if it comes from a HashSet or ResultSet, for instance.
See the Spliterator javadoc for details.
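To make the "determined size" caveat concrete, here is a small sketch that asks the spliterator first and falls back to counting when the answer is -1. The sizeOf helper and the class name are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.StreamSupport;

class IterableSizeDemo {
    // Uses the exact size when the spliterator knows it, otherwise counts the elements.
    static long sizeOf(Iterable<?> values) {
        long exact = values.spliterator().getExactSizeIfKnown();
        if (exact >= 0) {
            return exact;
        }
        // Size unknown: walk the elements once and count them.
        return StreamSupport.stream(values.spliterator(), false).count();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        // A bare Iterable only exposes iterator(), so the default spliterator
        // it inherits cannot report a size.
        Iterable<String> opaque = list::iterator;

        System.out.println(list.spliterator().getExactSizeIfKnown());   // 3
        System.out.println(opaque.spliterator().getExactSizeIfKnown()); // -1
        System.out.println(sizeOf(opaque));                             // 3
    }
}
```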
If Java 8 is not an option, or if you don't know where the iterable comes from, you can use the same approach as guava:
if (iterable instanceof Collection) {
return ((Collection<?>) iterable).size();
} else {
int count = 0;
Iterator iterator = iterable.iterator();
while(iterator.hasNext()) {
iterator.next();
count++;
}
return count;
}
This is perhaps a bit late, but it may help someone. I came across a similar issue with Iterable in my codebase, and the solution was to use a for-each loop without explicitly calling values.iterator().
int size = 0;
for(T value : values) {
size++;
}
You can copy your iterable into a list and then call .size() on it. Note that this creates a new list holding every element.
Lists.newArrayList(iterable).size();
For the sake of clarity, the above method will require the following import:
import com.google.common.collect.Lists;
Strictly speaking, an Iterable does not have a size. Think of a data structure like a cycle.
Consider the following Iterable instance, which has no size:
new Iterable(){
@Override public Iterator iterator() {
return new Iterator(){
@Override
public boolean hasNext() {
return isExternalSystemAvailble();
}
@Override
public Object next() {
return fetchDataFromExternalSystem();
}};
}};
Java 8 and above:
StreamSupport.stream(data.spliterator(), false).count();
I would go for it.next() for the simple reason that next() is guaranteed to be implemented, while remove() is an optional operation.
E next()
Returns the next element in the iteration.
void remove()
Removes from the underlying collection the last element returned by the iterator (optional operation).
As for me, these are just different methods. The first one leaves the object you're iterating on unchanged, while the second leaves it empty.
The question is what do you want to do.
The complexity of removing is based on implementation of your iterable object.
If you're using Collections, just obtain the size as proposed by Kazekage Gaara; it's usually the best approach performance-wise.
Why don't you simply use the size() method on your Collection to get the number of elements?
Iterator is just meant to iterate, nothing else.
Instead of using loops and counting each element, or using any third-party library, we can simply cast the iterable to ArrayList and get its size. This only works if the iterable really is an ArrayList; otherwise the cast throws a ClassCastException.
((ArrayList) iterable).size();

Remove items of type from List

I keep coming across techniques like the code below, where I need to filter out a type of enumeration from a list.
Are there any more efficient ways to do this?
private List<TestResult> removeInfo(List<TestResult> testResults) {
List<TestResult> tmpT = new ArrayList<TestResult>();
for(TestResult t : testResults) {
if(!t.getSeverity().equals(Severity.INFO)) {
tmpT.add(t);
}
}
return tmpT;
}
Collections would be my first thought but not sure.
Cheers
D
You can filter the list in-place using an iterator:
Iterator<TestResult> iterator = testResults.iterator();
while(iterator.hasNext()) {
TestResult result = iterator.next();
if (result.getSeverity().equals(Severity.INFO)) {
iterator.remove();
}
}
//Now testResults contains all elements that do not have Severity of INFO
This is an example of Marvo's comment. It's more memory efficient but just as time efficient as your code. Again, as Marvo said, you could see some improvement by using a LinkedList. However, unless you're dealing with huge lists of data I don't think you'll notice a difference.
Google's Guava has some efficient ways to do this:
private Iterable<TestResult> removeInfo(List<TestResult> list){
return Iterables.filter(list, new Predicate<TestResult>(){
public boolean apply(TestResult input){
return !input.getSeverity().equals(Severity.INFO);
}
});
}
This does not copy the data at all, and still works if the List's Iterator does not support remove.
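On Java 8 or later the same filtering can be sketched without Guava. This is not part of the answers above; the Severity values other than INFO, the minimal TestResult class, and the method names are assumptions made so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RemoveInfoDemo {
    // Only INFO comes from the question; the other constants are placeholders.
    enum Severity { INFO, WARNING, ERROR }

    // Minimal stand-in for the question's TestResult.
    static class TestResult {
        final Severity severity;

        TestResult(Severity severity) {
            this.severity = severity;
        }

        Severity getSeverity() {
            return severity;
        }
    }

    // Returns a new list without the INFO results; the input is left untouched.
    static List<TestResult> withoutInfo(List<TestResult> results) {
        return results.stream()
                .filter(r -> r.getSeverity() != Severity.INFO)
                .collect(Collectors.toList());
    }

    // Removes the INFO results from the given list in place.
    // The list must be modifiable (for example an ArrayList).
    static void removeInfoInPlace(List<TestResult> results) {
        results.removeIf(r -> r.getSeverity() == Severity.INFO);
    }

    public static void main(String[] args) {
        List<TestResult> results = new ArrayList<>(Arrays.asList(
                new TestResult(Severity.INFO),
                new TestResult(Severity.ERROR),
                new TestResult(Severity.INFO),
                new TestResult(Severity.WARNING)));
        System.out.println(withoutInfo(results).size()); // 2
        removeInfoInPlace(results);
        System.out.println(results.size()); // 2
    }
}
```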
