How to iterate two lists simultaneously using Java 8 - java

I have the two lists below:
List<Map<String, String>> mapList
List<MyObject> myObjectList
Both lists have the same size.
Currently I am iterating over them with a for loop, as below.
List<CustomObject> customObjectList1 = new ArrayList<>();
List<CustomObject> customObjectList2 = new ArrayList<>();
int i = 0;
for (MyObject myObject : myObjectList) {
    if ("NEW".equalsIgnoreCase(myObject.getType())) {
        customObjectList1.add(constructCustomObject(myObject, mapList.get(i)));
    }
    if ("DELETE".equalsIgnoreCase(myObject.getType())) {
        customObjectList2.add(constructCustomObject(myObject, mapList.get(i)));
    }
    i++;
}
if(!customObjectList1.isEmpty()){
jpaRepo.saveAll(customObjectList1);
}
if(!customObjectList2.isEmpty()){
jpaRepo.deleteAll(customObjectList2);
}
Is there a better or more efficient way to iterate over two lists simultaneously using Java 8?

It seems your issue centers on knowing the index of the object you are iterating over.
If you want to do it the stream way, you can try something like the code below.
Disclaimer: I do not think it will have a big impact on performance, since it neither reduces the number of iterations nor frees any objects for the GC.
IntStream.range(0, myObjectList.size())
        .forEach(idx -> {
            MyObject myObject = myObjectList.get(idx);
            if ("NEW".equalsIgnoreCase(myObject.getType())) {
                customObjectList1.add(constructCustomObject(myObject, mapList.get(idx)));
            }
            if ("DELETE".equalsIgnoreCase(myObject.getType())) {
                customObjectList2.add(constructCustomObject(myObject, mapList.get(idx)));
            }
        });
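As a variation on the index-based idea, the stream can build each result list itself instead of adding to lists declared outside the lambda. The sketch below is only an illustration: MyObject is a hypothetical stand-in for the asker's class, and construct is a hypothetical replacement for constructCustomObject that simply joins the type and the map into a String.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class PairedIteration {

    // Hypothetical stand-in for the asker's MyObject.
    static final class MyObject {
        private final String type;
        MyObject(String type) { this.type = type; }
        String getType() { return type; }
    }

    // Hypothetical stand-in for constructCustomObject: joins the type and the map.
    static String construct(MyObject obj, Map<String, String> attrs) {
        return obj.getType() + attrs;
    }

    // Pairs the elements found at the same index of both lists and keeps
    // only the pairs whose object has the requested type (case-insensitive).
    static List<String> build(List<MyObject> objects, List<Map<String, String>> maps, String type) {
        return IntStream.range(0, objects.size())
                .filter(i -> type.equalsIgnoreCase(objects.get(i).getType()))
                .mapToObj(i -> construct(objects.get(i), maps.get(i)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MyObject> objects = Arrays.asList(
                new MyObject("NEW"), new MyObject("delete"), new MyObject("new"));
        List<Map<String, String>> maps = Arrays.asList(
                Collections.singletonMap("id", "1"),
                Collections.singletonMap("id", "2"),
                Collections.singletonMap("id", "3"));

        System.out.println(build(objects, maps, "NEW"));    // prints [NEW{id=1}, new{id=3}]
        System.out.println(build(objects, maps, "DELETE")); // prints [delete{id=2}]
    }
}
```

This walks the index range once per type, so it trades a second pass for not mutating shared lists; for two small result lists the original single loop is just as reasonable.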

Related

What is the most computationally efficient way to flatmap a List of Lists?

My server developed a bottleneck while flattening a List<List<SomeObject>> into a List<SomeObject>; CPU usage spiked well above normal levels.
DataStructure is:
Object:
List<SomeObject> childList;
I am trying to flatten a List<Object> into a List<SomeObject> in the most computationally efficient way.
If parentList = List<Object>:
I tried:
parentList.stream().flatMap(child -> child.getChildList().stream()).collect(Collectors.toList())
Also tried:
List<SomeObject> all = new ArrayList<>();
parentList.forEach(child -> all.addAll(child.getChildList()));
Any other suggestions? These seem similar in cost, and both are fairly expensive because of the copying under the hood.
This may be more efficient since it avoids creating a separate stream for each inner list, as flatMap does. mapMulti was introduced in Java 16. It takes each streamed element together with a consumer, and whatever is passed to the consumer is put on the resulting stream; in this case, each object of each list.
List<List<Object>> lists = new ArrayList<>(
List.of(List.of("1", "2", "3"),
List.of("4", "5", "6", "7"),
List.of("8", "9")));
List<Object> list = lists.stream().mapMulti(
(lst, consumer) -> lst.forEach(consumer))
.toList();
System.out.print(list);
prints
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Do we know more about which List implementation is used?
I would try to init the resulting list with the correct expected size.
This avoids unnecessary copying.
This assumes that the size of the lists can be retrieved fast.
int expectedSize = parentList.stream()
.mapToInt(entry -> entry.getChildList().size())
.sum();
List<SomeObject> result = new ArrayList<>(expectedSize);
for (var entry : parentList) {
result.addAll(entry.getChildList());
}
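For anyone who wants to try the presizing idea in isolation, here is a self-contained sketch without streams. Parent is a hypothetical class standing in for the question's parent type, and String stands in for SomeObject.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PresizedFlatten {

    // Hypothetical parent type standing in for the question's "Object" class.
    static final class Parent {
        private final List<String> childList;
        Parent(List<String> childList) { this.childList = childList; }
        List<String> getChildList() { return childList; }
    }

    // Sums the child sizes first so the result list is allocated once
    // and never has to grow while the children are appended.
    static List<String> flatten(List<Parent> parents) {
        int expectedSize = 0;
        for (Parent parent : parents) {
            expectedSize += parent.getChildList().size();
        }
        List<String> result = new ArrayList<>(expectedSize);
        for (Parent parent : parents) {
            result.addAll(parent.getChildList());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Parent> parents = Arrays.asList(
                new Parent(Arrays.asList("1", "2", "3")),
                new Parent(Arrays.asList("4", "5")),
                new Parent(Arrays.asList("6")));
        System.out.println(flatten(parents)); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

Whether the extra sizing pass pays off depends on the list implementation and sizes involved, so it is worth measuring on real data.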
In Java 8:
List<Object> listOne = new ArrayList<>();
List<Object> listTwo = new ArrayList<>();
List<Object> listThree = new ArrayList<>();
...
// Stream.of(...) accepts any number of lists to concatenate
List<Object> newList = Stream.of(listOne, listTwo, listThree).flatMap(Collection::stream).collect(Collectors.toList());
In Java 16+:
// Stream.concat takes two streams, not lists, so each list is streamed first
List<Object> newList = Stream.concat(Stream.concat(listOne.stream(), listTwo.stream()), listThree.stream()).toList();
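Streams resemble an ETL ("Extract, Transform and Load") process: data flows from a source through a pipeline of processing stages. Multiple threads of execution are used only when the stream is parallel (parallelStream() or .parallel()); a plain stream() runs sequentially.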
One way to make the flat mapping more computationally efficient is to use a for loop instead of the stream API or forEach method. The for loop would iterate over the parent list, and for each element, it would add the child list to the flat list. This avoids the overhead of creating streams and using the collect method. Additionally, using an ArrayList to store the flat list instead of a LinkedList can also improve performance since it has a more efficient implementation of the addAll method.
List<SomeObject> flatList = new ArrayList<>();
for (Object o : parentList) {
    flatList.addAll(o.getChildList());
}
Another way would be to use an explicit Iterator, the interface for traversing a collection. Note that the enhanced for loop above already uses an Iterator internally, so this form is equivalent rather than faster.
List<SomeObject> flatList = new ArrayList<>();
Iterator<Object> iterator = parentList.iterator();
while (iterator.hasNext()) {
    Object o = iterator.next();
    flatList.addAll(o.getChildList());
}
Note that java.util.List has no concat method. Appending one list to another is done with addAll, which copies the elements into the target list; Stream.concat can be used to join two streams instead.
List<SomeObject> flatList = new ArrayList<>();
for (Object o : parentList) {
    flatList.addAll(o.getChildList());
}
There are several resources that you can use for additional reading on this topic. Here are a few that I would recommend:
https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/List.html
https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/ArrayList.html
https://docs.oracle.com/en/java/javase/14/docs/api/java.base/java/util/Iterator.html
https://www.oreilly.com/library/view/java-performance-the/9781449358652/
https://www.tutorialspoint.com/java_data_structure_algorithms/index.htm

java 8 parallelStream().forEach Result data loss

There are two test cases which use parallelStream():
List<Integer> src = new ArrayList<>();
for (int i = 0; i < 20000; i++) {
src.add(i);
}
List<String> strings = new ArrayList<>();
src.parallelStream().filter(integer -> (integer % 2) == 0).forEach(integer -> strings.add(integer + ""));
System.out.println("=size=>" + strings.size());
=size=>9332
List<Integer> src = new ArrayList<>();
for (int i = 0; i < 20000; i++) {
src.add(i);
}
List<String> strings = new ArrayList<>();
src.parallelStream().forEach(integer -> strings.add(integer + ""));
System.out.println("=size=>" + strings.size());
=size=>17908
Why do I always lose data when using parallelStream?
What did I do wrong?
ArrayList isn't thread safe. You need to do
List<String> strings = Collections.synchronizedList(new ArrayList<>());
or
List<String> strings = new Vector<>();
to ensure all updates are synchronized, or switch to
List<String> strings = src.parallelStream()
.filter(integer -> (integer % 2) == 0)
.map(integer -> integer + "")
.collect(Collectors.toList());
and leave the list building to the Streams framework. Note that it's undefined whether the list returned by collect is modifiable, so if that is a requirement, you may need to modify your approach.
In terms of performance, Stream.collect is likely to be much faster than using Stream.forEach to add to a synchronized collection, since the Streams framework can handle collection of values in each thread separately without synchronization and combine the results at the end in a thread safe fashion.
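If a modifiable result is a requirement, one option is to name the list type explicitly with Collectors.toCollection. The following is a minimal sketch of that idea applied to the first test case; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class ParallelCollect {

    // Converts the even numbers of src to strings on a parallel stream.
    // toCollection(ArrayList::new) guarantees the result is a modifiable ArrayList.
    static List<String> evenStrings(List<Integer> src) {
        return src.parallelStream()
                .filter(integer -> integer % 2 == 0)
                .map(integer -> integer + "")
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> src = new ArrayList<>();
        for (int i = 0; i < 20000; i++) {
            src.add(i);
        }
        List<String> strings = evenStrings(src);
        System.out.println("=size=>" + strings.size()); // prints =size=>10000

        strings.add("extra"); // safe: the collector produced an ArrayList
        System.out.println("=size=>" + strings.size()); // prints =size=>10001
    }
}
```

Because the source list is ordered, the collected list keeps the encounter order even though the work runs in parallel.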
ArrayList isn't thread-safe. While one thread sees a list with 30 elements, another might still see 29 and overwrite the 30th position (losing one element).
Another issue might arise when the array backing the list needs to be resized. A new, larger array is created (about 1.5 times the old capacity in OpenJDK) and the elements of the original array are copied into it. Other threads may have added elements in the meantime that the resizing thread has not seen, or several threads may resize at once and only one of them will win.
When using multiple threads you need to either synchronize access to the list or use a thread-safe list (for example by wrapping it with Collections.synchronizedList or by using a CopyOnWriteArrayList, to mention two possible solutions). Even better would be to use the collect method on the stream to put everything into a list.
parallelStream with forEach is a dangerous combination if not used carefully.
Keep the points below in mind to avoid bugs:
If you have a pre-existing list to which you want to add more objects from a parallelStream loop, wrap it with Collections.synchronizedList before looping over the parallel stream, and add through the wrapper.
If you have to create a new list, you can use a Vector, initialized outside the loop.
or
If you have to create a new list, simply use parallelStream and collect the output at the end.
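The first point, wrapping a pre-existing list, could look roughly like this. The class and method names are invented for the example; only Collections.synchronizedList and the stream calls are standard API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public class SynchronizedAppend {

    // Appends the strings "0" .. (n-1) to an existing list from a parallel stream.
    // All writes go through the synchronized wrapper, never through 'existing' directly.
    static void appendInParallel(List<String> existing, int n) {
        List<String> safe = Collections.synchronizedList(existing);
        IntStream.range(0, n)
                .parallel()
                .forEach(i -> safe.add(String.valueOf(i)));
    }

    public static void main(String[] args) {
        List<String> existing = new ArrayList<>(Arrays.asList("a", "b"));
        appendInParallel(existing, 20000);
        System.out.println("=size=>" + existing.size()); // prints =size=>20002
    }
}
```

No element is lost, but the order in which the new elements are appended is not deterministic, which is one more reason to prefer collect when order matters.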
You lose the benefits of using stream (and parallel stream) when you try to do mutation. As a general rule, avoid mutation when using streams. Venkat Subramaniam explains why. Instead, use collectors. Also try to get a lot accomplished within the stream chain. For example:
System.out.println(
IntStream.range(0, 200000)
.filter(i -> i % 2 == 0)
.mapToObj(String::valueOf)
.collect(Collectors.toList()).size()
);
You can run it as a parallel stream by adding .parallel() to the chain.
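A short sketch of that change, with the sequential and parallel variants side by side; the helper method and its boolean flag are made up for the example.

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelToggle {

    // Counts the even numbers in [0, n) after converting them to strings,
    // optionally switching the pipeline to parallel execution.
    static int countEven(int n, boolean parallel) {
        IntStream numbers = IntStream.range(0, n);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers
                .filter(i -> i % 2 == 0)
                .mapToObj(String::valueOf)
                .collect(Collectors.toList())
                .size();
    }

    public static void main(String[] args) {
        System.out.println(countEven(200000, false)); // prints 100000
        System.out.println(countEven(200000, true));  // prints 100000
    }
}
```

Since nothing outside the pipeline is mutated, both variants give the same result.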

Removing values in an arraylist that DO NOT match a value

I am having some trouble with removing values that do not match a given value. At the moment I am copying over values to a new list and trying to clear the original list - but this is inefficient.
This is my code:
int size = list.size();
ArrayList<String> newList;
int count = 0;
newList = new ArrayList<>();
for (int i=0; i<list.size(); i++){
if(list.get(i).getForename().equals(forename)){
newList.add(i, list);
}
}
list.clear();
Is there a way where I can just remove an item in the arraylist if it does NOT match the name?
EDIT:
It works, but then I might need a copy: if I select another name from the dropdown, it will be referring to the old list.
Thanks
A first thought would be to iterate over the list and remove each item that does not match as soon as you find it. But removing from the list inside a for-each loop typically throws a ConcurrentModificationException, because the list is modified while it is being iterated.
Another way, still not efficient, would be to iterate over the list, keep track of the indexes to remove, and remove them after the iteration.
ArrayList<Integer> indexList = new ArrayList<Integer>();
for (int i = 0; i < list.size(); i++) {
    if (!list.get(i).getForename().equals(forename)) {
        indexList.add(i);
    }
}
// Remove from the back so that earlier indexes stay valid, and unbox to int
// so that remove(int index) is called rather than remove(Object).
for (int j = indexList.size() - 1; j >= 0; j--) {
    list.remove(indexList.get(j).intValue());
}
indexList.clear();
Please note that this is not really efficient either, but maybe you were looking for a way to delete from the same list.
A simple solution for removing every occurrence of one given value is
while (list.contains(value)) {
list.remove(list.indexOf(value));
}
Depending on what you want, you might want to use streams instead (this seems to be what you actually want, since you don't really seem to want to delete elements from your list):
newList = list.stream()
        .filter(e -> e.getForename().equals(forename))
        .collect(Collectors.toList());
or, to perform the action you might want to do:
list.stream()
        .filter(e -> e.getForename().equals(forename))
        .forEach(person -> doStuff(person));
Another way would be to use an iterator, which avoids conflicts with modifications during iteration:
ListIterator<YourObject> iterator = list.listIterator();
while (iterator.hasNext()) {
    if (!iterator.next().getForename().equals(forename)) {
        iterator.remove();
    }
}
EDIT: Since OP can't use lambdas and streams (because of the Java version), here is roughly what happens in the second stream (the forEach). I am not using the proper interfaces, since OP can't do so either. One difference from streams is that a parallel stream might also split the work across several threads, which can be faster on multi-core processors and big lists:
interface Consumer<T>{ //this is normally given by the JAVA 8 API (which has one more default method)
void accept(T t);
}
Consumer<YourObject> doIt = new Consumer<YourObject>(){ //This is what the lambda expression actually does
@Override
public void accept(YourObject e) {
doStuff(e);
}
};
for(YourObject element : list){ //since Java 1.5. Alternatively, use your old for loop with element=list.get(i);
if(!element.getForename().equals(forename)) //the filter, written the easy way
continue;
doIt.accept(element); //You could also use a method or expressions instead in this context.
//doStuff(element); //What actually the upper stream does.
}
You might want to look at the Oracle tutorial (this chapter) to get a feeling for when this design is appropriate: https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html (I have a strong feeling you might want to use it).
Assuming your List contains String objects, the following should be what you are looking for:
for (Iterator<String> it = list.iterator(); it.hasNext();) {
    String foreName = it.next();
    // remove every entry that does NOT match the requested forename
    if (foreName == null || !foreName.equals(forename)) {
        it.remove();
    }
}
Try:
for (int i=0; i<list.size();){
if(!list.get(i).getForename().equals(forename)){
list.remove(i);
}
else {
i++;
}
}
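The asker's EDIT mentions needing a copy so that the original list survives a second selection, and the answers above note that lambdas are not available. One way to combine both constraints is to copy first and then remove from the copy with an iterator. This is only a sketch: Person is a hypothetical stand-in for the element type, which the question does not show.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class KeepMatching {

    // Hypothetical element type; the question only shows that it has getForename().
    static final class Person {
        private final String forename;
        Person(String forename) { this.forename = forename; }
        String getForename() { return forename; }
    }

    // Returns a new list holding only the people with the given forename.
    // The input list is left untouched, so it can be filtered again later.
    static List<Person> keepMatching(List<Person> people, String forename) {
        List<Person> copy = new ArrayList<Person>(people);
        for (Iterator<Person> it = copy.iterator(); it.hasNext();) {
            if (!forename.equals(it.next().getForename())) {
                it.remove(); // safe: removal goes through the iterator
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ann"), new Person("Bob"), new Person("Ann"));
        System.out.println(keepMatching(people, "Ann").size()); // prints 2
        System.out.println(keepMatching(people, "Bob").size()); // prints 1
        System.out.println(people.size());                      // prints 3
    }
}
```

Only generics and the enhanced syntax from Java 5 are used here, so no Java 8 features are required.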

Adding elements in Non-synchronized ArrayList using java parallel stream

I want to run this code in parallel using a Java parallel stream and collect the results in two ArrayLists. The code given below works, except that the non-thread-safety of ArrayList may cause incorrect results, and I don't want to synchronize the ArrayList. Can someone please suggest a proper way of using a parallel stream in my case?
List<Integer> passedList= new ArrayList<>();
List<Integer> failedList= new ArrayList<>();
Integer[] input = {0,1,2,3,4,5,6,7,8,9};
List<Integer> myList = Arrays.asList(input);
myList.parallelStream().forEach(element -> {
if (isSuccess(element)) {//Some SOAP API call.
passedList.add(element);
} else {
failedList.add(element);
}
});
System.out.println(passedList);
System.out.println(failedList);
An appropriate solution would be to use Collectors.partitioningBy:
Integer[] input = {0,1,2,3,4,5,6,7,8,9};
List<Integer> myList = Arrays.asList(input);
Map<Boolean, List<Integer>> map = myList.parallelStream()
.collect(Collectors.partitioningBy(element -> isSuccess(element)));
List<Integer> passedList = map.get(true);
List<Integer> failedList = map.get(false);
This way you will have no thread-safety problems, as the task is decomposed in a map-reduce manner: the parts of the input are processed independently and joined afterwards. If your isSuccess method is slow, you are likely to see a performance boost here.
By the way, you can create a parallel stream from the original array using Arrays.stream(input).parallel(), without needing to create an intermediate myList.
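Putting both suggestions together, a runnable sketch might look like this. The isSuccess method here is a hypothetical placeholder for the SOAP call and simply treats even numbers as successes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {

    // Hypothetical placeholder for the SOAP API call: even numbers "succeed".
    static boolean isSuccess(Integer element) {
        return element % 2 == 0;
    }

    // Streams the array directly in parallel and splits it into
    // passed (key true) and failed (key false) elements.
    static Map<Boolean, List<Integer>> partition(Integer[] input) {
        return Arrays.stream(input)
                .parallel()
                .collect(Collectors.partitioningBy(PartitionDemo::isSuccess));
    }

    public static void main(String[] args) {
        Integer[] input = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        Map<Boolean, List<Integer>> map = partition(input);
        System.out.println(map.get(true));  // prints [0, 2, 4, 6, 8]
        System.out.println(map.get(false)); // prints [1, 3, 5, 7, 9]
    }
}
```

Both keys are always present in the resulting map, so map.get(false) returns an empty list rather than null when nothing fails.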

Less verbose way to remove objects from the same class in an array of multiple classes

Suppose I have an array with the following elements:
List<Object> objects = new ArrayList<>();
objects.add(1);
objects.add("one");
objects.add("two");
objects.add(new Object());
objects.add(2);
Is there a reduced way to remove certain objects of the same category?
For example, if I want to remove only the strings, I know I can do something like this:
for (Iterator<Object> it = objects.iterator(); it.hasNext();){
if(it.next() instanceof String) {
it.remove();
}
}
But is this the minimal way to do it? I guess I can do it with java-8 but I'm not too sure. Thanks!
In Java 8 you can use Collection.removeIf():
objects.removeIf(obj -> obj instanceof String);
It's still O(n), but it's a little more readable.
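A self-contained version of the same call, using the list from the question. The helper method name is invented for the example; passing the class as a parameter lets the same helper remove any category of object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveByType {

    // Removes, in place, every element that is an instance of the given class.
    static void removeInstancesOf(List<Object> objects, Class<?> type) {
        objects.removeIf(type::isInstance);
    }

    public static void main(String[] args) {
        List<Object> objects = new ArrayList<>(
                Arrays.<Object>asList(1, "one", "two", new Object(), 2));

        removeInstancesOf(objects, String.class);
        System.out.println(objects.size()); // prints 3

        removeInstancesOf(objects, Integer.class);
        System.out.println(objects.size()); // prints 1
    }
}
```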
