I have an "Item" class that contains the following fields (in short): id (related to the primary key of the Item table on SQL Server), description, sequence (non-null integer), and link (a reference to the id of the parent object), can be null)
I would like to sort it in Java as follows:
Id  Sequence  Link  Description
1   1         null  Item A
99  ..1       1     Son of A, first of the sequence
57  ..2       1     Son of A, second of the sequence
66  ..3       1     Son of A, third of the sequence
2   2         null  Item B
3   3         null  Item C
...
(I put the dots for better visualization)
That is, I would like the children of a certain item to come directly below their parent, ordered by the "sequence" field.
I tried using a Comparator, but it doesn't work:
public class SequenceComparator implements Comparator<Item> {

    @Override
    public int compare(Item o1, Item o2) {
        String x1 = o1.getSequence().toString();
        String x2 = o2.getSequence().toString();
        int sComp = x1.compareTo(x2);
        if (sComp != 0) {
            return sComp;
        } else {
            x1 = o1.getLink().toString();
            x2 = o2.getLink() == null ? "" : o2.getLink().toString();
            return x1.compareTo(x2);
        }
    }
}
How can I do that?
New answer: I don’t think you want one comparator to control the complete sorting, because when sorting children you need the sequence of the parent, and you don’t have an easy or natural access to that from within the comparator.
Instead I suggest a sorting in a number of steps:
Put the items into groups by parent items. So one group will be the item with id 1 and all its children. Items with no children will be in a group on their own.
Sort each group so the parent comes first and then all the children in the right order.
Sort the groups by the parent’s sequence.
Concatenate the sorted groups into one list.
Like this, using both Java 8 streams and List.sort():
// group by parent id
Map<Integer, List<Item>> intermediate = input.stream()
.collect(Collectors.groupingBy(i -> i.getLink() == null ? Integer.valueOf(i.getId()) : i.getLink()));
// sort each inner list so that parent comes first and then children by sequence
for (List<Item> innerList : intermediate.values()) {
    innerList.sort((i1, i2) -> {
        if (i1.getLink() == null) { // i1 is parent
            return -1; // parent first
        }
        if (i2.getLink() == null) {
            return 1;
        }
        return i1.getSequence().compareTo(i2.getSequence());
    });
}
// sort lists by parent’s sequence, that is, sequence of first item
List<Item> result = intermediate.values().stream()
.sorted(Comparator.comparing(innerList -> innerList.get(0).getSequence()))
.flatMap(List::stream)
.collect(Collectors.toList());
The output is (leaving out the item description):
1 1 null
99 ..1 1
57 ..2 1
66 ..3 1
2 2 null
3 3 null
(This output was produced with a toString method that printed the dots when converting an item with a parent to a String.)
If you cannot use Java 8, I still believe the general idea of the steps mentioned above will work, only some of the steps will require a little more code.
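As a rough illustration of those steps without streams (only a sketch, assuming the same getters as in the question, with getSequence() returning Integer and getLink() returning the nullable parent id):
// group by parent id (or the item's own id if it has no parent)
Map<Integer, List<Item>> groups = new HashMap<Integer, List<Item>>();
for (Item item : input) {
    Integer key = item.getLink() == null ? Integer.valueOf(item.getId()) : item.getLink();
    List<Item> group = groups.get(key);
    if (group == null) {
        group = new ArrayList<Item>();
        groups.put(key, group);
    }
    group.add(item);
}

// sort each group: parent first, then children by sequence
Comparator<Item> innerOrder = new Comparator<Item>() {
    @Override
    public int compare(Item i1, Item i2) {
        if (i1.getLink() == null) {
            return -1; // i1 is the parent
        }
        if (i2.getLink() == null) {
            return 1;  // i2 is the parent
        }
        return i1.getSequence().compareTo(i2.getSequence());
    }
};
List<List<Item>> groupList = new ArrayList<List<Item>>(groups.values());
for (List<Item> group : groupList) {
    Collections.sort(group, innerOrder);
}

// sort the groups by the parent's sequence (the first element after the inner sort)
Collections.sort(groupList, new Comparator<List<Item>>() {
    @Override
    public int compare(List<Item> g1, List<Item> g2) {
        return g1.get(0).getSequence().compareTo(g2.get(0).getSequence());
    }
});

// concatenate the sorted groups into one list
List<Item> result = new ArrayList<Item>();
for (List<Item> group : groupList) {
    result.addAll(group);
}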
I deleted my previous answer since I had misunderstood the part about what getLink() returns and then decided that that answer wasn’t worth trying to salvage.
Edit:
I am actually ignoring this piece from the documentation of Collectors.groupingBy(): “There are no guarantees on the …, mutability, of the … List objects returned.” It still works with my Java 8. If immutability of the list should prevent sorting, the solution is to create a new ArrayList containing the same items.
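If you want to guarantee modifiable lists up front instead of copying afterwards, one variant of the grouping step (an option, not something the documentation requires) is to ask groupingBy explicitly for ArrayLists:
// same grouping as above, but with the downstream collector pinned to ArrayList
Map<Integer, List<Item>> intermediate = input.stream()
        .collect(Collectors.groupingBy(
                i -> i.getLink() == null ? Integer.valueOf(i.getId()) : i.getLink(),
                Collectors.toCollection(ArrayList::new)));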
With thanks to Stuart Marks for the inspiration, the comparator for sorting the inner lists need not be as clumsy as above. The sorting can be written in this condensed way:
innerList.sort(Comparator.comparing(itm -> itm.getLink() == null ? null : itm.getSequence(),
Comparator.nullsFirst(Integer::compare)));
Given that there are only two layers in the hierarchy, this boils down to a classic multi-level sort. There are two kinds of items, parents and children, distinguished by whether the link field is null. The trick is that the sorting at each level isn't on a particular field; instead, the value to sort on depends on what kind of item it is.
The first level of sorting should be on the parent value. The parent value of a parent item is its sequence, but the parent value of a child item is the sequence of the parent it's linked to. Child items are linked to parent items via their id, so the first thing we need to do is to build up a map from ids to sequence values of parent nodes:
Map<Integer, Integer> idSeqMap =
list.stream()
.filter(it -> it.getLink() == null)
.collect(Collectors.toMap(Item::getId, Item::getSequence));
(This assumes that ids are unique, which is reasonable as they're related to the table primary key.)
Now that we have this map, you can write a lambda expression that gets the appropriate parent value from the item. (This assumes that all non-null link values point to existing items.) This is as follows:
(Item it) -> it.getLink() == null ? it.getSequence() : idSeqMap.get(it.getLink())
The second level of sorting should be on the child value. The child value of a parent item is null, so nulls will need to be sorted before any non-null value. The child value of a child item is its sequence. A lambda expression for getting the child value is:
(Item it) -> it.getLink() == null ? null : it.getSequence()
Now, we can combine these using the Comparator helper functions introduced in Java 8. The result can be passed directly to the List.sort() method.
list.sort(Comparator.comparingInt((Item it) -> it.getLink() == null ? it.getSequence() : idSeqMap.get(it.getLink()))
.thenComparing((Item it) -> it.getLink() == null ? null : it.getSequence(),
Comparator.nullsFirst(Integer::compare))
.thenComparingInt(Item::getId));
The first level of sorting is pretty straightforward; just pass the first lambda expression (which extracts the parent value) to Comparator.comparingInt.
The second level of sorting is a bit tricky. I'm assuming the result of getLink() is a nullable Integer. First, we have to extract the child value using the second lambda expression. This results in a nullable value, so if we were to pass this to thenComparing we'd get a NullPointerException. Instead, thenComparing allows us to pass a secondary comparator. We'll use this to handle nulls. For this secondary comparator we pass
Comparator.nullsFirst(Integer::compare)
This compares Integer objects, with nulls sorted first, and non-nulls compared in turn using the Integer.compare method.
Finally, we compare id values as a last resort. This is optional if you're using this comparator only for sorting; duplicates will end up next to each other. But if you use this comparator for a TreeSet, you'll want to make sure that different items never compare equals. Presumably a database id value would be sufficient to differentiate all unique items.
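As a small illustration of that last point (just a sketch reusing the idSeqMap built above), the same comparator can back a TreeSet, where the id tie-break is what keeps two distinct items with equal parent and child values from being treated as the same element:
Comparator<Item> order = Comparator
        .comparingInt((Item it) -> it.getLink() == null ? it.getSequence() : idSeqMap.get(it.getLink()))
        .thenComparing((Item it) -> it.getLink() == null ? null : it.getSequence(),
                Comparator.nullsFirst(Integer::compare))
        .thenComparingInt(Item::getId);

// the set stays sorted as items are added: each parent first, then its children by sequence
Set<Item> ordered = new TreeSet<>(order);
ordered.addAll(list);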
Considering your data structure is a Tree (with null as the root node) with no cycles:
You have to walk up the tree for both o1 and o2 until you find a common ancestor. Once you do, take one step back along both branches to find their relative order (using Sequence).
Finding the common ancestor may be tricky to do, and I don't know if it is possible in linear time, but it is certainly possible in O(n log n) time (with n the length of the branches).
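A rough sketch of that walk for a tree of arbitrary depth, assuming the items are available in a Map<Integer, Item> keyed by id (a lookup structure introduced here for illustration) and that siblings have distinct sequence values:
class TreeComparator implements Comparator<Item> {
    private final Map<Integer, Item> byId; // id -> Item, for walking up via getLink()

    TreeComparator(Map<Integer, Item> byId) {
        this.byId = byId;
    }

    // path from the root down to the item, e.g. [grandparent, parent, item]
    private List<Item> pathFromRoot(Item item) {
        Deque<Item> path = new ArrayDeque<>();
        for (Item current = item; current != null;
                current = current.getLink() == null ? null : byId.get(current.getLink())) {
            path.addFirst(current);
        }
        return new ArrayList<>(path);
    }

    @Override
    public int compare(Item o1, Item o2) {
        List<Item> p1 = pathFromRoot(o1);
        List<Item> p2 = pathFromRoot(o2);
        int common = Math.min(p1.size(), p2.size());
        for (int i = 0; i < common; i++) {
            if (p1.get(i).getId() != p2.get(i).getId()) {
                // first point where the branches diverge: compare by sequence
                return p1.get(i).getSequence().compareTo(p2.get(i).getSequence());
            }
        }
        // one path is a prefix of the other: the ancestor sorts before its descendants
        return Integer.compare(p1.size(), p2.size());
    }
}
Each comparison walks both branches once, so its cost grows with the depth of the tree.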
Related
I searched the site and didn't find something similar. I'm a newbie to using Java streams, but I understand that they are a replacement for loop commands. However, I would like to know if there is a way to filter a CSV file using a stream, as shown below, so that only the repeated records are included in the result, grouped by the Center field.
Initial CSV file
Final result
In addition, the same pair cannot appear in the final result inversely, as shown in the table below:
This shouldn't happen
Is there a way to do it using stream and grouping at the same time, since theoretically, two loops would be needed to perform the task?
Thanks in advance.
You can do it in one pass as a stream with O(n) efficiency:
class PersonKey {
    // have a field for every column that is used to detect duplicates
    String center, name, mother, birthdate;

    public PersonKey(String line) {
        // implement: parse the CSV line and fill the fields
    }

    // implement equals and hashCode using all fields
}

List<String> lines; // the input
Set<PersonKey> seen = new HashSet<>();
List<String> duplicates = lines.stream()
        .filter(p -> !seen.add(new PersonKey(p)))
        .distinct()
        .collect(Collectors.toList());
The trick here is that a HashSet has constant time operations and its add() method returns false if the value being added is already in the set, true otherwise.
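If the result should also be grouped by the Center field, as the question asks, one way (only a sketch, assuming the PersonKey above with an accessible center field, and accepting a second pass over the lines) is to count the occurrences per key first and then group the repeated lines:
// count how often each key occurs
Map<PersonKey, Long> occurrences = lines.stream()
        .collect(Collectors.groupingBy(PersonKey::new, Collectors.counting()));

// keep only lines whose key occurs more than once, grouped by center
Map<String, List<String>> duplicatesByCenter = lines.stream()
        .filter(line -> occurrences.get(new PersonKey(line)) > 1)
        .collect(Collectors.groupingBy(line -> new PersonKey(line).center));
Unlike the seen-set filter above, this variant also keeps the first occurrence of each repeated record, which may be closer to the expected output shown in the question.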
What I understood from your examples is that you consider an entry a duplicate if all the attributes have the same value except the ID. You can use anyMatch for this:
list.stream().filter(x ->
list.stream().anyMatch(y -> isDuplicate(x, y))).collect(Collectors.toList())
So what does isDuplicate(x, y) do?
It returns a boolean. In this method you check whether the two entries have the same value in every field except the id:
private boolean isDuplicate(CsvEntry x, CsvEntry y) {
return !x.getId().equals(y.getId())
&& x.getName().equals(y.getName())
&& x.getMother().equals(y.getMother())
&& x.getBirth().equals(y.getBirth());
}
I've assumed you've taken all the entries as String; change the checks according to the type. This will give you the duplicate entries with their corresponding IDs.
I'm new to java stream API.
I have 2 lists, and when the internal object IDs of both match, I want to put some attributes into a Map.
Below is the implementation.
List<LookupMstEntity> examTypeDetails; //This list contains values init.
List<MarksMstEntity> marksDetailList; //This list contains values init.
//FYI above entities have lombok setter, getter, equals & hashcode.
Map<Long, Integer> marksDetailMap = new HashMap<>();
//the implementation below needs to be changed to use Java 8.
for (LookupMstEntity examType : examTypeDetails) {
    for (MarksMstEntity marks : marksDetailList) {
        if (examType.getLookupId() == marks.getExamTypeId())
            marksDetailMap.put(examType.getLookupId(), marks.getMarks());
    }
}
Creating a set of lookupIds, Set<Long> ids, helps you throw away duplicate values and get rid of unnecessary checks.
Then you can filter marksDetailList accordingly with examTypeId values:
filter(m -> ids.contains(m.getExamTypeId()))
HashSet contains() method has constant time complexity O(1).
Try this:
Set<Long> ids = examTypeDetails.stream().map(LookupMstEntity::getLookupId)
.collect(Collectors.toCollection(HashSet::new));
Map<Long, Integer> marksDetailMap = marksDetailList.stream().filter(m -> ids.contains(m.getExamTypeId()))
.collect(Collectors.toMap(MarksMstEntity::getExamTypeId, MarksMstEntity::getMarks));
Since you are looking for entries with equal IDs, it doesn't matter which ID you use. I suggest streaming marksDetailList first, since you need its getMarks(). The filter checks whether there is a matching ID; if so, the required key-value pairs are collected into the map.
Map<Long, Integer> marksDetailMap = marksDetailList.stream() // List<MarksMstEntity>
.filter(mark -> examTypeDetails.stream() // filtered those where ...
.map(LookupMstEntity::getLookupId) // ... the lookupId
.anyMatch(id -> id == mark.getExamTypeId())) // ... is present in the list
.collect(Collectors.toMap( // collected to Map ...
MarksMstEntity::getExamTypeId, // ... with ID as a key
MarksMstEntity::getMarks)); // ... and marks as a value
The .map(..).anyMatch(..) can be shrunk into one:
.anyMatch(exam -> exam.getLookupId() == mark.getExamTypeId())
As stated in the comments, for the sake of brevity I'd rather go for the for-each iteration you have already used.
An observation:
First, your resultant map indicates that there can be only one match per ID (otherwise you would have duplicate keys, and the value would need to be a List or some other way of merging duplicate keys, not an Integer). So when you find the first one and insert it into the map, break out of the inner loop.
for (LookupMstEntity examType : examTypeDetails) {
    for (MarksMstEntity marks : marksDetailList) {
        if (examType.getLookupId() == marks.getExamTypeId()) {
            marksDetailMap.put(examType.getLookupId(), marks.getMarks());
            // no need to keep on searching for this ID
            break;
        }
    }
}
Also, if your two classes were related by a parent class or a shared interface that had access to the id, and the two classes were considered equal based on that id, then you could do something similar to this (a rough sketch of such a base type follows after the loop below).
for (LookupMstEntity examType : examTypeDetails) {
    int index = marksDetailList.indexOf(examType);
    if (index >= 0) {
        marksDetailMap.put(examType.getLookupId(),
                marksDetailList.get(index).getMarks());
    }
}
Of course the burden of locating the index is still there but it is now under the hood and you are relieved of that responsibility.
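For completeness, such a shared base type could look roughly like the sketch below (the names are invented here; both entities would extend it and expose their respective id through it, and they would have to give up the Lombok-generated equals/hashCode for this to work):
// Hypothetical base class: equality is based on the id alone, so that
// marksDetailList.indexOf(examType) can locate a MarksMstEntity via a LookupMstEntity.
abstract class IdMatchedEntity {

    abstract Long getMatchId(); // LookupMstEntity would return lookupId, MarksMstEntity would return examTypeId

    @Override
    public boolean equals(Object other) {
        return other instanceof IdMatchedEntity
                && ((IdMatchedEntity) other).getMatchId().equals(getMatchId());
    }

    @Override
    public int hashCode() {
        return getMatchId().hashCode();
    }
}
Defining equality across two different classes like this is unusual and easy to get wrong, which is another reason the explicit loop or the Set-based filtering may be the safer choice.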
You can do it with O(N) time complexity using HashMap: first convert the two lists into a Map<Long, LookupMstEntity> and a Map<Long, MarksMstEntity> with the id as the key:
Map<Long, LookupMstEntity> examTypes = examTypeDetails.stream()
        .collect(Collectors.toMap(LookupMstEntity::getLookupId,
                Function.identity())); // make sure you don't have any duplicate LookupMstEntity objects with the same id

Map<Long, MarksMstEntity> marks = marksDetailList.stream()
        .collect(Collectors.toMap(MarksMstEntity::getExamTypeId,
                Function.identity())); // make sure there are no duplicates
Then stream the examTypes map and collect into a map the entries for which a MarksMstEntity with the same id exists in the marks map:
Map<Long, Integer> result = examTypes.entrySet()
        .stream()
        .map(entry -> new AbstractMap.SimpleEntry<Long, MarksMstEntity>(entry.getKey(), marks.get(entry.getKey())))
        .filter(entry -> entry.getValue() != null)
        .collect(Collectors.toMap(Map.Entry::getKey, entry -> entry.getValue().getMarks()));
I am new to Java 8. I have a list of custom objects of type A, where A is like below:
class A {
int id;
String name;
}
I would like to determine if all the objects in that list have same name. I can do it by iterating over the list and capturing previous and current value of names. In that context, I found How to count number of custom objects in list which have same value for one of its attribute. But is there any better way to do the same in java 8 using stream?
You can map from A to String, apply the distinct intermediate operation, use limit(2) to enable short-circuiting where possible, and then check whether the count is less than or equal to 1. If it is, all objects have the same name; if not, they do not all have the same name.
boolean result = myList.stream()
.map(A::getName)
.distinct()
.limit(2)
.count() <= 1;
With the example shown above, we leverage the limit(2) operation so that we stop as soon as we find two distinct object names.
One way is to get the name of the first element of the list and use allMatch to check every element against it.
String firstName = yourListOfAs.get(0).name;
boolean allSameName = yourListOfAs.stream().allMatch(x -> x.name.equals(firstName));
Another way is to calculate the count of distinct names using
boolean result = myList.stream().map(A::getName).distinct().count() == 1;
Of course you need to add a getter for the 'name' field.
One more option is to use partitioning. Partitioning is a special kind of grouping in which the resultant map contains at most two different groups, one for true and one for false.
With this, you can get the number of matching and non-matching elements:
String firstName = employees.get(0).name;
Map<Boolean, List<Employee>> partitioned = employees.stream()
        .collect(Collectors.partitioningBy(e -> e.name.equals(firstName)));
Java 9: using takeWhile. takeWhile will take values until the predicate returns false; this is similar to a break statement in a while loop.
String firstName = employees.get(0).name;
List<Employee> filterList = employees.stream()
        .takeWhile(e -> firstName.equals(e.name))
        .collect(Collectors.toList());
if (filterList.size() == employees.size()) {
    // all objects have the same name
}
Or use groupingBy then check entrySet size.
boolean b = list.stream()
.collect(Collectors.groupingBy(A::getName,
Collectors.toList())).entrySet().size() == 1;
I have a list of the following info
public class TheInfo {
private int id;
private String fieldOne;
private String fieldTwo;
private String fieldThree;
private String fieldFour;
//Standard Getters, Setters, Equals, Hashcode, ToString methods
}
The list is required to be processed in such a way that
Among duplicates, select the one with minimum ID, and remove others. In this particular case, entries are considered duplicate when their values of fieldOne and fieldTwo are equal.
Get concatenated value of fieldThree and fieldFour.
I want to process this list with Java 8 streams. Currently I don't know how to remove duplicates based on custom fields. I think I can't use distinct() because I can't change the equals/hashCode methods, as this logic is only for this specific case.
How can I achieve this?
Assuming you have
List<TheInfo> list;
you can use
List<TheInfo> result = new ArrayList<>(list.stream().collect(
        Collectors.groupingBy(info -> Arrays.asList(info.getFieldOne(), info.getFieldTwo()),
                Collectors.collectingAndThen(
                        Collectors.minBy(Comparator.comparingInt(TheInfo::getId)),
                        Optional::get))).values());
the groupingBy collector produces groups according to a function whose results determine the equality. A list already implements this for a sequence of values, so Arrays.asList(info.getFieldOne(), info.getFieldTwo()) produces a suitable key. In Java 9, you would most probably use List.of(info.getFieldOne(), info.getFieldTwo()) instead.
The second argument to groupingBy is another collector determining how to process the groups, Collectors.minBy(…) will fold them to the minimum element according to a comparator and Comparator.comparingInt(TheInfo::getId) is the right comparator for getting the element with the minimum id.
Unfortunately, the minBy collector produces an Optional that would be empty if there are no elements, but since we know that the groups can’t be empty (groups without elements wouldn’t be created in the first place), we can unconditionally call get on the optional to retrieve the actual value. This is what wrapping this collector in Collectors.collectingAndThen(…, Optional::get) does.
Now, the result of the grouping is a Map mapping from the keys created by the function to the TheInfo instance with the minimum id. Calling values() on the Map gives us a Collection<TheInfo>, and since you want a List, a final new ArrayList<>(collection) will produce it.
Thinking about it, this might be one of the cases, where the toMap collector is simpler to use, especially as the merging of the group elements doesn’t benefit from mutable reduction:
List<TheInfo> result = new ArrayList<>(list.stream().collect(
        Collectors.toMap(
                info -> Arrays.asList(info.getFieldOne(), info.getFieldTwo()),
                Function.identity(),
                BinaryOperator.minBy(Comparator.comparingInt(TheInfo::getId)))).values());
This uses the same function for determining the key and another function determining a single value, which is just an identity function and a reduction function that will be called, if a group has more than one element. This will again be a function returning the minimum according to the ID comparator.
Using streams, you can process it using just the collector, if you provide it with proper classifier:
private static <T> T min(T first, T second, Comparator<? super T> cmp) {
return cmp.compare(first, second) <= 0 ? first : second;
}
private static void process(Collection<TheInfo> data) {
    Comparator<TheInfo> cmp = Comparator.comparing(info -> info.id);
    Map<List<String>, TheInfo> byKey = data.stream()
            .collect(Collectors.toMap(
                    // the classifier uses a tuple; the closest thing in the JDK currently is a List
                    // (or some custom class), chosen here for brevity
                    info -> Arrays.asList(info.fieldOne, info.fieldTwo),
                    info -> info,              // or Function.identity()
                    (a, b) -> min(a, b, cmp)   // what to do with duplicates: take the min according to the Comparator
            ));
}
The above stream is collected into a Map<List<String>, TheInfo>, which contains the minimal element for each key, with a list of the two strings as the key. You can extract the map's values() and return them in a new collection, or do whatever else you need with them.
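For example, continuing the sketch above with the collected map held in the byKey variable:
// copy the surviving elements out of the map into a plain list
List<TheInfo> deduplicated = new ArrayList<>(byKey.values());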
I want to iterate nested lists using java8 streams, and extract some results of the lists on first match.
Unfortunately I also have to get values from the parent content if a child element matches the filter.
How could I do this?
java7
Result result = new Result();
//find the first match and populate the result object
for (FirstNode first : response.getFirstNodes()) {
    for (SndNode snd : first.getSndNodes()) {
        if (snd.isValid()) {
            result.setKey(first.getKey());
            result.setContent(snd.getContent());
            return;
        }
    }
}
java8
response.getFirstNodes().stream()
        .flatMap(first -> first.getSndNodes().stream())
        .filter(snd -> snd.isValid())
        .findFirst()
        .ifPresent(???); //cannot access snd.getContent() here
When you need both values and want to use flatMap (as required when you want to perform a short-circuit operation like findFirst), you have to map to an object holding both values
response.getFirstNodes().stream()
.flatMap(first->first.getSndNodes().stream()
.map(snd->new AbstractMap.SimpleImmutableEntry<>(first, snd)))
.filter(e->e.getValue().isValid())
.findFirst().ifPresent(e-> {
result.setKey(e.getKey().getKey());
result.setContent(e.getValue().getContent());
});
In order to use standard classes only, I use a Map.Entry as Pair type whereas a real Pair type might look more concise.
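If defining a tiny pair type is acceptable, the first variant could read like the sketch below (the class and field names are made up for illustration; it still runs on Java 8):
final class NodeMatch {
    final FirstNode parent;
    final SndNode child;

    NodeMatch(FirstNode parent, SndNode child) {
        this.parent = parent;
        this.child = child;
    }
}

response.getFirstNodes().stream()
        .flatMap(first -> first.getSndNodes().stream()
                .map(snd -> new NodeMatch(first, snd)))
        .filter(m -> m.child.isValid())
        .findFirst()
        .ifPresent(m -> {
            result.setKey(m.parent.getKey());
            result.setContent(m.child.getContent());
        });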
In this specific use case, you can move the filter operation to the inner stream
response.getFirstNodes().stream()
.flatMap(first->first.getSndNodes().stream()
.filter(snd->snd.isValid())
.map(snd->new AbstractMap.SimpleImmutableEntry<>(first, snd)))
.findFirst().ifPresent(e-> {
result.setKey(e.getKey().getKey());
result.setContent(e.getValue().getContent());
});
which has the neat effect that a Map.Entry instance is created only for the one matching item (well, it should be; the current implementation is not as lazy as it ought to be, but even then it will still create fewer objects than the first variant).
It should be like this:
Edit: Thanks Holger for pointing out that the code won't stop at the first valid FirstNode
response.getFirstNodes().stream()
        .filter(first -> first.getSndNodes().stream().anyMatch(SndNode::isValid))
        .findFirst()
        .ifPresent(first -> first.getSndNodes().stream()
                .filter(SndNode::isValid)
                .findFirst()
                .ifPresent(snd -> {
                    result.setKey(first.getKey());
                    result.setContent(snd.getContent());
                }));
A test can be found here