I was using a stream-based approach to map-reduce my List<Map<String,String>> into a List<CustomObject>. This is the stream code I used:
List<Map<String,String>> mailVariable = (List<Map<String, String>>) processVariables.get("MAIL_MAP");
1| List<CustomObject> detList = mailVariable
2| .stream()
3| .flatMap(getEntry)
4| .filter (isEmpty)
5| .reduce(new ArrayList<CustomObject>(),accumulateToCustomObject,combiner);
I analyzed my code with SonarLint and got the following error on lines 2 and 3:
Refactor this code so that stream pipeline is used. squid:S3958
I am in fact using a stream and returning the value from the terminal operation, as suggested here. Is there anything I'm doing wrong? Could anyone suggest the correct way to write this code?
// following are the functional interface impls used in the process
Function<Map<String,String>, Stream<Entry<String,String>>> getEntry = data -> data.entrySet().stream();
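// Keeps entries whose value is non-null and non-blank. The conditions must be joined with &&:
// with ||, a null value would fall through to isEmpty() and throw a NullPointerException.
Predicate<Entry<String, String>> isEmpty = data -> data.getValue() != null
&& !data.getValue().isEmpty()
&& !data.getValue().equals(" ");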
BinaryOperator<ArrayList<CustomObject>> combiner = (a, b) -> {
ArrayList<CustomObject> acc = b;
acc.addAll(a);
return acc;
};
BiFunction<ArrayList<CustomObject>,Entry<String,String>,ArrayList<CustomObject>> accumulateToCustomObject = (finalList, eachset) -> {
/* reduction process happens
building the CustomObject..
*/
return finalList;
};
Update: I found a workaround by splitting my map-reduce operation into a map operation followed by a collect, like so. That particular lint error no longer shows up.
List<AlertEventLogDetTO> detList = mailVariable
.stream()
.flatMap(getEntry)
.filter (isEmpty)
.map(mapToObj)
.filter(Objects::nonNull)
.collect(Collectors.toList());
Function<Entry<String,String>,AlertEventLogDetTO> mapToObj = eachSet -> {
String tagString = null;
String tagValue = eachSet.getValue();
try{
tagString = MapVariables.valueOf(eachSet.getKey()).getTag();
} catch(Exception e){
tagString = eachSet.getKey();
}
if(eventTags.contains(tagString)){
AlertEventLogDetTO entity = new AlertEventLogDetTO();
entity.setAeldAelId(alertEventLog.getAelId());
entity.setAelTag(tagString);
entity.setAelValue(tagValue);
return entity;
}
return null;
};
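For anyone who wants to try the map-then-collect shape without the original classes, here is a minimal self-contained sketch. TagValue and allowedTags are hypothetical stand-ins for AlertEventLogDetTO and eventTags, and the blank check is simplified; the only point is that collect performs the mutable reduction, so no hand-written accumulator or combiner is needed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

final class MailMapPipelineSketch {

    // Hypothetical stand-in for AlertEventLogDetTO: just a tag and its value.
    static final class TagValue {
        final String tag;
        final String value;

        TagValue(String tag, String value) {
            this.tag = tag;
            this.value = value;
        }

        @Override
        public String toString() {
            return tag + "=" + value;
        }
    }

    // Flattens all map entries, drops blank values and unknown tags,
    // and lets collect() build the result list.
    static List<TagValue> toTagValues(List<Map<String, String>> maps, Set<String> allowedTags) {
        return maps.stream()
                .flatMap(map -> map.entrySet().stream())
                .filter(entry -> entry.getValue() != null && !entry.getValue().trim().isEmpty())
                .map(entry -> allowedTags.contains(entry.getKey())
                        ? new TagValue(entry.getKey(), entry.getValue())
                        : null)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("TO", "a@example.com");
        first.put("CC", " ");              // blank value, filtered out
        Map<String, String> second = new LinkedHashMap<>();
        second.put("SUBJECT", "hello");
        second.put("IGNORED", "x");        // tag not in the allowed set, mapped to null

        Set<String> allowed = new HashSet<>(Arrays.asList("TO", "CC", "SUBJECT"));
        // prints [TO=a@example.com, SUBJECT=hello]
        System.out.println(toTagValues(Arrays.asList(first, second), allowed));
    }
}
```

A three-argument reduce expects an identity value that is never mutated, so accumulating into a shared ArrayList is better expressed with collect, which is designed for mutable containers.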
I am pretty new to Java 8 streams. I am trying to process a collection of objects with a stream, but I can't get the exact result I need.
Below is the snippet I came up with (it gives the wrong result). The expected end result is a List<String> of entries like "Names email#test.com".
recordObjects is a collection of objects.
choices = recordObjects.stream()
.filter(record -> record.getAttribute
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL) != null)
.filter(record -> !record.getAttributeAsString
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL).isEmpty())
.map(record -> record.getMultiValuedAttribute
(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class))
.flatMap(Collection::stream)
.map(email -> getFormattedEmailAddress(ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME, email))
.collect(Collectors.toList());
But below is the exact logic I want to implement using streams:
for (CallerObject record : recordObjects) {
List<String> emails = record.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class);
List<String> names = record.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME, String.class);
int N = emails.size();
for (int i = 0 ; i < N ; i++) {
if(!isNullOrEmpty(emails.get(i)))
{
choices.add(getFormattedEmailAddress(isNullOrEmpty(names.get(i)) ?
ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME : names.get(i) , emails.get(i)));
}
}
}
Since we don't know the getFormattedEmailAddress method, I used String.format instead to achieve the desired representation "Names email#test.com":
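// the mapper function: using String.format
// (assumes a single String per attribute, unlike the List<String> read in the question's loop)
Function<RecordObject, String> toEmailString = record -> {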
String email = record.getMultiValuedAttribute(OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class);
String name = record.getMultiValuedAttribute(OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME, String.class);
if (email != null) {
return String.format("%s %s", name, email);
} else {
return null;
}
};
choices = recordObjects.stream()
.map(toEmailString) // map to email-format or null
.filter(Objects::nonNull) // exclude null strings where no email was found
.collect(Collectors.toList());
I converted your loop-based code to Java 8 streams:
final Function<RecordedObject, List<String>> filteredEmail = ro -> {
final List<String> emails = ro.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_EMAIL, String.class);
final List<String> names = ro.getMultiValuedAttribute(
OneRecord.AT_RECORD_SUBMITTER_TABLE_NAME, String.class);
return IntStream.range(0, emails.size())
.filter(index -> !isNullOrEmpty(emails.get(index)))
// mapToObj, not map: IntStream.map can only produce ints
.mapToObj(index -> getFormattedEmailAddress(isNullOrEmpty(names.get(index)) ?
ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME : names.get(index) , emails.get(index)))
.collect(Collectors.toList());
};
recordObjects
.stream()
.map(filteredEmail)
.flatMap(Collection::stream)
.collect(Collectors.toList());
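To make the index-based pairing easy to try out, here is a small self-contained sketch of the same idea. The Contact holder, DEFAULT_NAME, and format are hypothetical stand-ins for the record type, ATTRI_AND_RECORD_CONTACT_DEFAULT_NAME, and getFormattedEmailAddress, whose real implementations aren't shown in the question. Like the original loop, it assumes the names list is at least as long as the emails list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class EmailPairingSketch {

    // Hypothetical default used when a name is missing.
    static final String DEFAULT_NAME = "Unknown";

    // Hypothetical record holding two parallel lists, as in the question's loop.
    static final class Contact {
        final List<String> names;
        final List<String> emails;

        Contact(List<String> names, List<String> emails) {
            this.names = names;
            this.emails = emails;
        }
    }

    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Stand-in for getFormattedEmailAddress.
    static String format(String name, String email) {
        return name + " " + email;
    }

    // For each record, walks the indices of its email list, skips blank emails,
    // and pairs each remaining email with the name at the same index.
    static List<String> choices(List<Contact> records) {
        return records.stream()
                .flatMap(r -> IntStream.range(0, r.emails.size())
                        .filter(i -> !isNullOrEmpty(r.emails.get(i)))
                        .mapToObj(i -> format(
                                isNullOrEmpty(r.names.get(i)) ? DEFAULT_NAME : r.names.get(i),
                                r.emails.get(i))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Contact contact = new Contact(
                Arrays.asList("Ann", null, "Bob"),
                Arrays.asList("ann@test.com", "x@test.com", ""));
        // prints [Ann ann@test.com, Unknown x@test.com]
        System.out.println(choices(Arrays.asList(contact)));
    }
}
```

Putting the IntStream directly inside flatMap avoids building an intermediate list per record.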
I have a use case in Java where I need to populate a field on each element of one list (say x) by matching its id against another list (say y) and taking the result from the matching element.
List<LightRecruiterScholarResponse> responses = eventScholarRepository.findScholarDetailsByEventId(eventId);
List<InterviewDto> interviewResults = interviewRepository.getInterviewResultByRoundIdAndScholarId();
for (LightRecruiterScholarResponse response : responses) {
String val = null;
for (InterviewDto dto : interviewResults) {
if (dto.getId().equals(response.getScholarId())) {
val = dto.getInterviewResult();
break;
}
}
response.setInterviewStatus(val);
}
You can use:
Map<String, String> map = interviewResults.stream()
.collect(Collectors.toMap(InterviewDto::getId, InterviewDto::getInterviewResult));
responses.forEach(response ->
response.setInterviewStatus(
map.getOrDefault(response.getScholarId(), null)));
The idea is to build a Map with the id as key and the interview result as value. Then, for each element in responses, you set the interview status by looking up its scholar id in the map, which replaces the if (dto.getId().equals(response.getScholarId())) check in the inner loop.
This can be done in a straightforward way:
List<LightRecruiterScholarResponse> responses =
eventScholarRepository.findScholarDetailsByEventId(eventId);
List<InterviewDto> interviewResults =
interviewRepository.getInterviewResultByRoundIdAndScholarId();
responses.forEach(response -> response.setInterviewStatus(
interviewResults.stream()
.filter(dto -> dto.getId().equals(response.getScholarId()))
.map(InterviewDto::getInterviewResult)
.findFirst().orElse(null)));
This is not very efficient, because you iterate over interviewResults for every response. To fix that, you can build a Map once and look values up in it:
List<LightRecruiterScholarResponse> responses =
eventScholarRepository.findScholarDetailsByEventId(eventId);
Map<String, String> interviewResults =
interviewRepository.getInterviewResultByRoundIdAndScholarId().stream()
.collect(Collectors.toMap(InterviewDto::getId,
InterviewDto::getInterviewResult));
responses.forEach(response ->
response.setInterviewStatus(interviewResults.get(response.getScholarId())));
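One detail worth checking before relying on the map-based version: the two-argument Collectors.toMap throws an IllegalStateException on duplicate keys, whereas the original nested loop simply takes the first match. A minimal sketch of a variant that keeps the first result per id follows; Interview and Response are hypothetical stand-ins for InterviewDto and LightRecruiterScholarResponse. Note that toMap still rejects null values, so this assumes every interview result is non-null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class InterviewLookupSketch {

    // Hypothetical stand-in for InterviewDto.
    static final class Interview {
        final String id;
        final String result;

        Interview(String id, String result) {
            this.id = id;
            this.result = result;
        }
    }

    // Hypothetical stand-in for LightRecruiterScholarResponse.
    static final class Response {
        final String scholarId;
        String status;

        Response(String scholarId) {
            this.scholarId = scholarId;
        }
    }

    // Builds the lookup map once, then fills each response with a single get().
    static void fillStatuses(List<Response> responses, List<Interview> interviews) {
        Map<String, String> resultById = interviews.stream()
                .collect(Collectors.toMap(
                        interview -> interview.id,
                        interview -> interview.result,
                        (first, second) -> first)); // on duplicate ids keep the first, like the loop's break
        // get() returns null for ids without an interview, matching the loop's initial val = null
        responses.forEach(response -> response.status = resultById.get(response.scholarId));
    }

    public static void main(String[] args) {
        List<Interview> interviews = Arrays.asList(
                new Interview("1", "PASS"),
                new Interview("2", "FAIL"),
                new Interview("1", "FAIL")); // duplicate id, ignored
        List<Response> responses = Arrays.asList(new Response("1"), new Response("3"));
        fillStatuses(responses, interviews);
        // prints PASS, then null
        responses.forEach(response -> System.out.println(response.status));
    }
}
```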
I have a list of products and want to populate another list with the converted products in the same order as the original. I used a parallel stream, and the result was not in the original order.
List<Product> products = productList.getProducts();
List<ProductModelDTOV2> productModelDTOV2s = new ArrayList<>();
products.parallelStream().forEach(p -> {
try {
ProductModelDTOV2 ProductModelDTOV2 = dtoFactoryV2.populate(p, summary);
productModelDTOV2s.add(ProductModelDTOV2);
} catch (GenericException e) {
log.debug(String.format("Unable to populate Product %s", p));
}
});
return productModelDTOV2s;
It seems like this part of the code can be unordered and be run in parallel:
ProductModelDTOV2 ProductModelDTOV2 = dtoFactoryV2.populate(p, summary);
But this part must be ordered:
productModelDTOV2s.add(ProductModelDTOV2);
What you can do is to separate those two things. Do the first part in a flatMap, and the second part in forEachOrdered:
products.parallelStream().flatMap(p -> { // this block will be done in parallel
try {
return Stream.of(dtoFactoryV2.populate(p, summary));
} catch (GenericException e) {
// don't expect this message to be printed in order
log.debug(String.format("Unable to populate Product %s", p));
return Stream.of();
}
})
.forEachOrdered(productModelDTOV2s::add); // this will be done in order, non-parallel
The correct way to do this would be to have the Stream create the list:
List<Product> products = productList.getProducts();
return products.parallelStream()
.map(p -> {
try {
return dtoFactoryV2.populate(p, summary);
} catch (GenericException e) {
log.debug("Unable to populate Product " + p);
return null;
}
})
.filter(Objects::nonNull)
.collect(Collectors.toList());
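If you want to convince yourself that collect keeps the encounter order even on a parallel stream, a small sketch like the following can help. The convert method is a hypothetical stand-in for dtoFactoryV2.populate: it turns an id into a string and pretends the conversion fails for multiples of 10.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class OrderedParallelCollectSketch {

    // Maps in parallel, drops failed conversions (null), and lets the stream build the list.
    static List<String> convert(List<Integer> ids) {
        return ids.parallelStream()
                .map(id -> id % 10 == 0 ? null : "dto-" + id) // null simulates a failed populate()
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = IntStream.rangeClosed(1, 1000).boxed().collect(Collectors.toList());

        List<String> parallel = convert(ids);
        List<String> sequential = ids.stream()
                .filter(id -> id % 10 != 0)
                .map(id -> "dto-" + id)
                .collect(Collectors.toList());

        // prints true: the parallel result has the same elements in the same order
        System.out.println(parallel.equals(sequential));
    }
}
```

The mapping itself may run on several threads in any order, but Collectors.toList() assembles the partial results according to the encounter order of the source list, and it does not touch a shared ArrayList from multiple threads the way the forEach version does.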
I have a string like this:
val input = "perm1|0,perm2|2,perm2|1"
Desired output type is
val output: Set<String, Set<Long>>
and desired output value is
{perm1 [], perm2 [1,2] }
Here I need an empty set if the value is 0. I am using groupByTo like this:
val output = input.split(",")
.map { it.split("|") }
.groupByTo(
mutableMapOf(),
keySelector = { it[0] },
valueTransform = { it[1].toLong() }
)
However the output structure is like this
MutableMap<String, MutableList<Long>>
and output is
{perm1 [0], perm2 [1,2] }
I am looking for the best way to get the desired output without resorting to an imperative style like this:
val output = mutableMapOf<String, Set<Long>>()
input.split(",").forEach {
val t = it.split("|")
if (t[1].contentEquals("0")) {
output[t[0]] = mutableSetOf()
}
if (output.containsKey(t[0]) && !t[1].contentEquals("0")) {
output[t[0]] = output[t[0]]!! + t[1].toLong()
}
if (!output.containsKey(t[0]) && !t[1].contentEquals("0")) {
output[t[0]] = mutableSetOf()
output[t[0]] = output[t[0]]!! + t[1].toLong()
}
}
You can simply use mapValues to convert the value type from List<Long> to Set<Long>:
var res : Map<String, Set<Long>> = input.split(",")
.map { it.split("|") }
.groupBy( {it[0]}, {it[1].toLong()} )
.mapValues { it.value.toSet() }
And if you want to replace a list containing only 0 with an empty set, you can do it with an if-expression:
var res : Map<String, Set<Long>> = input.split(",")
.map { it.split("|") }
.groupBy( {it[0]}, {it[1].toLong()} )
.mapValues { if(it.value == listOf<Long>(0)) setOf() else it.value.toSet() }
Note that you cannot have a Set of key-value pairs; the result will be of type Map. The code below gives a sorted set as each value.
val result = "perm1|0,perm2|2,perm2|1".split(",")
.map {
val split = it.split("|")
split[0] to split[1].toLong()
}.groupBy({ it.first }, { it.second })
.mapValues { it.value.toSortedSet() }
While the other answer(s) might be easier to grasp, they build intermediate lists and maps that are basically discarded right after the next operation. The following tries to avoid that by using splitToSequence (Sequences) and groupingBy (see Grouping bottom part):
val result: Map<String, Set<Long>> = input.splitToSequence(',')
.map { it.split('|', limit = 2) }
.groupingBy { it[0] }
.fold({ _, _ -> mutableSetOf<Long>() }) { _, accumulator, element ->
accumulator.also {
it.add(element[1].toLong())
}
}
You can of course also filter out the addition of 0 in the set with a simple condition in the fold-step:
// alternative fold skipping 0-values, but keeping keys
.fold({ _, _ -> mutableSetOf<Long>() }) { _, accumulator, element ->
accumulator.also {
val value = element[1].toLong()
if (value != 0L)
it.add(value)
}
}
Alternatively, aggregate might also be fine, but then the type of your result variable needs to change to Map<String, MutableSet<Long>>:
val result: Map<String, MutableSet<Long>> = // ...
.aggregate { _, accumulator, element, first ->
(if (first) mutableSetOf<Long>() else accumulator!!).also {
val value = element[1].toLong()
if (value != 0L)
it.add(value)
}
}
I'm trying to merge two streams, one of which should be stateful (like static data with infrequent updates):
SparkConf conf = new SparkConf().setAppName("Test Application").setMaster("local[*]");
JavaStreamingContext context = new JavaStreamingContext(conf, Durations.seconds(10));
context.checkpoint(".");
JavaDStream<String> dataStream = context.socketTextStream("localhost", 9998);
JavaDStream<String> refDataStream = context.socketTextStream("localhost", 9999);
JavaPairDStream<String, String> pairDataStream = dataStream.mapToPair(e -> {
String[] tmp = e.split(" ");
return new Tuple2<>(tmp[0], tmp[1]);
});
JavaPairDStream<String, String> pairRefDataStream = refDataStream.mapToPair(e -> {
String[] tmp = e.split(" ");
return new Tuple2<>(tmp[0], tmp[1]);
}).updateStateByKey((Function2<List<String>, Optional<String>, Optional<String>>) (strings, stringOptional) -> {
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return Optional.absent();
});
pairDataStream.join(pairRefDataStream).print();
context.start();
context.awaitTermination();
When I write 1 aaa into the first stream and immediately 1 111 into the second, everything works fine and I see the result of the merge. But when I write 1 bbb into the first stream a minute later, I see nothing.
Do I understand correctly what updateStateByKey() does, or am I wrong?
updateStateByKey does exactly what you ask it to. In particular, if the current window contains no data (strings.isEmpty()), you instruct it to forget the state (return Optional.absent();):
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return Optional.absent();
whereas what you probably want is to return the previous state:
if (!strings.isEmpty()) {
return Optional.of(strings.get(0));
}
return stringOptional;
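The difference between the two update functions can be seen without running Spark at all. The sketch below is only a plain-Java simulation under stated assumptions: it uses java.util.Optional instead of Spark's Optional class, and the run helper is a hypothetical loop that applies the update function to one key once per batch, passing an empty list when the batch has no new values for that key.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.function.BiFunction;

final class UpdateStateSketch {

    // Mirrors the question's function: an empty batch wipes the state.
    static Optional<String> forgetting(List<String> newValues, Optional<String> previous) {
        if (!newValues.isEmpty()) {
            return Optional.of(newValues.get(0));
        }
        return Optional.empty();
    }

    // Mirrors the suggested fix: an empty batch keeps the previous state.
    static Optional<String> keeping(List<String> newValues, Optional<String> previous) {
        if (!newValues.isEmpty()) {
            return Optional.of(newValues.get(0));
        }
        return previous;
    }

    // Hypothetical driver: applies the update function batch by batch for a single key.
    static Optional<String> run(
            BiFunction<List<String>, Optional<String>, Optional<String>> update,
            List<List<String>> batches) {
        Optional<String> state = Optional.empty();
        for (List<String> batch : batches) {
            state = update.apply(batch, state);
        }
        return state;
    }

    public static void main(String[] args) {
        // One batch carrying "111" for the key, followed by two batches without data for it.
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("111"),
                Collections.<String>emptyList(),
                Collections.<String>emptyList());

        System.out.println(run(UpdateStateSketch::forgetting, batches)); // prints Optional.empty
        System.out.println(run(UpdateStateSketch::keeping, batches));    // prints Optional[111]
    }
}
```

With the forgetting variant the reference value is gone by the time 1 bbb arrives a minute later, so the join has nothing to match against.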