Count the same items in a row in Java 8 Stream API

I have a bean and a stream
public class TokenBag {
private String token;
private int count;
// Standard constructor and getters here
}
Stream<String> src = Stream.of("a", "a", "a", "b", "b", "a", "a");
and want to apply some intermediate operation to the stream that returns another stream of TokenBag objects. In this example there must be three: ("a", 3), ("b", 2) and ("a", 2).
Please treat this as a very simplistic example. In reality there will be much more complicated logic than just counting the same values in a row. Actually I am trying to design a simple parser that accepts a stream of tokens and returns a stream of objects.
Also please note that it must stay a stream (with no intermediate accumulation), and in this example it must really count the same values in a row (this differs from grouping).
I will appreciate your suggestions about the general approach to solving this task.

Map<String, Long> result = src.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(result);
For the stream Stream.of("a", "a", "a", "a", "b", "b", "b") this prints:
{a=4, b=3}
(Note that groupingBy counts equal values globally rather than runs, so it does not preserve the "in a row" semantics asked for in the question.)
You can then go ahead and iterate over the map and create TokenBag objects, as sketched below.
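For example, a minimal sketch of that last step, assuming TokenBag has a (String, int) constructor as described in the question:
List<TokenBag> bags = new ArrayList<>();
result.forEach((token, count) -> bags.add(new TokenBag(token, count.intValue())));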

Stream<String> src = Stream.of("a", "a", "a", "a", "b", "b", "b");
// collect to map
Map<String, Long> counted = src
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
// collect to list
List<TokenBag> tokenBags = counted.entrySet().stream().map(m -> new TokenBag(m.getKey(), m.getValue().intValue()))
.collect(Collectors.toList());

First group it into a Map and then map the entries to a TokenBag:
Map<String, Long> values = src.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
List<TokenBag> tokenBags = values.entrySet().stream().map(entry -> {
TokenBag tb = new TokenBag();
tb.setToken(entry.getKey());
tb.setCount(entry.getValue().intValue());
return tb;
}).collect(Collectors.toList());

You'd need to convert your stream to a Spliterator and then adapt this spliterator to a custom one that partially-reduces some elements according to your logic (in your example it would need to count equal elements until a different element appears). Then, you'd need to turn your spliterator back to a new stream.
Bear in mind that this can't be 100% lazy, as you'd need to eagerly consume some elements from the backing stream in order to create a new TokenBag element for the new stream.
Here's the code for the custom spliterator:
public class CountingSpliterator
extends Spliterators.AbstractSpliterator<TokenBag>
implements Consumer<String> {
private final Spliterator<String> source;
private String currentToken;
private String previousToken;
private int tokenCount = 0;
private boolean tokenHasChanged;
public CountingSpliterator(Spliterator<String> source) {
super(source.estimateSize(), source.characteristics());
this.source = source;
}
@Override
public boolean tryAdvance(Consumer<? super TokenBag> action) {
while (source.tryAdvance(this)) {
if (tokenHasChanged) {
action.accept(new TokenBag(previousToken, tokenCount));
tokenCount = 1;
return true;
}
}
if (tokenCount > 0) {
action.accept(new TokenBag(currentToken, tokenCount));
tokenCount = 0;
return true;
}
return false;
}
@Override
public void accept(String newToken) {
if (currentToken != null) {
previousToken = currentToken;
}
currentToken = newToken;
if (previousToken != null && !previousToken.equals(currentToken)) {
tokenHasChanged = true;
} else {
tokenCount++;
tokenHasChanged = false;
}
}
}
So this spliterator extends Spliterators.AbstractSpliterator and also implements Consumer. The code is quite complex, but the idea is that it adapts one or more tokens from the source spliterator into an instance of TokenBag.
For every accepted token from the source spliterator, the count for that token is incremented, until the token changes. At this point, a TokenBag instance is created with the token and the count and is immediately pushed to the Consumer<? super TokenBag> action parameter. Also, the counter is reset to 1. The logic in the accept method handles token changes, border cases, etc.
Here's how you should use this spliterator:
Stream<String> src = Stream.of("a", "a", "a", "b", "b", "a", "a");
Stream<TokenBag> stream = StreamSupport.stream(
new CountingSpliterator(src.spliterator()),
false); // false means sequential, we don't want parallel!
stream.forEach(System.out::println);
If you override toString() in TokenBag, the output is:
TokenBag{token='a', count=3}
TokenBag{token='b', count=2}
TokenBag{token='a', count=2}
A note on parallelism: I don't know how to parallelize this partial-reduce task; I don't even know if it's possible at all. But if it were, I doubt it would produce any measurable improvement.

Create a map and then collect the map into the list:
Stream<String> src = Stream.of("a", "a", "a", "a", "b", "b", "b");
Map<String, Long> m = src.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
m.entrySet().stream().map(e -> new TokenBag(e.getKey(), e.getValue().intValue())).collect(Collectors.toList());

Related

How to avoid multiple Streams with Java 8

I have the below code:
trainResponse.getIds().stream()
.filter(id -> id.getType().equalsIgnoreCase("Company"))
.findFirst()
.ifPresent(id -> {
domainResp.setId(id.getId());
});
trainResponse.getIds().stream()
.filter(id -> id.getType().equalsIgnoreCase("Private"))
.findFirst()
.ifPresent(id ->
domainResp.setPrivateId(id.getId())
);
Here I'm iterating/streaming the list of Id objects twice.
The only difference between the two streams is in the filter() operation.
How can I achieve this in a single iteration, and what is the best approach (in terms of time and space complexity) to do this?
You can achieve that with the Stream API in one pass through the given set of data, and without increasing memory consumption (i.e. the result will contain only the ids having the required attributes).
For that, you can create a custom Collector that expects as its parameters a Collection of attributes to look for and a Function responsible for extracting the attribute from a stream element.
Here's how this generic collector could be implemented:
/**
 * @param <T> - the type of stream elements
 * @param <F> - the type of the key (a field of the stream element)
 */
class CollectByKey<T, F> implements Collector<T, Map<F, T>, Map<F, T>> {
private final Set<F> keys;
private final Function<T, F> keyExtractor;
public CollectByKey(Collection<F> keys, Function<T, F> keyExtractor) {
this.keys = new HashSet<>(keys);
this.keyExtractor = keyExtractor;
}
@Override
public Supplier<Map<F, T>> supplier() {
return HashMap::new;
}
@Override
public BiConsumer<Map<F, T>, T> accumulator() {
return this::tryAdd;
}
private void tryAdd(Map<F, T> map, T item) {
F key = keyExtractor.apply(item);
if (keys.remove(key)) {
map.put(key, item);
}
}
@Override
public BinaryOperator<Map<F, T>> combiner() {
return this::tryCombine;
}
private Map<F, T> tryCombine(Map<F, T> left, Map<F, T> right) {
right.forEach(left::putIfAbsent);
return left;
}
@Override
public Function<Map<F, T>, Map<F, T>> finisher() {
return Function.identity();
}
@Override
public Set<Characteristics> characteristics() {
return Collections.emptySet();
}
}
A main() demo (the dummy Id class is not shown):
public class CustomCollectorByGivenAttributes {
public static void main(String[] args) {
List<Id> ids = List.of(new Id(1, "Company"), new Id(2, "Fizz"),
new Id(3, "Private"), new Id(4, "Buzz"));
Map<String, Id> idByType = ids.stream()
.collect(new CollectByKey<>(List.of("Company", "Private"), Id::getType));
idByType.forEach((k, v) -> {
if (k.equalsIgnoreCase("Company")) domainResp.setId(v);
if (k.equalsIgnoreCase("Private")) domainResp.setPrivateId(v);
});
System.out.println(idByType.keySet()); // printing keys - added for demo purposes
}
}
Output
[Company, Private]
Note: after the set of keys becomes empty (i.e. all resulting data has been fetched), further elements of the stream are ignored, but the remainder of the stream still has to be traversed.
IMO, the two streams solution is the most readable. And it may even be the most efficient solution using streams.
IMO, the best way to avoid multiple streams is to use a classical loop. For example:
// There may be bugs ...
boolean seenCompany = false;
boolean seenPrivate = false;
for (Id id: getIds()) {
if (!seenCompany && id.getType().equalsIgnoreCase("Company")) {
domainResp.setId(id.getId());
seenCompany = true;
} else if (!seenPrivate && id.getType().equalsIgnoreCase("Private")) {
domainResp.setPrivateId(id.getId());
seenPrivate = true;
}
if (seenCompany && seenPrivate) {
break;
}
}
It is unclear whether performing one iteration is more efficient than two. It will depend on the class returned by getIds() and the cost of iteration.
The complicated stuff with two flags is how you replicate the short-circuiting behavior of findFirst() in your two-stream solution. I don't know if it is possible to do that at all using one stream. If it is, it will involve some pretty cunning code.
But as you can see, your original solution with two streams is clearly easier to understand than the above.
The main point of using streams is to make your code simpler. It is not about efficiency. When you try to do complicated things to make the streams more efficient, you are probably defeating the (true) purpose of using streams in the first place.
For your list of ids, you could just use a map, then assign the values after retrieving them, if present (a sketch of that last step follows the loop below).
Map<String, Integer> seen = new HashMap<>();
for (Id id : ids) {
if (seen.size() == 2) {
break;
}
seen.computeIfAbsent(id.getType().toLowerCase(), v->id.getId());
}
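A hedged sketch of that "assign after retrieving" step, reusing domainResp from the question (note that the keys were lower-cased when stored above):
Integer companyId = seen.get("company");
if (companyId != null) domainResp.setId(companyId);
Integer privateId = seen.get("private");
if (privateId != null) domainResp.setPrivateId(privateId);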
If you want to test it, you can use the following:
record Id(String getType, int getId) {
@Override
public String toString() {
return String.format("[%s,%s]", getType, getId);
}
}
Random r = new Random();
List<Id> ids = r.ints(20, 1, 100)
.mapToObj(id -> new Id(
r.nextBoolean() ? "Company" : "Private", id))
.toList();
Edited to allow only certain types to be checked
If you have more than two types but only want to check on certain ones, you can do it as follows.
The process is the same except that you have a Set of allowed types.
You simply check that you are processing one of those types by using contains.
Map<String, Integer> seen = new HashMap<>();
Set<String> allowedTypes = Set.of("company", "private");
for (Id id : ids) {
String type = id.getType();
if (allowedTypes.contains(type.toLowerCase())) {
if (seen.size() == allowedTypes.size()) {
break;
}
seen.computeIfAbsent(type,
v -> id.getId());
}
}
Testing is similar except that additional types need to be included.
Create a list of some types that could be present and build the list of Ids as before.
Notice that the size of allowedTypes replaces the value 2, to permit more than two types to be checked before exiting the loop.
List<String> possibleTypes =
List.of("Company", "Type1", "Private", "Type2");
Random r = new Random();
List<Id> ids =
r.ints(30, 1, 100)
.mapToObj(id -> new Id(possibleTypes.get(
r.nextInt((possibleTypes.size()))),
id))
.toList();
You can group by type and check the resulting map.
I suppose the type of ids is IdType.
Map<String, List<IdType>> map = trainResponse.getIds()
.stream()
.collect(Collectors.groupingBy(
id -> id.getType().toLowerCase()));
Optional.ofNullable(map.get("company")).ifPresent(ids -> domainResp.setId(ids.get(0).getId()));
Optional.ofNullable(map.get("private")).ifPresent(ids -> domainResp.setPrivateId(ids.get(0).getId()));
I'd recommend a traditional for loop. In addition to being easily scalable, this prevents you from traversing the collection multiple times.
Your code looks like something that will be generalised in the future, hence my generic approach.
Here's some pseudo code, just for the sake of illustration:
Set<String> matches = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
for (Id id : trainResponse.getIds()) {
    if (!matches.add(id.getType())) {
        continue; // this type has already been handled
    }
    switch (id.getType().toLowerCase()) {
        case "company":
            domainResp.setId(id.getId());
            break;
        case "private":
            ...
    }
}
Something along these lines could work; it would go through the whole stream, though, and won't stop at the first occurrence.
But assuming a small stream and only one Id for each type, why not?
Map<String, Consumer<String>> setters = new HashMap<>();
setters.put("Company", domainResp::setId);
setters.put("Private", domainResp::setPrivateId);
trainResponse.getIds().forEach(id -> {
if (setters.containsKey(id.getType())) {
setters.get(id.getType()).accept(id.getId());
}
});
We can use Collectors.filtering, available from Java 9 onwards, to collect the values based on a condition.
For this scenario, I have changed the code like below:
final Map<String, String> results = trainResponse.getIds()
.stream()
.collect(Collectors.filtering(
id -> id.getType().equals("Company") || id.getType().equals("Private"),
Collectors.toMap(Id::getType, Id::getId, (first, second) -> first)));
And then get the ids from the results map, as sketched below.
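A hedged sketch of that retrieval step, mirroring the Optional-based style used in an earlier answer (the results map is keyed by type, so a missing type simply yields null):
Optional.ofNullable(results.get("Company")).ifPresent(domainResp::setId);
Optional.ofNullable(results.get("Private")).ifPresent(domainResp::setPrivateId);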

Split a flux into two fluxes - head and tail

I want to split a flux into two fluxes where the first one has the first item of the original flux and the second one takes the rest of the items.
After applying a custom transformation myLogic on each flux I want to combine them into one flux preserving the order of the original flux.
Example:
S: student
S': student after applying myLogic
Emitted flux: s1 -> s2 -> s3 -> s4
The first split flux: s1' => myLogic
The second split flux: s2' -> s3' -> s4' => myLogic
The combined flux: s1' -> s2' -> s3' -> s4'
It is enough to use the standard Flux methods take and skip to separate the head and tail elements. Calling cache before that is also useful to avoid duplicating the subscription to the source.
class Util {
static <T, V> Flux<V> dualTransform(
Flux<T> originalFlux,
int cutpointIndex,
Function<T, V> transformHead,
Function<T, V> transformTail
) {
var cached = originalFlux.cache();
var head = cached.take(cutpointIndex).map(transformHead);
var tail = cached.skip(cutpointIndex).map(transformTail);
return Flux.concat(head, tail);
}
static void test() {
var sample = Flux.just("a", "b", "c", "d");
var result = dualTransform(
sample,
1,
x -> "{" + x.toUpperCase() + "}",
x -> "(" + x + ")"
);
result.doOnNext(System.out::print).subscribe();
// prints: {A}(b)(c)(d)
}
}
There's a simpler solution to your problem. You don't need to split and merge the events from the publisher. You can make use of index(), which keeps track of the order in which events are published.
Flux<String> values = Flux.just("s1", "s2", "s3");
values.index((i, v) -> {
    if (i == 0) {
        return v.toUpperCase();
    } else {
        return v.toLowerCase();
    }
}).subscribe(System.out::println); // subscribe, otherwise nothing is emitted
Here's a hacky way to do this:
boolean[] seenFirst = new boolean[]{false}; // use an array, as you cannot reassign non-final local variables inside lambdas
originalFlux
    .flatMap(item -> {
        if (!seenFirst[0]) {
            seenFirst[0] = true;
            return runLogicForFirst(item);
        } else {
            return runLogicForRest(item);
        }
    })
Instead of creating two separate Flux objects and then merging them, you can just zip your original Flux with another Flux<Boolean> that's only ever true on the first element.
You can then do your processing conditionally as you please in a normal map() call without having to merge separate publishers later on:
Flux<String> values = Flux.just("A", "B", "C", "D", "E", "F", "G");
Flux.zip(Flux.concat(Flux.just(true), Flux.just(false).repeat()), values)
.map(x -> x.getT1() ? "_"+x.getT2().toUpperCase()+"_" : x.getT2().toLowerCase())
.subscribe(System.out::print); // prints "_A_bcdefg"

Converting array iteration to lambda function using Java8

I am trying to convert this iteration to a lambda function.
So far I have been able to convert the above code to a lambda function as shown below:
Stream.of(acceptedDetails, rejectedDetails)
.filter(list -> !isNull(list) && list.length > 0)
.forEach(new Consumer<Object>() {
public void accept(Object acceptedOrRejected) {
String id;
if(acceptedOrRejected instanceof EmployeeValidationAccepted) {
id = ((EmployeeValidationAccepted) acceptedOrRejected).getId();
} else {
id = ((EmployeeValidationRejected) acceptedOrRejected).getAd().getId();
}
if(acceptedOrRejected instanceof EmployeeValidationAccepted) {
dates1.add(new Integer(id.split("something")[1]));
Integer empId = Integer.valueOf(id.split("something")[2]);
empIds1.add(empId);
} else {
dates2.add(new Integer(id.split("something")[1]));
Integer empId = Integer.valueOf(id.split("something")[2]);
empIds2.add(empId);
}
}
});
But my goal was still to avoid repeating the same logic and also to convert this to a lambda function, and the converted lambda function still doesn't feel clean and efficient.
This is just for my learning; I am doing this exercise starting from an existing code snippet.
Can anyone please tell me how I can improve the converted lambda function?
Generally, when you try to refactor code, you should only focus on the necessary changes.
Just because you’re going to use the Stream API, there is no reason to clutter the code with checks for null or empty arrays which weren’t in the loop based code. Neither should you change BigInteger to Integer.
Then, you have two different inputs and want to get distinct results from each of them, in other words, you have two entirely different operations. While it is reasonable to consider sharing common code between them, once you identified identical code, there is no sense in trying to express two entirely different operations as a single operation.
First, let’s see how we would do this for a traditional loop:
static void addToLists(String id, List<Integer> empIdList, List<BigInteger> dateList) {
String[] array = id.split("-");
dateList.add(new BigInteger(array[1]));
empIdList.add(Integer.valueOf(array[2]));
}
List<Integer> empIdAccepted = new ArrayList<>();
List<BigInteger> dateAccepted = new ArrayList<>();
for(EmployeeValidationAccepted acceptedDetail : acceptedDetails) {
addToLists(acceptedDetail.getId(), empIdAccepted, dateAccepted);
}
List<Integer> empIdRejected = new ArrayList<>();
List<BigInteger> dateRejected = new ArrayList<>();
for(EmployeeValidationRejected rejectedDetail : rejectedDetails) {
addToLists(rejectedDetail.getAd().getId(), empIdRejected, dateRejected);
}
If we want to express the same as Stream operations, there’s the obstacle of having two results per operation. It truly took until JDK 12 to get a built-in solution:
static Collector<String,?,Map.Entry<List<Integer>,List<BigInteger>>> idAndDate() {
return Collectors.mapping(s -> s.split("-"),
Collectors.teeing(
Collectors.mapping(a -> Integer.valueOf(a[2]), Collectors.toList()),
Collectors.mapping(a -> new BigInteger(a[1]), Collectors.toList()),
Map::entry));
}
Map.Entry<List<Integer>, List<BigInteger>> e;
e = Arrays.stream(acceptedDetails)
.map(EmployeeValidationAccepted::getId)
.collect(idAndDate());
List<Integer> empIdAccepted = e.getKey();
List<BigInteger> dateAccepted = e.getValue();
e = Arrays.stream(rejectedDetails)
.map(r -> r.getAd().getId())
.collect(idAndDate());
List<Integer> empIdRejected = e.getKey();
List<BigInteger> dateRejected = e.getValue();
Since a method can’t return two values, this uses a Map.Entry to hold them.
To use this solution with Java versions before JDK 12, you can use the implementation posted at the end of this answer. You’d also have to replace Map::entry with AbstractMap.SimpleImmutableEntry::new then.
Or you use a custom collector written for this specific operation:
static Collector<String,?,Map.Entry<List<Integer>,List<BigInteger>>> idAndDate() {
return Collector.of(
() -> new AbstractMap.SimpleImmutableEntry<>(new ArrayList<>(), new ArrayList<>()),
(e,id) -> {
String[] array = id.split("-");
e.getValue().add(new BigInteger(array[1]));
e.getKey().add(Integer.valueOf(array[2]));
},
(e1, e2) -> {
e1.getKey().addAll(e2.getKey());
e1.getValue().addAll(e2.getValue());
return e1;
});
}
In other words, using the Stream API does not always make the code simpler.
As a final note, we don’t need to use the Stream API to utilize lambda expressions. We can also use them to move the loop into the common code.
static <T> void addToLists(T[] elements, Function<T,String> tToId,
List<Integer> empIdList, List<BigInteger> dateList) {
for(T t: elements) {
String[] array = tToId.apply(t).split("-");
dateList.add(new BigInteger(array[1]));
empIdList.add(Integer.valueOf(array[2]));
}
}
List<Integer> empIdAccepted = new ArrayList<>();
List<BigInteger> dateAccepted = new ArrayList<>();
addToLists(acceptedDetails, EmployeeValidationAccepted::getId, empIdAccepted, dateAccepted);
List<Integer> empIdRejected = new ArrayList<>();
List<BigInteger> dateRejected = new ArrayList<>();
addToLists(rejectedDetails, r -> r.getAd().getId(), empIdRejected, dateRejected);
A similar approach to the one @roookeee already posted, but maybe slightly more concise, would be to store the mappings using mapping functions declared as:
Function<String, Integer> extractEmployeeId = empId -> Integer.valueOf(empId.split("-")[2]);
Function<String, BigInteger> extractDate = empId -> new BigInteger(empId.split("-")[1]);
then proceed with mapping as:
Map<Integer, BigInteger> acceptedDetailMapping = Arrays.stream(acceptedDetails)
.collect(Collectors.toMap(a -> extractEmployeeId.apply(a.getId()),
a -> extractDate.apply(a.getId())));
Map<Integer, BigInteger> rejectedDetailMapping = Arrays.stream(rejectedDetails)
.collect(Collectors.toMap(a -> extractEmployeeId.apply(a.getAd().getId()),
a -> extractDate.apply(a.getAd().getId())));
Hereafter you can also access the date of acceptance or rejection corresponding to an employee's employeeId.
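For instance, a hypothetical lookup (the employee id 42 is made up for illustration):
// date of acceptance for employee 42, or null if that employee was not accepted
BigInteger acceptedDateFor42 = acceptedDetailMapping.get(42);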
How about this:
class EmployeeValidationResult {
//constructor + getters omitted for brevity
private final BigInteger date;
private final Integer employeeId;
}
List<EmployeeValidationResult> accepted = Stream.of(acceptedDetails)
.filter(Objects::nonNull)
.map(this::extractValidationResult)
.collect(Collectors.toList());
List<EmployeeValidationResult> rejected = Stream.of(rejectedDetails)
.filter(Objects::nonNull)
.map(this::extractValidationResult)
.collect(Collectors.toList());
EmployeeValidationResult extractValidationResult(EmployeeValidationAccepted accepted) {
return extractValidationResult(accepted.getId());
}
EmployeeValidationResult extractValidationResult(EmployeeValidationRejected rejected) {
return extractValidationResult(rejected.getAd().getId());
}
EmployeeValidationResult extractValidationResult(String id) {
    String[] empIdList = id.split("-");
    BigInteger date = extractDate(empIdList[1]);
    Integer empId = extractId(empIdList[2]);
    return new EmployeeValidationResult(date, empId);
}
Repeating the filter or map operations is good style and explicit about what is happening. Merging the two lists of objects into one and using instanceof clutters the implementation and makes it less readable / maintainable.

How to convert a for iteration with conditions to Java 8 stream

Currently, I have this method, which I want to convert to a Java 8 stream style (I have little practice with this API btw, that's the purpose of this little exercise):
private static Map<Integer, List<String>> splitByWords(List<String> list) {
    Map<Integer, List<String>> mapOfElements = new HashMap<>();
    for (int i = 0; i < list.size(); i++) {
        if (list.get(i).length() > 30 && list.get(i).contains("-")) {
            mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList()));
        } else if (list.get(i).length() > 30) {
            mapOfElements.put(i, Arrays.asList(new String[]{list.get(i)}));
        } else {
            mapOfElements.put(i, Arrays.asList(new String[]{list.get(i) + "|"}));
        }
    }
    return mapOfElements;
}
This is what I've got so far:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
Map<Integer, List<String>> mapOfElements = new HashMap<>();
IntStream.range(0, list.size())
.filter(i-> list.get(i).length() > 30 && list.get(i).contains("-"))
.boxed()
.map(i-> mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList())));
//Copy/paste the above code twice, just changing the filter() and map() functions?
In the "old-fashioned" way, I just need one for iteration to do everything I need regarding my conditions. Is there a way to achieve that using the Stream API or, if I want to stick to it, I have to repeat the above code just changing the filter() and map() conditions, therefore having three for iterations?
The current solution with the for-loop looks good. As you have to distinguish three cases only, there is no need to generalize the processing.
Should there be more cases to distinguish, then it could make sense to refactor the code. My approach would be to explicitly define the different conditions and their corresponding string processing. Let me explain it using the code from the question.
First of all I'm defining the different conditions using an enum.
public enum StringClassification {
CONTAINS_HYPHEN, LENGTH_GT_30, DEFAULT;
public static StringClassification classify(String s) {
if (s.length() > 30 && s.contains("-")) {
return StringClassification.CONTAINS_HYPHEN;
} else if (s.length() > 30) {
return StringClassification.LENGTH_GT_30;
} else {
return StringClassification.DEFAULT;
}
}
}
Using this enum I define the corresponding string processors:
private static final Map<StringClassification, Function<String, List<String>>> PROCESSORS;
static {
PROCESSORS = new EnumMap<>(StringClassification.class);
PROCESSORS.put(StringClassification.CONTAINS_HYPHEN, l -> Arrays.stream(l.split("-")).collect(Collectors.toList()));
PROCESSORS.put(StringClassification.LENGTH_GT_30, l -> Arrays.asList(new String[] { l }));
PROCESSORS.put(StringClassification.DEFAULT, l -> Arrays.asList(new String[] { l + "|" }));
}
Based on this I can do the whole processing using the requested IntStream:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
return IntStream.range(0, list.size()).boxed()
.collect(Collectors.toMap(Function.identity(), i -> PROCESSORS.get(StringClassification.classify(list.get(i))).apply(list.get(i))));
}
The approach is to retrieve for a string the appropriate StringClassification and then in turn the corresponding string processor. The string processors are implementing the strategy pattern by providing a Function<String, List<String>> which maps a String to a List<String> according to the StringClassification.
A quick example:
public static void main(String[] args) {
List<String> list = Arrays.asList("123",
"1-2",
"0987654321098765432109876543211",
"098765432109876543210987654321a-b-c");
System.out.println(splitByWords(list));
}
The output is:
{0=[123|], 1=[1-2|], 2=[0987654321098765432109876543211], 3=[098765432109876543210987654321a, b, c]}
This makes it easy to add or to remove conditions and string processors.
First off, I don't see any reason to use the type Map<Integer, List<String>> when the key is an index. Why not use List<List<String>> instead? If you don't use a filter, the elements will be at the same index as in the input.
The power of a more functional approach is that it is more readable what you're doing. Because you want to do multiple things for strings of multiple sizes, it's pretty hard to write a clean solution. You can, however, do it in a single loop:
private static List<List<String>> splitByWords(List<String> list)
{
return list.stream()
.map(
string -> string.length() > 30
? Arrays.asList(string.split("-"))
: Arrays.asList(string + "|")
)
.collect(Collectors.toList());
}
You can add more complex logic by making your lambda multiline (not needed in this case), e.g.:
.map(string -> {
// your complex logic
// don't forget, when using curly braces you'll
// need to return explicitly
return result;
})
The more functional approach would be to group the strings by size, followed by applying a specific handler to each of the groups. It's pretty hard to keep the index the same, so I changed the return value to Map<String, List<String>>, so the result can be fetched by providing the original string:
private static Map<String, List<String>> splitByWords(List<String> list)
{
Map<String, List<String>> result = new HashMap<>();
Map<Boolean, List<String>> greaterThan30;
// group elements
greaterThan30 = list.stream().collect(Collectors.groupingBy(
string -> string.length() > 30
));
// handle strings longer than 30 chars
result.putAll(
greaterThan30.get(true).stream().collect(Collectors.toMap(
Function.identity(), // the same as: string -> string
string -> Arrays.asList(string.split("-"))
))
);
// handle strings not longer than 30 chars
result.putAll(
greaterThan30.get(false).stream().collect(Collectors.toMap(
Function.identity(), // the same as: string -> string
string -> Arrays.asList(string + "|")
))
);
return result;
}
The above seems like a lot of hassle, but is in my opinion easier to understand. You could also dispatch the logic for handling large and small strings to other methods, knowing that the provided string always matches the criteria.
This is slower than the first solution. For a list of size n, it has to loop through n elements to group them by the criteria, then through the x (0 <= x <= n) elements that match the criteria, followed by the n - x elements that don't. (In total, twice over the whole list.)
In this case it might not be worth the trouble, since both the criteria and the logic to apply are pretty simple.

How to print a list by adding a new line after every 3rd element in a list in java lambda expression?

Suppose I have a list as below
Collection<?> mainList = new ArrayList<String>();
mainList=//some method call//
Currently, I am displaying the elements in the list as
System.out.println(mainList.stream().map(Object::toString).collect(Collectors.joining(",")).toString());
And I got the result as
a,b,c,d,e,f,g,h,i
How can I print this list with a new line added after every 3rd element, so that it prints the result as below?
a,b,c
d,e,f
g,h,i
Note: This is similar to How to Add newline after every 3rd element in arraylist in java?. But there the formatting is done while reading the file itself.
I want to do it while printing the output.
If you want to stick to the Java Stream API then your problem can be solved by partitioning the initial list into sublists of size 3 and then representing each sublist as a String and joining the results with \n.
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
final class PartitionListExample {
public static void main(String[] args) {
final Collection<String> mainList = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i");
final AtomicInteger idx = new AtomicInteger(0);
final int size = 3;
// Partition a list into list of lists size 3
final Collection<List<String>> rows = mainList.stream()
.collect(Collectors.groupingBy(
it -> idx.getAndIncrement() / size
))
.values();
// Write each row in the new line as a string
final String result = rows.stream()
.map(row -> String.join(",", row))
.collect(Collectors.joining("\n"));
System.out.println(result);
}
}
There are 3rd-party libraries that provide utility classes which make list partitioning easier (e.g. Guava or Apache Commons Collections; a hedged Guava sketch is shown at the end of this answer), but this solution is built on the Java 8 SDK only.
What it does is:
firstly, we collect all elements by grouping by an assigned row index and store the values as lists (e.g. {0=[a,b,c], 1=[d,e,f], 2=[g,h,i]})
then we take a list of all the values, like [[a,b,c],[d,e,f],[g,h,i]]
finally, we represent the list of lists as a String where each row is separated by \n
Output demo
Running the program prints the following output to the console:
a,b,c
d,e,f
g,h,i
Getting more from the example
Alnitak experimented further with this example and came up with a shorter solution, utilizing Collectors.joining(",") inside the .groupingBy collector and using String.join("\n", rows) at the end instead of triggering another stream reduction.
final Collection<String> rows = mainList.stream()
.collect(Collectors.groupingBy(
it -> idx.getAndIncrement() / size,
Collectors.joining(",")
))
.values();
// Write each row in the new line as a string
final String result = String.join("\n", rows);
System.out.println(result);
Final note
Keep in mind that this is not the most efficient way to print a list of elements in your desired format. But partitioning a list of any elements gives you flexibility when it comes to creating the final result, and it is pretty easy to read and understand.
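As a side note on the 3rd-party route mentioned above, here is a minimal sketch using Guava's Lists.partition, assuming Guava is on the classpath (the class and variable names are made up for the demo):
import com.google.common.collect.Lists;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class GuavaPartitionDemo {
    public static void main(String[] args) {
        List<String> mainList = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i");
        // Lists.partition returns consecutive sublists of the given size
        String result = Lists.partition(mainList, 3).stream()
                .map(row -> String.join(",", row))
                .collect(Collectors.joining("\n"));
        System.out.println(result); // prints a,b,c / d,e,f / g,h,i on separate lines
    }
}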
A side remark: in your actual code, map(Object::toString) could be removed if you replace
Collection<?> mainList = new ArrayList<String>(); with
Collection<String> mainList = new ArrayList<String>();.
If you manipulate Strings, create a Collection of String rather than a Collection of ?.
But there the formatting is done while reading the file itself. I want to do it while printing the output.
After getting the joined String, using replaceAll("(\\w*,\\w*,\\w*,)", "$1" + System.lineSeparator()) should do the job.
It searches for every series of three comma-separated word groups (including the trailing comma) and replaces it with the captured group ($1) concatenated with a line separator.
Besides this:
String collect = mainList.stream().collect(Collectors.joining(","));
could be simplified by :
String collect = String.join(",", mainList);
Sample code :
public static void main(String[] args) {
Collection<String> mainList = Arrays.asList("a","b","c","d","e","f","g","h","i", "j");
String formattedValues = String.join(",", mainList).replaceAll("(\\w*,\\w*,\\w*,)", "$1" + System.lineSeparator());
System.out.println(formattedValues);
}
Output :
a,b,c,
d,e,f,
g,h,i,
j
Another approach that hasn't been shown here yet is to create a custom Collector.
import java.util.*;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;
public class PartitionListInPlace {
static class MyCollector implements Collector<String, List<List<String>>, String> {
private final List<List<String>> buckets;
private final int bucketSize;
public MyCollector(int numberOfBuckets, int bucketSize) {
this.bucketSize = bucketSize;
this.buckets = new ArrayList<>(numberOfBuckets);
for (int i = 0; i < numberOfBuckets; i++) {
buckets.add(new ArrayList<>(bucketSize));
}
}
@Override
public Supplier<List<List<String>>> supplier() {
return () -> this.buckets;
}
@Override
public BiConsumer<List<List<String>>, String> accumulator() {
return (buckets, element) -> buckets
.stream()
.filter(x -> x.size() < bucketSize)
.findFirst()
.orElseGet(() -> {
ArrayList<String> nextBucket = new ArrayList<>(bucketSize);
buckets.add(nextBucket);
return nextBucket;
})
.add(element);
}
@Override
public BinaryOperator<List<List<String>>> combiner() {
return (b1, b2) -> {
throw new UnsupportedOperationException();
};
}
@Override
public Function<List<List<String>>, String> finisher() {
return buckets -> buckets.stream()
.map(x -> x.stream()
.collect(Collectors.joining(", ")))
.collect(Collectors.joining(System.lineSeparator()));
}
@Override
public Set<Characteristics> characteristics() {
return new HashSet<>();
}
}
public static void main(String[] args) {
Collection<String> mainList = Arrays.asList("a","b","c","d","e","f","g","h","i", "j");
String formattedValues = mainList
.stream()
.collect(new MyCollector(mainList.size() / 3, 3));
System.out.println(formattedValues);
}
}
Explanation
This is a mutable collector that should not be used in parallel. If your requirements demand that you process the stream in parallel, you will have to make this collector thread safe, which is pretty easy if you don't care about the order of the elements.
The combiner throws an exception because it is never called, since we run the stream sequentially.
The set of Characteristics has none that interest us; you can verify this by reading the javadoc.
The accumulator fetches the bucket in which we want to insert the element: the element is inserted into the next bucket that has space, otherwise we create a new bucket and add it there.
The finisher is quite simple: it joins the contents of each bucket with ", " and joins the buckets themselves with System.lineSeparator().
Remember
Do not use this collector to process a parallel stream.
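If you did need parallel support, here is a hedged sketch of a combiner that merges two partial results by re-accumulating the right side into the left (element order across chunks is not preserved, and the supplier above would also have to return fresh buckets instead of sharing this.buckets):
@Override
public BinaryOperator<List<List<String>>> combiner() {
    return (left, right) -> {
        // drain every element of the right-hand buckets into the left-hand ones,
        // reusing the same "first bucket with free space" rule as the accumulator
        right.stream().flatMap(List::stream).forEach(element ->
            left.stream()
                .filter(bucket -> bucket.size() < bucketSize)
                .findFirst()
                .orElseGet(() -> {
                    List<String> next = new ArrayList<>(bucketSize);
                    left.add(next);
                    return next;
                })
                .add(element));
        return left;
    };
}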
Output
a, b, c
d, e, f
g, h, i
j
