Adding non-duplicated elements to existing keys in Java 8 functional style

I have a map I want to populate:
private Map<String, Set<String>> myMap = new HashMap<>();
with this method:
private void compute(String key, String[] parts) {
myMap.computeIfAbsent(key, k -> getMessage(parts));
}
compute() is invoked as follows:
for (String line : messages) {
String[] parts = line.split("-");
validator.validate(parts); //validates parts are as expected
String key = parts[parts.length - 1];
compute(key, parts);
}
parts elements are like this:
[AB, CC, 123]
[AB, FF, 123]
[AB, 456]
In the compute() method, as you can see, I am trying to use the last element of the array as the key and the remaining elements as the values of the map I want to build.
My question: how do I add only the unique values to an existing key using Java 8 functional style, e.g.
{123=[AB, FF, CC]}

As you requested I added a lambda variant, which just adds the parts via lambda to the map in the compute-method:
private void compute(String key, String[] parts) {
myMap.computeIfAbsent(key,
s -> Stream.of(parts)
.limit(parts.length - 1)
.collect(toSet()));
}
But in this case you will only get something like 123=[AB, CC] in your map, because computeIfAbsent does nothing once the key is present. Use merge instead if you also want to add the values that arrive on subsequent calls. Note that merge takes the new value itself rather than a function that produces it:
private void compute(String key, String[] parts) {
myMap.merge(key,
Stream.of(parts)
.limit(parts.length - 1)
.collect(toSet()),
(currentSet, newSet) -> {currentSet.addAll(newSet); return currentSet;});
}
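To make the merge behaviour concrete, here is a minimal self-contained sketch. The class name MergeDemo and the build method are made up for illustration, and the validator from the question is left out. It uses Collectors.toCollection(HashSet::new) because Collectors.toSet() makes no guarantee that the returned set is mutable.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergeDemo {

    // Builds key -> set of leading parts; the last part of each line is the key.
    static Map<String, Set<String>> build(List<String> lines) {
        Map<String, Set<String>> result = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split("-");
            String key = parts[parts.length - 1];
            // An explicitly mutable set, so addAll in the merge function is safe.
            Set<String> values = Stream.of(parts)
                    .limit(parts.length - 1)
                    .collect(Collectors.toCollection(HashSet::new));
            // merge stores 'values' for a new key, otherwise combines it with the existing set.
            result.merge(key, values, (current, added) -> {
                current.addAll(added);
                return current;
            });
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> map = build(Arrays.asList("AB-CC-123", "AB-FF-123", "AB-456"));
        // Compare against expected sets instead of printing, since HashSet order is unspecified.
        System.out.println(map.get("123").equals(new HashSet<>(Arrays.asList("AB", "CC", "FF"))));
        System.out.println(map.get("456").equals(new HashSet<>(Arrays.asList("AB"))));
    }
}
```

Because the values are sets, a part that shows up in several lines for the same key (AB here) is stored only once.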
I am not sure what you intend with computeIfAbsent, but from what you listed as parts and what you expect as output, you may also want to try the following instead of the whole code you listed:
// the function to identify your key
Function<String[], String> keyFunction = strings -> strings[strings.length - 1];
// the function to identify your values
Function<String[], List<String>> valuesFunction = strings -> Arrays.asList(strings).subList(0, strings.length - 1);
// a collector to add all entries of a collection to a (sorted) TreeSet
Collector<List<String>, TreeSet<String>, TreeSet<String>> listTreeSetCollector = Collector.of(TreeSet::new, TreeSet::addAll, (left, right) -> {
left.addAll(right);
return left;
});
Map<String, TreeSet<String>> myMap = Arrays.stream(messages) // or: messages.stream()
.map(s -> s.split("-"))
.peek(validator::validate)
.collect(Collectors.groupingBy(keyFunction,
Collectors.mapping(valuesFunction, listTreeSetCollector)));
Using your samples as input you get the result you mentioned (well, actually sorted, as I used a TreeSet).
String[] messages = new String[]{
"AB-CC-123",
"AB-FF-123",
"AB-456"};
produces a map containing:
123=[AB, CC, FF]
456=[AB]
Last, but not least: if you can, pass the key and the values themselves to your method. Don't split the logic about identifying the key and identifying the values. That makes it really hard to understand your code later on or by someone else.

Try this:
private void compute(String[] parts) {
int lastIndex = parts.length - 1;
String key = parts[lastIndex];
List<String> values = Arrays.asList(parts).subList(0, lastIndex);
myMap.computeIfAbsent(key, k -> new HashSet<>()).addAll(values);
}
Or if you want, you can replace the entire loop with a stream:
Map<String, Set<String>> myMap = messages.stream() // if messages is an array, use Arrays.stream(messages)
.map(line -> line.split("-"))
.peek(validator::validate)
.collect(Collectors.toMap(
parts -> parts[parts.length - 1],
parts -> new HashSet<>(Arrays.asList(parts).subList(0, parts.length - 1)),
(a, b) -> { a.addAll(b); return a; }));

To add more parts to a possibly existing key you're using the wrong method; you want merge(), not computeIfAbsent().
If validator.validate() throws a checked exception, you can't call it directly in a stream lambda, so the simplest option is a for-each loop:
for (String message : messages) {
String[] parts = message.split("-");
validator.validate(parts);
LinkedList<String> list = new LinkedList<>(Arrays.asList(parts));
String key = list.getLast();
list.removeLast();
myMap.merge(key, new HashSet<>(list), (a, b) -> { a.addAll(b); return a; });
}
Using a LinkedList, which has methods getLast() and removeLast(), makes the code very readable.
Disclaimer: Code may not compile or work as it was thumbed in on my phone (but there's a reasonable chance it will work)
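If you would rather keep a stream even though validation throws a checked exception, one common workaround is a small adapter that rethrows the failure as an unchecked exception. The sketch below is only an illustration: the validate method is a hypothetical stand-in for the question's validator, and the class and method names are invented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class CheckedValidationDemo {

    // Hypothetical stand-in for validator.validate(); throws a checked exception.
    static void validate(String[] parts) throws Exception {
        if (parts.length < 2) {
            throw new Exception("expected at least one value followed by a key");
        }
    }

    // Adapter: runs the checked validation and rethrows any failure as unchecked.
    static String[] validated(String[] parts) {
        try {
            validate(parts);
            return parts;
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }

    static Map<String, Set<String>> build(List<String> messages) {
        return messages.stream()
                .map(line -> line.split("-"))
                .map(CheckedValidationDemo::validated)
                .collect(Collectors.toMap(
                        parts -> parts[parts.length - 1],
                        parts -> new HashSet<>(Arrays.asList(parts).subList(0, parts.length - 1)),
                        (a, b) -> { a.addAll(b); return a; }));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> map = build(Arrays.asList("AB-CC-123", "AB-FF-123", "AB-456"));
        System.out.println(map.get("123").equals(new HashSet<>(Arrays.asList("AB", "CC", "FF"))));
    }
}
```

The trade-off is that callers now see an IllegalArgumentException instead of the original checked type, so the plain loop above may still be the clearer choice.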

Related

Java-Stream - Split, group and map the data from a String using a single Stream

I have a string as below:
String data = "010$$fengtai,010$$chaoyang,010$$haidain,027$$wuchang,027$$hongshan,027$$caidan,021$$changnin,021$$xuhui,020$$tianhe";
And I want to convert it into a map of type Map<String,List<String>> (like shown below) by performing the following steps:
first split the string by , and then split by $$;
the substring before $$ would serve as a Key while grouping the data, and the substring after $$ needs to be placed into a list, which would be a Value of the Map.
Example of the resulting Map:
{
027=[wuchang, hongshan, caidan],
020=[tianhe],
010=[fengtai, chaoyang, haidain],
021=[changnin, xuhui]
}
I've used a traditional way of achieving this:
private Map<String, List<String>> parseParametersByIterate(String sensors) {
List<String[]> dataList = Arrays.stream(sensors.split(","))
.map(s -> s.split("\\$\\$"))
.collect(Collectors.toList());
Map<String, List<String>> resultMap = new HashMap<>();
for (String[] d : dataList) {
List<String> list = resultMap.get(d[0]);
if (list == null) {
list = new ArrayList<>();
list.add(d[1]);
resultMap.put(d[0], list);
} else {
list.add(d[1]);
}
}
return resultMap;
}
But it seems complicated and verbose. Thus, I want to implement this logic as a one-liner (i.e. a single stream statement).
What I have tried so far is below
Map<String, List<String>> result = Arrays.stream(data.split(","))
.collect(Collectors.groupingBy(s -> s.split("\\$\\$")[0]));
But the output doesn't match the one I want to have. How can I generate a Map structured as described above?
You simply need to map the values of the mapping. You can do that by specifying a second argument to Collectors.groupingBy:
Collectors.groupingBy(s -> s.split("\\$\\$")[0],
Collectors.mapping(s -> s.split("\\$\\$")[1],
Collectors.toList()
))
Instead of then splitting twice, you can split first and group afterwards:
Arrays.stream(data.split(","))
.map(s -> s.split("\\$\\$"))
.collect(Collectors.groupingBy(s -> s[0],
Collectors.mapping(s -> s[1],Collectors.toList())
));
Which now outputs:
{027=[wuchang, hongshan, caidan], 020=[tianhe], 021=[changnin, xuhui], 010=[fengtai, chaoyang, haidain]}
You can extract the required information from the string without allocating intermediate arrays, iterating over the string only once and invoking the regex engine only once, instead of making multiple String.split() calls (first by comma, then by $$). We can get all the needed data in one go.
Since you're already using regular expressions (because interpreting "\\$\\$" requires the regex engine), it would be wise to leverage them to their full power.
Matcher.results()
We can define the following Pattern that captures the pieces of data you're interested in:
public static final Pattern DATA = // use the proper name to describe a piece of information (like "027$$hongshan") that the pattern captures
Pattern.compile("(\\d+)\\$\\$(\\w+)");
Using this pattern, we can produce an instance of Matcher and apply the Java 9 method Matcher.results(), which produces a stream of MatchResults.
MatchResult is an object encapsulating information about the captured sequence of characters. We can access the groups using method MatchResult.group().
private static Map<String, List<String>> parseParametersByIterate(String sensors) {
return DATA.matcher(sensors).results() // Stream<MatchResult>
.collect(Collectors.groupingBy(
matchResult -> matchResult.group(1), // extracting "027" from "027$$hongshan"
Collectors.mapping(
matchResult -> matchResult.group(2), // extracting "hongshan" from "027$$hongshan"
Collectors.toList())
));
}
main()
public static void main(String[] args) {
String data = "010$$fengtai,010$$chaoyang,010$$haidain,027$$wuchang,027$$hongshan,027$$caidan,021$$changnin,021$$xuhui,020$$tianhe";
parseParametersByIterate(data)
.forEach((k, v) -> System.out.println(k + " -> " + v));
}
Output:
027 -> [wuchang, hongshan, caidan]
020 -> [tianhe]
021 -> [changnin, xuhui]
010 -> [fengtai, chaoyang, haidain]
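Matcher.results() is only available from Java 9 onward. On Java 8, roughly the same idea can be sketched with a Matcher.find() loop. The class and method names below are invented for illustration, and a LinkedHashMap is used so that keys keep the order in which they first appear.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindLoopDemo {

    // Same pattern as above: group 1 is the code, group 2 is the name.
    private static final Pattern DATA = Pattern.compile("(\\d+)\\$\\$(\\w+)");

    static Map<String, List<String>> parse(String sensors) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        Matcher matcher = DATA.matcher(sensors);
        // find() advances to the next match; the groups refer to that match.
        while (matcher.find()) {
            result.computeIfAbsent(matcher.group(1), k -> new ArrayList<>())
                  .add(matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("010$$fengtai,010$$chaoyang,027$$wuchang"));
        // prints {010=[fengtai, chaoyang], 027=[wuchang]}
    }
}
```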

Generate a Map from a list using Streams with Java 8

I have a list of String.
I want to store each string as key and the string's length as value in a Map (say HashMap).
I'm not able to achieve it.
List<String> ls = Arrays.asList("James", "Sam", "Scot", "Elich");
Map<String,Integer> map = new HashMap<>();
Function<String, Map<String, Integer>> fs = new Function<>() {
@Override
public Map<String, Integer> apply(String s) {
map.put(s,s.length());
return map;
}
};
Map<String, Integer> nmap = ls
.stream()
.map(fs)
.collect(Collectors.toMap()); //Lost here
System.out.println(nmap);
All strings are unique.
There's no need to wrap each and every string with its own map, as the function you've created does.
Instead, you need to provide proper arguments while calling Collectors.toMap() :
keyMapper - a function responsible for extracting a key from the stream element.
valueMapper - a function that generates a value from the stream element.
Since the stream element itself should be the key, we can use Function.identity(), which is more descriptive than the lambda str -> str but does precisely the same thing.
Map<String,Integer> lengthByStr = ls.stream()
.collect(Collectors.toMap(
Function.identity(), // extracting a key
String::length // extracting a value
));
If the source list might contain duplicates, you need to provide a third argument, mergeFunction, which is responsible for resolving duplicate keys.
Map<String,Integer> lengthByStr = ls.stream()
.collect(Collectors.toMap(
Function.identity(), // key
String::length, // value
(left, right) -> left // resolving duplicates
));
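To see why the merge function matters, the following sketch contrasts the two overloads on a list with a repeated element. The two-argument toMap throws an IllegalStateException on a duplicate key, while the three-argument version resolves it. Class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapDuplicatesDemo {

    // Three-argument toMap: on a duplicate key, keep the value seen first.
    static Map<String, Integer> lengths(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(
                        Function.identity(),
                        String::length,
                        (left, right) -> left));
    }

    // Two-argument toMap: reports whether collecting fails because of a duplicate key.
    static boolean failsWithoutMerge(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> withDuplicate = Arrays.asList("Sam", "Scot", "Sam");
        System.out.println(failsWithoutMerge(withDuplicate)); // prints true
        System.out.println(lengths(withDuplicate).size());    // prints 2
    }
}
```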
You said there would be no duplicate strings. But if one slips through, you can use distinct() (which internally uses a set) to ensure it doesn't cause issues.
a -> a is a lambda that returns its argument, so each stream element is used as its own key.
distinct() removes any duplicate strings.
Map<String, Integer> result = names.stream().distinct()
.collect(Collectors.toMap(a -> a, String::length));
If you want to get the length of a String, you can do it immediately as someString.length(). But suppose you want to get a map of all the Strings keyed by a particular length. You can do it using Collectors.groupingBy() which by default puts duplicates in a list. In this case, the duplicate would be the length of the String.
use the length of the string as a key.
the value will be a List<String> to hold all strings that match that length.
List<String> names = List.of("James", "Sam", "Scot",
"Elich", "lucy", "Jennifer","Bob", "Joe", "William");
Map<Integer, List<String>> lengthMap = names.stream()
.distinct()
.collect(Collectors.groupingBy(String::length));
lengthMap.entrySet().forEach(System.out::println);
prints
3=[Sam, Bob, Joe]
4=[Scot, lucy]
5=[James, Elich]
7=[William]
8=[Jennifer]
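The single-argument groupingBy makes no guarantee about the type or key order of the returned map, so sorted output like the above should not be relied on. If sorted keys matter, the three-argument overload accepts a map factory. A minimal sketch, with invented class and method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class SortedGroupingDemo {

    // Groups distinct names by length; TreeMap keeps the keys in ascending order.
    static Map<Integer, List<String>> byLength(List<String> names) {
        return names.stream()
                .distinct()
                .collect(Collectors.groupingBy(
                        String::length,
                        TreeMap::new,
                        Collectors.toList()));
    }

    public static void main(String[] args) {
        System.out.println(byLength(Arrays.asList("James", "Sam", "Scot", "Bob", "Sam")));
        // prints {3=[Sam, Bob], 4=[Scot], 5=[James]}
    }
}
```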

Split string into parts in java to put it in a map

I have been trying to get all the values from a string and put them in a map in the following manner:
So I have a string which is like this:
String cookies = "i=lol;haha=noice;df3=ddtb;";
So far I have been trying this out:
final Map<String, String> map = new HashMap<>();
map.put(cookies.split(";")[0].split("=")[0], cookies.split(";")[0].split("=")[1]);
But this way I can only put one value in, and it is quite long and ugly. Is there any way to do this with regex or a loop?
You could use a loop to iterate over the key value pairs and put them into the map:
String[] cookieArr = cookies.split(";");
for(String cookieString : cookieArr){
String[] pair = cookieString.split("=");
if(pair.length < 2){
continue;
}
map.put(pair[0], pair[1]);
}
The if is only there to prevent an ArrayIndexOutOfBoundsException if the cookie string is malformed.
An alternative would be to use a stream:
Arrays.stream(cookies.split(";")).forEach(cookieStr -> map.put(cookieStr.split("=")[0], cookieStr.split("=")[1]));
As mentioned by @WJS in the comments, you could use map.putIfAbsent(key, value) instead of map.put(key, value) to prevent values from being overwritten. But in the case of cookies, overwriting the old value with the new one may be the desired behavior.
You could do it like this. It presumes your format is consistent.
first, split the string into k/v pairs on ";"
then split each pair on "=" into key and value
and add them to the map.
if duplicate keys show up, the first one encountered takes precedence (if you want the latest value for a duplicate key, use (a, b) -> b as the merge lambda).
String cookies = "i=lol;haha=noice;df3=ddtb";
Map<String, String> map = Arrays.stream(cookies.split(";"))
.map(str -> str.split("=")).collect(Collectors
.toMap(a -> a[0], a->a[1], (a, b) -> a));
map.entrySet().forEach(System.out::println);
Prints
df3=ddtb
haha=noice
i=lol
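If the format cannot be trusted, the stream version can be made more defensive. The sketch below is one possible variant, not a full cookie parser: it skips pairs without "=", splits with a limit of 2 so that an "=" inside a value survives, lets the later value win for duplicate keys, and collects into a LinkedHashMap to keep insertion order. Class and method names are invented.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class CookieParseDemo {

    static Map<String, String> parse(String cookies) {
        return Arrays.stream(cookies.split(";"))
                .map(pair -> pair.split("=", 2))                        // limit 2: keep '=' inside values
                .filter(kv -> kv.length == 2 && !kv[0].trim().isEmpty()) // drop malformed pairs
                .collect(Collectors.toMap(
                        kv -> kv[0].trim(),
                        kv -> kv[1],
                        (first, second) -> second,                      // later value overwrites
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(parse("i=lol;haha=noice;bad;token=a=b;i=new"));
        // prints {i=new, haha=noice, token=a=b}
    }
}
```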

How to convert a for iteration with conditions to Java 8 stream

Currently, I have this method, which I want to convert to a Java 8 stream style (I have little practice with this API btw, that's the purpose of this little exercise):
private static Map<Integer, List<String>> splitByWords(List<String> list) {
for (int i = 0; i < list.size(); i++) {
if(list.get(i).length() > 30 && list.get(i).contains("-")) {
mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList()));
} else if(list.get(i).length() > 30) {
mapOfElements.put(i, Arrays.asList(new String[]{list.get(i)}));
} else {
mapOfElements.put(i, Arrays.asList(new String[]{list.get(i) + "|"}));
}
}
return mapOfElements;
}
This is what I've got so far:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
Map<Integer, List<String>> mapOfElements = new HashMap<>();
IntStream.range(0, list.size())
.filter(i-> list.get(i).length() > 30 && list.get(i).contains("-"))
.boxed()
.map(i-> mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList())));
//Copy/paste the above code twice, just changing the filter() and map() functions?
In the "old-fashioned" way, I just need one for iteration to do everything I need regarding my conditions. Is there a way to achieve that using the Stream API or, if I want to stick to it, I have to repeat the above code just changing the filter() and map() conditions, therefore having three for iterations?
The current solution with the for-loop looks good. As you have to distinguish three cases only, there is no need to generalize the processing.
Should there be more cases to distinguish, then it could make sense to refactor the code. My approach would be to explicitly define the different conditions and their corresponding string processing. Let me explain it using the code from the question.
First of all I'm defining the different conditions using an enum.
public enum StringClassification {
CONTAINS_HYPHEN, LENGTH_GT_30, DEFAULT;
public static StringClassification classify(String s) {
if (s.length() > 30 && s.contains("-")) {
return StringClassification.CONTAINS_HYPHEN;
} else if (s.length() > 30) {
return StringClassification.LENGTH_GT_30;
} else {
return StringClassification.DEFAULT;
}
}
}
Using this enum I define the corresponding string processors:
private static final Map<StringClassification, Function<String, List<String>>> PROCESSORS;
static {
PROCESSORS = new EnumMap<>(StringClassification.class);
PROCESSORS.put(StringClassification.CONTAINS_HYPHEN, l -> Arrays.stream(l.split("-")).collect(Collectors.toList()));
PROCESSORS.put(StringClassification.LENGTH_GT_30, l -> Arrays.asList(new String[] { l }));
PROCESSORS.put(StringClassification.DEFAULT, l -> Arrays.asList(new String[] { l + "|" }));
}
Based on this I can do the whole processing using the requested IntStream:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
return IntStream.range(0, list.size()).boxed()
.collect(Collectors.toMap(Function.identity(), i -> PROCESSORS.get(StringClassification.classify(list.get(i))).apply(list.get(i))));
}
The approach is to retrieve for a string the appropriate StringClassification and then in turn the corresponding string processor. The string processors are implementing the strategy pattern by providing a Function<String, List<String>> which maps a String to a List<String> according to the StringClassification.
A quick example:
public static void main(String[] args) {
List<String> list = Arrays.asList("123",
"1-2",
"0987654321098765432109876543211",
"098765432109876543210987654321a-b-c");
System.out.println(splitByWords(list));
}
The output is:
{0=[123|], 1=[1-2|], 2=[0987654321098765432109876543211], 3=[098765432109876543210987654321a, b, c]}
This makes it easy to add or to remove conditions and string processors.
First off, I don't see any reason to use the type Map<Integer, List<String>> when the key is an index. Why not use List<List<String>> instead? If you don't use a filter, the elements will be at the same index as in the input.
The power of a more functional approach is that it makes what you're doing more readable. Because you want to do different things for strings of different sizes, it's pretty hard to write a clean solution. You can, however, do it in a single pass:
private static List<List<String>> splitByWords(List<String> list)
{
return list.stream()
.map(
string -> string.length() > 30
? Arrays.asList(string.split("-"))
: Arrays.asList(string + "|")
)
.collect(Collectors.toList());
}
You can add more complex logic by making your lambda multiline (not needed in this case). eg.
.map(string -> {
// your complex logic
// don't forget, when using curly braces you'll
// need to return explicitly
return result;
})
The more functional approach would be to group the strings by size followed by applying a specific handler for the different groups. It's pretty hard to keep the index the same, so I change the return value to Map<String, List<String>> so the result can be fetched by providing the original string:
private static Map<String, List<String>> splitByWords(List<String> list)
{
Map<String, List<String>> result = new HashMap<>();
Map<Boolean, List<String>> greaterThan30;
// partition elements; unlike groupingBy, partitioningBy always provides both the true and false keys
greaterThan30 = list.stream().collect(Collectors.partitioningBy(
string -> string.length() > 30
));
// handle strings longer than 30 chars
result.putAll(
greaterThan30.get(true).stream().collect(Collectors.toMap(
Function.identity(), // the same as: string -> string
string -> Arrays.asList(string.split("-"))
))
);
// handle strings not longer than 30 chars
result.putAll(
greaterThan30.get(false).stream().collect(Collectors.toMap(
Function.identity(), // the same as: string -> string
string -> Arrays.asList(string + "|")
))
);
return result;
}
The above seems like a lot of hassle, but is in my opinion easier to understand. You could also delegate the handling of long and short strings to separate methods, knowing that each provided string always matches the criteria.
This is slower than the first solution. For a list of size n, it has to loop through n elements to group by the criteria. Then loop through x (0 <= x <= n) elements that match the criteria, followed by a loop through n - x elements that don't match the criteria. (In total 2 times the whole list.)
In this case it might not be worth the trouble since both the criteria, as well as the logic to apply are pretty simple.
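One detail to watch when splitting a list on a boolean condition: Collectors.groupingBy only creates entries for keys that actually occur, so get(true) or get(false) can return null, whereas Collectors.partitioningBy always supplies both keys. A small sketch of the difference, with invented class and method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {

    // partitioningBy: the result always has entries for both true and false.
    static Map<Boolean, List<String>> partition(List<String> list) {
        return list.stream()
                .collect(Collectors.partitioningBy(s -> s.length() > 30));
    }

    // groupingBy: a key is only present if at least one element maps to it.
    static Map<Boolean, List<String>> group(List<String> list) {
        return list.stream()
                .collect(Collectors.groupingBy(s -> s.length() > 30));
    }

    public static void main(String[] args) {
        List<String> onlyShort = Arrays.asList("123", "1-2");
        System.out.println(partition(onlyShort).get(true)); // prints []
        System.out.println(group(onlyShort).get(true));     // prints null
    }
}
```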

Create all possible combinations of elements

I need to create all possible combinations of some kind of Key that is composed of X (in my case, 8) equally important elements. So I came up with code like this:
final LinkedList<Key> keys = new LinkedList<>();
firstElementCreator.getApplicableElements() // All creators return a Set of elements
.forEach( first -> secondElementCreator.getApplicableElements()
.forEach( second -> thirdElementCreator.getApplicableElements()
// ... more creators
.forEach( X -> keys.add( new Key( first, second, third, ..., X ) ) ) ) ) ) ) ) );
return keys;
and it's working, but there are X nested forEach calls and I have a feeling that I'm missing an easier/better/more elegant solution. Any suggestions?
Thanks in advance!
This looks like a Cartesian product. Many libraries provide an API for it, for example Sets and Lists in Guava:
List<ApplicableElements> elementsList = Lists.newArrayList(firstElementCreator, secondElementCreator...).stream()
.map(c -> c.getApplicableElements()).collect(toList());
List<Key> keys = Lists.cartesianProduct(elementsList).stream()
.map(l -> new Key(l.get(0), l.get(1), l.get(2), l.get(3), l.get(4), l.get(5), l.get(6), l.get(7))).collect(toList());
Since the number of input sets is fixed (it has to match the number of arguments in the Key constructor), your solution is actually not bad.
It's more efficient and easier to read without the lambdas, though, like:
for (Element first : firstElementCreator.getApplicableElements()) {
for (Element second : secondElementCreator.getApplicableElements()) {
for (Element third : thirdElementCreator.getApplicableElements()) {
keys.add(new Key(first, second, third));
}
}
}
The canonical solution is to use flatMap. However, the tricky part is to create the Key object from the multiple input levels.
The straightforward approach is to do the evaluation in the innermost function, where every value is in scope:
final List<Key> keys = firstElementCreator.getApplicableElements().stream()
.flatMap(first -> secondElementCreator.getApplicableElements().stream()
.flatMap(second -> thirdElementCreator.getApplicableElements().stream()
// ... more creators
.map( X -> new Key( first, second, third, ..., X ) ) ) )
.collect(Collectors.toList());
However, this soon becomes impractical with deep nesting.
A solution without deep nesting requires elements to hold intermediate compound values. E.g. if we define Key as
class Key {
String[] data;
Key(String... arg) {
data=arg;
}
public Key add(String next) {
int pos = data.length;
String[] newData=Arrays.copyOf(data, pos+1);
newData[pos]=next;
return new Key(newData);
}
@Override
public String toString() {
return "Key("+Arrays.toString(data)+')';
}
}
(assuming String as element type), we can use
final List<Key> keys =
firstElementCreator.getApplicableElements().stream().map(Key::new)
.flatMap(e -> secondElementCreator.getApplicableElements().stream().map(e::add))
.flatMap(e -> thirdElementCreator.getApplicableElements().stream().map(e::add))
// ... more creators
.collect(Collectors.toList());
Note that these flatMap steps are now on the same level, i.e. not nested anymore. Also, all these steps are identical, only differing in the actual creator, which leads to the general solution supporting an arbitrary number of Creator instances.
List<Key> keys = Stream.of(firstElementCreator, secondElementCreator, thirdElementCreator
/* , and, some, more, if you like */)
.map(creator -> (Function<Key,Stream<Key>>)
key -> creator.getApplicableElements().stream().map(key::add))
.reduce(Stream::of, (f1,f2) -> key -> f1.apply(key).flatMap(f2))
.apply(new Key())
.collect(Collectors.toList());
Here, every creator is mapping to the identical stream-producing function of the previous solution, then all are reduced to a single function combining each function with a flatMap step to the next one, and finally the resulting function is executed to get a stream, which is then collected to a List.
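For experimenting without the Key and creator classes, the same idea can be sketched over plain lists of strings. This variant is a simplification and not the reduce-based code above: it builds the flatMap chain in a loop, starting from a single empty combination and extending every partial combination with each element of the next source. Class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CartesianDemo {

    static List<List<String>> product(List<List<String>> sources) {
        // Start with one empty combination.
        Stream<List<String>> combos = Stream.of(Collections.<String>emptyList());
        for (List<String> source : sources) {
            // Add one flatMap stage per source; stages are chained, not nested.
            combos = combos.flatMap(prefix -> source.stream().map(element -> {
                List<String> next = new ArrayList<>(prefix);
                next.add(element);
                return next;
            }));
        }
        return combos.collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(product(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("x", "y"))));
        // prints [[a, x], [a, y], [b, x], [b, y]]
    }
}
```

Each stage copies the prefix into a new list, which mirrors what Key.add does with its array.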
