I need to find the first free index in my file system, using a stream of names as the source.
Consider the list: ["New2", "New4", "New0", "New1", ...]
The first unused index among those is 3.
int index = 0;
try (IntStream indexes = names.stream()
.filter(name -> name.startsWith("New"))
.mapToInt(Integer::parseInt)
.distinct()
.sorted())
{
// I was thinking about building a stream of possible indexes, removing the existing ones from the try-with-resources block, and getting .min().
IntStream.rangeClosed(0, 10)... // Idk what to do.
}
I am asking for help finding the right syntax for my idea, or for a better solution.
The most efficient way is to collect into a BitSet:
int first = names.stream()
.filter(name -> name.startsWith("New"))
.mapToInt(s -> Integer.parseInt(s.substring(3)))
.collect(BitSet::new, BitSet::set, BitSet::or).nextClearBit(0);
Note that the bits are intrinsically sorted and distinct. Also, there will always be a “free” index. If there is no gap between 0 and the maximum number, the next free will be maximum+1, if there are no matching elements at all, the next free will be zero.
Starting with Java 9, we can do this even more efficiently with
int first = names.stream()
.filter(name -> name.startsWith("New"))
.mapToInt(s -> Integer.parseInt(s, 3, s.length(), 10))
.collect(BitSet::new, BitSet::set, BitSet::or).nextClearBit(0);
which parses the relevant part of the string directly, saving the substring operation.
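To check the three cases described above (a gap, no gap, no matching names), here is a minimal self-contained sketch. The class and method names (FirstFreeIndex, firstFree) are made up for illustration, and it assumes every name starting with "New" is followed by a non-negative integer; a bare "New" would throw a NumberFormatException.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

class FirstFreeIndex {
    // Hypothetical helper: smallest n >= 0 such that "New" + n is not in names.
    // Assumes each matching name has the form "New<non-negative integer>".
    static int firstFree(List<String> names) {
        return names.stream()
                .filter(name -> name.startsWith("New"))
                .mapToInt(name -> Integer.parseInt(name.substring(3)))
                .collect(BitSet::new, BitSet::set, BitSet::or)
                .nextClearBit(0);
    }

    public static void main(String[] args) {
        // Gap at 3
        System.out.println(firstFree(Arrays.asList("New2", "New4", "New0", "New1"))); // prints 3
        // No gap: maximum + 1
        System.out.println(firstFree(Arrays.asList("New0", "New1")));                 // prints 2
        // No matching names at all
        System.out.println(firstFree(Collections.<String>emptyList()));               // prints 0
    }
}
```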
You could:
Extract the numeric part from each name
Store the used indexes in a set
Iterate over the range from 0 until the size of the list
The first index not in the used set is available
For example like this:
List<String> names = Arrays.asList("New2", "New4", "New0", "New1");
Set<Integer> taken = names.stream()
.map(s -> s.replaceAll("\\D+", ""))
.map(Integer::parseInt)
.collect(Collectors.toSet());
int first = IntStream.range(0, names.size())
.filter(index -> !taken.contains(index))
.findFirst()
.orElse(names.size());
For the fun of it, if you know you have up to 63 entries...
private static int firstMissing(List<Long> input) {
if (!input.contains(0L)) {
return 0;
}
long firstMissing = Long.lowestOneBit(~input.stream().reduce(1L, (i, j) -> i | 1L << j));
int result = 0;
while (firstMissing != 0) {
++result;
firstMissing = firstMissing >> 1;
}
return result - 1;
}
That's what @Holger did (+1 from me), but without the extra penalty of using BitSet.
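The same single-long idea can probably be written more compactly with Long.numberOfTrailingZeros, which removes both the special case for 0 and the shift loop. This is only a sketch under the same assumption as above: every value lies in the range 0..63, since shifts on a long wrap around at 64. The class name FirstMissingBits is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class FirstMissingBits {
    // Sets bit v for every value v, then finds the lowest bit that is still clear.
    // Assumption: all values are within 0..63.
    static int firstMissing(List<Long> input) {
        long used = input.stream()
                .mapToLong(v -> 1L << v)       // one bit per value
                .reduce(0L, (a, b) -> a | b);  // OR is associative, 0 is its identity
        return Long.numberOfTrailingZeros(~used);
    }

    public static void main(String[] args) {
        System.out.println(firstMissing(Arrays.asList(2L, 4L, 0L, 1L))); // prints 3
        System.out.println(firstMissing(Arrays.asList(1L, 2L)));         // prints 0
    }
}
```

Mapping each value to its bit first keeps the reduction a plain OR of masks, so the accumulator and the elements have the same meaning.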
I have code like this, which is supposed to create a Map from an array of integers. The key represents the number of digits.
public static Map<Integer, List<String>> groupByDigitNumbersArray(int[] x) {
return Arrays.stream(x) // array to stream
.filter(n -> n >= 0) // filter negative numbers
.collect(Collectors.groupingBy(n -> Integer.toString((Integer) n).length(), // group by number of digits
Collectors.mapping(d -> (d % 2 == 0 ? "e" : "o") + d,
Collectors.toList()))); // if even e odd o add to list
}
The problem is in the line with mapping().
I'm getting an error:
Operator '%' cannot be applied to 'java.lang.Object', 'int'
Does someone know how to solve this?
The flavor of collect() that expects a Collector as an argument isn't available with primitive streams. Even without a modulus operator %, your code will not compile - comment out the downstream collector of groupingBy() to see what I'm talking about.
You need to apply the boxed() operation to convert the IntStream into a stream of objects, Stream<Integer>.
Your method might look like this:
public static Map<Integer, List<String>> groupByDigitNumbersArray(int[] x) {
return Arrays.stream(x) // creates a stream over the given array
.filter(n -> n >= 0) // retain positive numbers and zero
.boxed() // <- converting IntStream into a Stream<Integer>
.collect(Collectors.groupingBy(
n -> String.valueOf(n).length(), // group by number of digits
Collectors.mapping(d -> (d % 2 == 0 ? "e" : "o") + d, // if even prepend 'e', if odd 'o'
Collectors.toList()))); // collect to list
}
I've changed the classifier function of groupingBy() to be more readable.
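For a quick check, here is the boxed() version wrapped in a small runnable class. The class name GroupByDigits and the sample array are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByDigits {
    static Map<Integer, List<String>> groupByDigitNumbersArray(int[] x) {
        return Arrays.stream(x)            // IntStream
                .filter(n -> n >= 0)       // drop negative numbers
                .boxed()                   // IntStream -> Stream<Integer>
                .collect(Collectors.groupingBy(
                        n -> String.valueOf(n).length(),
                        Collectors.mapping(d -> (d % 2 == 0 ? "e" : "o") + d,
                                Collectors.toList())));
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> result =
                groupByDigitNumbersArray(new int[] {1, 22, 333, 44, -5});
        System.out.println(result.get(1)); // prints [o1]
        System.out.println(result.get(2)); // prints [e22, e44]
        System.out.println(result.get(3)); // prints [o333]
    }
}
```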
I'm working with a List<String> that contains a large text. The text looks like:
List<String> lines = Arrays.asList("The first line", "The second line", "Some words can repeat", "The first the second"); //etc
I need to count the words in it and produce this output:
first - 2
line - 2
second - 2
can - 1
repeat - 1
some - 1
words - 1
Words shorter than 4 characters should be skipped; that's why "the" and "can" are not in the output. This is just an example; in the real data, a word is rare and should be skipped if it occurs fewer than 20 times. The map should then be sorted by key in alphabetical order.
This should use only streams, without "if", "while" or "for" constructs.
What I have implemented:
Map<String, Integer> wordCount = Stream.of(list)
.flatMap(Collection::stream)
.flatMap(str -> Arrays.stream(str.split("\\p{Punct}| |[0-9]|…|«|»|“|„")))
.filter(str -> (str.length() >= 4))
.collect(Collectors.toMap(
i -> i.toLowerCase(),
i -> 1,
(a, b) -> java.lang.Integer.sum(a, b))
);
wordCount contains a Map of words and their occurrence counts. But how can I skip rare words? Should I create a new stream? If so, how can I get at the values of the Map? I tried this, but it's not correct:
String result = Stream.of(wordCount)
.filter(i -> (Map.Entry::getValue > 10));
My calculation should return a String:
"word" - number of entries
Thank you!
Given the stream you already have:
List<String> lines = Arrays.asList(
"For the rabbit, it was a bad day.",
"An Antillean rabbit is very abundant.",
"She put the rabbit back in the cage and closed the door securely, then ran away.",
"The rabbit tired of her inquisition and hopped away a few steps.",
"The Dean took the rabbit and went out of the house and away."
);
Map<String, Integer> wordCounts = Stream.of(lines)
.flatMap(Collection::stream)
.flatMap(str -> Arrays.stream(str.split("\\p{Punct}| |[0-9]|…|«|»|“|„")))
.filter(str -> (str.length() >= 4))
.collect(Collectors.toMap(
String::toLowerCase,
i -> 1,
Integer::sum)
);
System.out.println("Original:" + wordCounts);
Original output:
Original:{dean=1, took=1, door=1, very=1, went=1, away=3, antillean=1, abundant=1, tired=1, back=1, then=1, house=1, steps=1, hopped=1, inquisition=1, cage=1, securely=1, rabbit=5, closed=1}
You can do:
String results = wordCounts.entrySet()
.stream()
.filter(wordToCount -> wordToCount.getValue() > 2) // 2 is rare
.sorted(Map.Entry.comparingByKey())
.map(wordCount -> wordCount.getKey() + " - " + wordCount.getValue())
.collect(Collectors.joining(", "));
System.out.println(results);
Filtered output:
away - 3, rabbit - 5
You can't exclude any values that are less than rare until you have computed the frequency count.
Here is how I might go about it.
Do the frequency count (I chose to do it slightly differently than you).
Then stream the entrySet of the map and filter out values less than a certain frequency.
Then reconstruct the map using a TreeMap to sort the words in lexical order.
List<String> list = Arrays.asList(....);
int wordRarity = 10; // minimum frequency to accept
int wordLength = 4; // minimum word length to accept
Map<String, Long> map = list.stream()
.flatMap(str -> Arrays.stream(
str.split("\\p{Punct}|\\s+|[0-9]|…|«|»|“|„")))
.filter(str -> str.length() >= wordLength)
.collect(Collectors.groupingBy(String::toLowerCase,
Collectors.counting()))
// here is where the rare words are filtered out.
.entrySet().stream().filter(e->e.getValue() > wordRarity)
.collect(Collectors.toMap(Entry::getKey, Entry::getValue,
(a,b)->a,TreeMap::new));
Note that the (a,b)->a lambda is a merge function to handle duplicates and is not used. Unfortunately, one cannot specify a Supplier without specifying the merge function.
The easiest way to print them is as follows:
map.entrySet().forEach(e -> System.out.printf("%s - %s%n",
e.getKey(), e.getValue()));
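Since the question asks for a String, the counting, filtering, sorting and formatting steps can be combined roughly as follows. This is a sketch, not a drop-in replacement: the names WordReport and report are made up, the split pattern "[^\\p{L}]+" (any run of non-letters) is a simplification of the pattern used above, and a word is kept when its count is at least minCount.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class WordReport {
    // Counts words of at least minLength characters (case-insensitively),
    // keeps those seen at least minCount times, and lists them alphabetically.
    static String report(List<String> lines, int minLength, int minCount) {
        Map<String, Long> counts = lines.stream()
                .flatMap(line -> Arrays.stream(line.split("[^\\p{L}]+")))
                .filter(word -> word.length() >= minLength)
                .collect(Collectors.groupingBy(String::toLowerCase, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() >= minCount)   // drop rare words
                .sorted(Map.Entry.comparingByKey())      // alphabetical order
                .map(e -> e.getKey() + " - " + e.getValue())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("The first line", "The second line",
                "Some words can repeat", "The first the second");
        // Prints three lines: "first - 2", "line - 2", "second - 2"
        System.out.println(report(lines, 4, 2));
    }
}
```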
I wish to create an int[] of counts for a particular String (consisting only of lowercase English letters) using the Java 8 Stream API, where arr[i] denotes the count of the i-th letter of the English alphabet (e.g. arr[0] = count of 'a' in String str, while arr[2] = count of 'c' in String str). This can be done simply by:
int[] arr = new int[26];
for(char c : str.toCharArray())
arr[c-'a']++;
Or, as a second way, using IntStream:
int[] arr = IntStream.range('a','z'+1).map(i -> (int)str.chars().filter(c -> c == i).count()).toArray();
But the problem with the second approach is that the String is traversed 26 times, once for each of the characters from 'a' to 'z'.
Can you suggest a better way of achieving the same using the Java 8 Stream API?
PS: I know this can be done using Map but I need int[]
int[] r = str.chars()
.boxed()
.reduce(new int[26],
(a, c) -> { ++a[c - 'a']; return a; },
(a1, a2) -> a1);
You know the former is simpler and better. My answer just proves it's feasible with the Stream API, and doesn't suggest that you should go with it. Personally, I would choose the map approach as the most intuitive one.
As pointed out by @Holger, collect is a better option here:
str.chars()
.map(c -> c - 'a')
.collect(() -> new int[26],
(a, i)-> a[i]++,
(a1, a2) -> /* left as an exercise to the reader*/);
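One possible way to fill in the combiner is to add the second array into the first, element by element. The sketch below assumes the input contains only the characters 'a' to 'z' (anything else would index outside the array); the class name LetterCounts is made up for illustration.

```java
import java.util.Arrays;

class LetterCounts {
    // Assumption: str consists only of lowercase letters 'a'..'z'.
    static int[] count(String str) {
        return str.chars()
                .map(c -> c - 'a')
                .collect(() -> new int[26],
                        (a, i) -> a[i]++,
                        // Combiner: merge partial results by summing per letter.
                        (a1, a2) -> {
                            for (int k = 0; k < a1.length; k++) {
                                a1[k] += a2[k];
                            }
                        });
    }

    public static void main(String[] args) {
        int[] counts = count("banana");
        System.out.println(counts[0]);  // count of 'a', prints 3
        System.out.println(counts[1]);  // count of 'b', prints 1
        System.out.println(counts[13]); // count of 'n', prints 2
        System.out.println(Arrays.stream(counts).sum()); // prints 6
    }
}
```

The combiner only runs for parallel streams, but providing a correct one keeps the collect call safe if the stream is later made parallel.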
If you want to use streams and keep the iterative approach, you could also do it like this:
final int count[] = new int[26];
test.chars().forEach(c -> count[c-'a']++);
I have a List<B> of objects structured as follows:
class B {
int x;
String y;
}
Now I want to find the last occurrence of a B object b such that b.x == 1. I can do that by simply running a for loop and updating the index on each match, but I don't see how to do it in Java 8.
I saw that there are Java 8 APIs findFirst() and findAny(), but did not find anything similar for finding the last occurrence.
Optional<B> reduce = list.stream()
.filter(b -> b.x == 1)
.reduce((a, b) -> b);
The filter keeps only the elements whose x equals 1, and the reduction discards the earlier element of every pair, so the last match in list order is what remains.
You can achieve this with the reduce operation: start by keeping the elements that match, then for each (ordered) pair keep the second, until only one is left, and return it.
static B getLastBeFromInt(List<B> list, int i){
return list.stream().filter(b -> b.x==i).reduce((first,second) -> second).orElse(null);
}
Just like with loop-based searches, when you are looking for the last occurrence of something, the most efficient solution is to search backwards. Further, if you need to find an index with the Stream API, you have to stream over the indices in the first place:
OptionalInt pos = IntStream.rangeClosed(1-list.size(), 0).map(i -> -i)
.filter(ix -> list.get(ix).x == 1)
.findFirst();
pos.ifPresent(ix -> System.out.println("found "+list.get(ix)+" at "+ix));
Another idea would be running your Stream from the last to the first index and use Stream::findFirst.
Optional<B> lastElement = IntStream.range(0, bs.size())
.mapToObj(i -> bs.get(bs.size() - 1 - i))
.filter(b -> b.x == 1).findFirst();
A Java 9 version of finding the index of the last occurrence satisfying the criterion given in the filter:
IntStream.iterate(list.size()-1, i -> i >= 0, i -> i - 1)
.filter(i -> list.get(i).x == 1)
.findFirst()
.ifPresent(i -> System.out.println("found "+list.get(i)+" at "+i));
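To compare the two main ideas from the answers above on the same data, here is a small self-contained sketch that works on Java 8. The class name LastMatch, the constructor on B, and the helper names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.stream.IntStream;

class LastMatch {
    static class B {
        final int x;
        final String y;
        B(int x, String y) { this.x = x; this.y = y; }
    }

    // Forward traversal: visits every element and keeps the last match.
    static Optional<B> lastByReduce(List<B> list, int x) {
        return list.stream().filter(b -> b.x == x).reduce((first, second) -> second);
    }

    // Backward traversal over indices: stops at the first match from the end.
    static OptionalInt lastIndex(List<B> list, int x) {
        return IntStream.rangeClosed(1 - list.size(), 0)
                .map(i -> -i)
                .filter(ix -> list.get(ix).x == x)
                .findFirst();
    }

    public static void main(String[] args) {
        List<B> list = Arrays.asList(new B(1, "a"), new B(2, "b"), new B(1, "c"), new B(3, "d"));
        System.out.println(lastByReduce(list, 1).map(b -> b.y).orElse("none")); // prints c
        System.out.println(lastIndex(list, 1).orElse(-1));                      // prints 2
    }
}
```

The index-based variant relies on cheap random access, so it suits an ArrayList better than a LinkedList.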
I have a text file that contains a lot of lines. If I want to find the lines before and after a match with grep, I do it like this:
grep -A 10 -B 10 "ABC" myfile.txt
How can I implement the equivalent in Java 8 using streams?
If you're willing to use a third-party library and don't need parallelism, then jOOλ offers SQL-style window functions as follows:
Seq.seq(Files.readAllLines(Paths.get(new File("/path/to/Example.java").toURI())))
.window(-1, 1)
.filter(w -> w.value().contains("ABC"))
.forEach(w -> {
System.out.println("-1:" + w.lag().orElse(""));
System.out.println(" 0:" + w.value());
System.out.println("+1:" + w.lead().orElse(""));
// ABC: Just checking
});
Yielding
-1: .window(-1, 1)
0: .filter(w -> w.value().contains("ABC"))
+1: .forEach(w -> {
-1: System.out.println("+1:" + w.lead().orElse(""));
0: // ABC: Just checking
+1: });
The lead() function accesses the next value in traversal order from the window, the lag() function accesses the previous row.
Disclaimer: I work for the company behind jOOλ
Such a scenario is not well supported by the Stream API, as the existing methods do not provide access to an element's neighbors in the stream. The closest solution I can think of without creating custom iterators/spliterators or calling third-party libraries is to read the input file into a List and then use a Stream of indices:
List<String> input = Files.readAllLines(Paths.get(fileName));
Predicate<String> pred = str -> str.contains("ABC");
int contextLength = 10;
IntStream.range(0, input.size()) // line numbers
// filter them leaving only numbers of lines satisfying the predicate
.filter(idx -> pred.test(input.get(idx)))
// add nearby numbers
.flatMap(idx -> IntStream.rangeClosed(idx-contextLength, idx+contextLength))
// remove numbers which are out of the input range
.filter(idx -> idx >= 0 && idx < input.size())
// sort numbers and remove duplicates
.distinct().sorted()
// map to the lines themselves
.mapToObj(input::get)
// output
.forEachOrdered(System.out::println);
The grep output also includes a special delimiter like "--" to mark the omitted lines. If you want to go further and mimic this behavior as well, I suggest trying my free StreamEx library, as it has an intervalMap method which is helpful in this case:
// Same as IntStream.range(...).filter(...) steps above
IntStreamEx.ofIndices(input, pred)
// same as above
.flatMap(idx -> IntStream.rangeClosed(idx-contextLength, idx+contextLength))
// remove numbers which are out of the input range
.atLeast(0).less(input.size())
// sort numbers and remove duplicates
.distinct().sorted()
.boxed()
// merge adjacent numbers into single interval and map them to subList
.intervalMap((i, j) -> (j - i) == 1, (i, j) -> input.subList(i, j + 1))
// flatten all subLists prepending them with "--"
.flatMap(list -> StreamEx.of(list).prepend("--"))
// skipping first "--"
.skip(1)
.forEachOrdered(System.out::println);
As Tagir Valeev noted, this kind of problem isn't well supported by the streams API. If you incrementally want to read lines from the input and print out matching lines with context, you'd have to introduce a stateful pipeline stage (or a custom collector or spliterator) which adds quite a bit of complexity.
If you're willing to read all the lines into memory, it turns out that BitSet is a useful representation for manipulating groups of matches. This bears some similarity to Tagir's solution, but instead of using integer ranges to represent lines to be printed, it uses 1-bits in a BitSet. Some advantages of BitSet are that it has a number of built-in bulk operations, and it has a compact internal representation. It can also produce a stream of indexes of the 1-bits, which is quite useful for this problem.
First, let's start out by creating a BitSet that has a 1-bit for each line that matches the predicate:
void contextMatch(Predicate<String> pred, int before, int after, List<String> input) {
int len = input.size();
BitSet matches = IntStream.range(0, len)
.filter(i -> pred.test(input.get(i)))
.collect(BitSet::new, BitSet::set, BitSet::or);
Now that we have the bit set of matching lines, we stream out the indexes of each 1-bit. We then set the bits in the bitset that represent the before and after context. This gives us a single BitSet whose 1-bits represent all of the lines to be printed, including context lines.
BitSet context = matches.stream()
.collect(BitSet::new,
(bs,i) -> bs.set(Math.max(0, i - before), Math.min(i + after + 1, len)),
BitSet::or);
If we just want to print out all the lines, including context, we can do this:
context.stream()
.forEachOrdered(i -> System.out.println(input.get(i)));
The actual grep -A a -B b command prints a separator between each group of context lines. To figure out when to print a separator, we look at each 1-bit in the context bit set. If there's a 0-bit preceding it, or if it's at the very beginning, we set a bit in the result. This gives us a 1-bit at the beginning of each group of context lines:
BitSet separators = context.stream()
.filter(i -> i == 0 || !context.get(i-1))
.collect(BitSet::new, BitSet::set, BitSet::or);
We don't want to print the separator before each group of context lines; we want to print it between each group. That means we have to clear the first 1-bit (if any):
// clear the first bit
int first = separators.nextSetBit(0);
if (first >= 0) {
separators.clear(first);
}
Now, we can print out the result lines. But before printing each line, we check to see if we should print a separator first:
context.stream()
.forEachOrdered(i -> {
if (separators.get(i)) {
System.out.println("--");
}
System.out.println(input.get(i));
});
}
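To try the BitSet approach on a small input without reading a file, here is a compact variant that returns the output lines instead of printing them. It is a sketch rather than the method above: the names GrepContext and contextLines are made up, and instead of building a separate separators bit set it checks the preceding bit while emitting lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.IntStream;

class GrepContext {
    static List<String> contextLines(Predicate<String> pred, int before, int after,
                                     List<String> input) {
        int len = input.size();
        // One 1-bit for every line to output: each match plus its context window.
        BitSet context = IntStream.range(0, len)
                .filter(i -> pred.test(input.get(i)))
                .collect(BitSet::new,
                        (bs, i) -> bs.set(Math.max(0, i - before), Math.min(i + after + 1, len)),
                        BitSet::or);
        int first = context.nextSetBit(0); // -1 if nothing matched
        List<String> out = new ArrayList<>();
        context.stream().forEachOrdered(i -> {
            // A group starts where the preceding bit is clear; separate all but the first group.
            if (i != first && !context.get(i - 1)) {
                out.add("--");
            }
            out.add(input.get(i));
        });
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "ABC", "c", "d", "e", "f", "ABC", "h");
        // Prints [a, ABC, c, --, f, ABC, h]
        System.out.println(contextLines(s -> s.contains("ABC"), 1, 1, input));
    }
}
```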