JAVA 8 streams and random access binary files [duplicate]

I have an interface which returns java.lang.Iterable<T>.
I would like to manipulate that result using the Java 8 Stream API.
However Iterable can't "stream".
Any idea how to use the Iterable as a Stream without converting it to List?

There's a much better answer than using spliteratorUnknownSize directly, which is both easier and gets a better result. Iterable has a spliterator() method, so you should just use that to get your spliterator. In the worst case, it's the same code (the default implementation uses spliteratorUnknownSize), but in the more common case, where your Iterable is already a collection, you'll get a better spliterator, and therefore better stream performance (maybe even good parallelism). It's also less code:
StreamSupport.stream(iterable.spliterator(), false)
.filter(...)
.moreStreamOps(...);
As you can see, getting a stream from an Iterable (see also this question) is not very painful.
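As a minimal sketch of the approach above: the class name IterableStreamDemo and the helpers range() and evens() are made up for illustration. It streams both a hand-written Iterable (which falls back to the default spliterator) and an ordinary List (which supplies its own, sized spliterator) through the same call.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class IterableStreamDemo {

    // Hypothetical Iterable that is not a Collection: yields start, start+1, ..., end-1.
    public static Iterable<Integer> range(int start, int end) {
        return () -> new Iterator<Integer>() {
            private int current = start;

            @Override
            public boolean hasNext() {
                return current < end;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
        };
    }

    // Works for any Iterable; a Collection simply contributes a better spliterator.
    public static List<Integer> evens(Iterable<Integer> source) {
        return StreamSupport.stream(source.spliterator(), false)
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evens(range(0, 10)));
        System.out.println(evens(Arrays.asList(1, 2, 3, 4)));
    }
}
```

The second argument to StreamSupport.stream is the parallel flag; false requests a sequential stream.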

If you can use the Guava library (version 21 or later), you can use
Streams.stream(iterable)

You can easily create a Stream out of an Iterable or Iterator:
public static <T> Stream<T> stream(Iterable<T> iterable) {
return StreamSupport.stream(
Spliterators.spliteratorUnknownSize(
iterable.iterator(),
Spliterator.ORDERED
),
false
);
}

I would suggest the jOOL library: it hides the spliterator magic behind the Seq.seq(iterable) call and also provides a whole bunch of additional useful functionality.

As another answer mentioned, Guava supports this with:
Streams.stream(iterable);
I want to highlight that the implementation does something slightly different from what the other answers suggest: if the Iterable is a Collection, it casts it and calls Collection.stream().
public static <T> Stream<T> stream(Iterable<T> iterable) {
return (iterable instanceof Collection)
? ((Collection<T>) iterable).stream()
: StreamSupport.stream(iterable.spliterator(), false);
}
public static <T> Stream<T> stream(Iterator<T> iterator) {
return StreamSupport.stream(
Spliterators.spliteratorUnknownSize(iterator, 0),
false
);
}

I've created this class:
public class Streams {
/**
* Converts Iterable to stream
*/
public static <T> Stream<T> streamOf(final Iterable<T> iterable) {
return toStream(iterable, false);
}
/**
* Converts Iterable to parallel stream
*/
public static <T> Stream<T> parallelStreamOf(final Iterable<T> iterable) {
return toStream(iterable, true);
}
private static <T> Stream<T> toStream(final Iterable<T> iterable, final boolean isParallel) {
return StreamSupport.stream(iterable.spliterator(), isParallel);
}
}
I think it's perfectly readable because you don't have to think about spliterators and booleans (isParallel).

A very simple work-around for this issue is to create a Streamable<T> interface that extends Iterable<T> and declares a default stream() method.
interface Streamable<T> extends Iterable<T> {
default Stream<T> stream() {
return StreamSupport.stream(spliterator(), false);
}
}
Now any of your Iterable<T>s can be trivially made streamable just by declaring them implements Streamable<T> instead of Iterable<T>.
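To illustrate, here is a small sketch of a class adopting such an interface. Playlist and countLongTitles are hypothetical names invented for this example; only Streamable comes from the answer above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class StreamableDemo {

    // The interface from the answer: Iterable plus a default stream() method.
    public interface Streamable<T> extends Iterable<T> {
        default Stream<T> stream() {
            return StreamSupport.stream(spliterator(), false);
        }
    }

    // Hypothetical domain class: it only has to implement iterator().
    public static final class Playlist implements Streamable<String> {
        private final List<String> titles;

        public Playlist(String... titles) {
            this.titles = Arrays.asList(titles);
        }

        @Override
        public Iterator<String> iterator() {
            return titles.iterator();
        }
    }

    public static long countLongTitles(Playlist playlist, int minLength) {
        // stream() is inherited from Streamable; no spliterator code at the call site.
        return playlist.stream()
                .filter(title -> title.length() >= minLength)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(countLongTitles(new Playlist("a", "abcd", "xyz12"), 4));
    }
}
```

Note that this only helps for types you control; an Iterable returned by a third-party interface still needs StreamSupport at the call site.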

If you happen to use Vavr (formerly known as Javaslang), this can be as easy as:
Iterable i = //...
Stream.ofAll(i);

Related

Is there a standard way to turn a Kotlin sequence into a java.util.Enumeration?

They seem conceptually very similar. I wrote this function to solve the problem. But does anything exist in the standard library or elsewhere?
fun <T> Sequence<T>.toEnumeration(): Enumeration<T> {
val iterator = this.iterator()
return object : Enumeration<T> {
override fun hasMoreElements() = iterator.hasNext()
override fun nextElement(): T = iterator.next()
}
}

No, it's not available in the Standard Library.
However, Sequence has the iterator() and asIterable() methods.
An Iterator is functionally equivalent to an Enumeration and has been the preferred way of iterating a collection since Java 1.2.

Convert Stream to IntStream

I have a feeling I'm missing something here. I found myself doing the following
private static int getHighestValue(Map<Character, Integer> countMap) {
return countMap.values().stream().mapToInt(Integer::intValue).max().getAsInt();
}
My problem is with the silly conversion from Stream to IntStream via mapToInt(Integer::intValue).
Is there a better way of doing the conversion? All this is to avoid using max() from Stream, which requires passing a Comparator, but the question is specifically about the conversion of Stream to IntStream.

Due to type erasure, the Stream implementation has no knowledge of the type of its elements and can provide neither a simplified max operation nor a conversion-to-IntStream method.
In both cases it requires a function, a Comparator or a ToIntFunction, respectively, to perform the operation using the unknown reference type of the Stream’s elements.
The simplest form for the operation you want to perform is
return countMap.values().stream().max(Comparator.naturalOrder()).get();
given the fact that the natural order comparator is implemented as a singleton. So it’s the only comparator which offers the chance of being recognized by the Stream implementation if there is any optimization regarding Comparable elements. If there’s no such optimization, it will still be the variant with the lowest memory footprint due to its singleton nature.
If you insist on converting the Stream to an IntStream, there is no way around providing a ToIntFunction, and there is no predefined singleton for a Number::intValue kind of function, so using Integer::intValue is already the best choice. You could write i->i instead, which is shorter but merely hides the unboxing operation.
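For comparison, the two variants discussed above can be put side by side. This is only a sketch; HighestValueDemo and its method names are invented for the example. Both versions throw NoSuchElementException for an empty map, because get() and getAsInt() are called on an empty optional.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class HighestValueDemo {

    // Stays on Stream<Integer> and supplies the natural-order comparator.
    public static int maxViaComparator(Map<Character, Integer> countMap) {
        return countMap.values().stream()
                .max(Comparator.naturalOrder())
                .get();
    }

    // Converts to IntStream first; the ToIntFunction performs the unboxing.
    public static int maxViaIntStream(Map<Character, Integer> countMap) {
        return countMap.values().stream()
                .mapToInt(Integer::intValue)
                .max()
                .getAsInt();
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        counts.put('a', 3);
        counts.put('b', 5);
        counts.put('c', 1);
        System.out.println(maxViaComparator(counts));
        System.out.println(maxViaIntStream(counts));
    }
}
```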

I realize you are trying to avoid a comparator, but you could use the built-in for this by referring to Integer.compareTo:
private static int getHighestValue(Map<Character, Integer> countMap) {
return countMap.values().stream().max(Integer::compareTo).get();
}
Or as @fge suggests, using ::compare:
private static int getHighestValue(Map<Character, Integer> countMap) {
return countMap.values().stream().max(Integer::compare).get();
}

Another way you could do the conversion is with a lambda: mapToInt(i -> i).
Whether you should use a lambda or a method reference is discussed in detail here, but the summary is that you should use whichever you find more readable.

If the question is "Can I avoid passing a converter while converting from Stream<T> to IntStream?", one possible answer is "There is no way in Java to make such a conversion type-safe and make it part of the Stream interface at the same time".
Indeed, a method that converts Stream<T> to IntStream without a converter might look like this:
public interface Stream<T> {
// other methods
default IntStream mapToInt() {
Stream<Integer> intStream = (Stream<Integer>)this;
return intStream.mapToInt(Integer::intValue);
}
}
So it is supposed to be called on a Stream<Integer> and will fail on other types of streams. But because streams are lazily evaluated, and because of type erasure (remember that Stream<T> is generic), the code will fail at the place where the stream is consumed, which might be far from the mapToInt() call. And it will fail in a way that makes the source of the problem extremely difficult to locate.
Suppose you have code:
public class IntStreamTest {
public static void main(String[] args) {
IntStream intStream = produceIntStream();
consumeIntStream(intStream);
}
private static IntStream produceIntStream() {
Stream<String> stream = Arrays.asList("1", "2", "3").stream();
return mapToInt(stream);
}
public static <T> IntStream mapToInt(Stream<T> stream) {
Stream<Integer> intStream = (Stream<Integer>)stream;
return intStream.mapToInt(Integer::intValue);
}
private static void consumeIntStream(IntStream intStream) {
intStream.filter(i -> i >= 2)
.forEach(System.out::println);
}
}
It will fail on consumeIntStream() call with:
Exception in thread "main" java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
at java.util.stream.ReferencePipeline$4$1.accept(ReferencePipeline.java:210)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfInt.evaluateSequential(ForEachOps.java:189)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.IntPipeline.forEach(IntPipeline.java:404)
at streams.IntStreamTest.consumeIntStream(IntStreamTest.java:25)
at streams.IntStreamTest.main(IntStreamTest.java:10)
Given this stack trace, are you able to quickly identify that the problem is in produceIntStream(), because mapToInt() was called on a stream of the wrong type?
Of course, one can write a converting method that is type-safe because it accepts a concrete Stream<Integer>:
public static IntStream mapToInt(Stream<Integer> stream) {
return stream.mapToInt(Integer::intValue);
}
// usage
IntStream intStream = mapToInt(Arrays.asList(1, 2, 3).stream());
but it's not very convenient because it breaks the fluent-interface nature of streams.
BTW:
Kotlin's extension functions let you call code as if it were part of the class's interface. So you can call this type-safe method as though it were a method of Stream<java.lang.Integer>:
// "adds" mapToInt() to Stream<java.lang.Integer>
fun Stream<java.lang.Integer>.mapToInt(): IntStream {
return this.mapToInt { it.toInt() }
}
@Test
fun test() {
Arrays.asList<java.lang.Integer>(java.lang.Integer(1), java.lang.Integer(2))
.stream()
.mapToInt()
.forEach { println(it) }
}

Iterate an Enumeration in Java 8

Is it possible to iterate an Enumeration by using Lambda Expression? What will be the Lambda representation of the following code snippet:
Enumeration<NetworkInterface> nets = NetworkInterface.getNetworkInterfaces();
while (nets.hasMoreElements()) {
NetworkInterface networkInterface = nets.nextElement();
}
I didn't find any stream within it.

(This answer shows one of many options. Just because it has the acceptance mark doesn't mean it is the best one. I suggest reading the other answers and picking one depending on your situation. IMO:
for Java 8, Holger's answer is nicest, because aside from being simple it doesn't require the additional iteration that happens in my solution.
for Java 9, I would pick the solution described in Tagir Valeev's answer)
You can copy the elements from your Enumeration to an ArrayList with Collections.list and then use it like
Collections.list(yourEnumeration).forEach(yourAction);

If there are a lot of Enumerations in your code, I recommend creating a static helper method, that converts an Enumeration into a Stream. The static method might look as follows:
public static <T> Stream<T> enumerationAsStream(Enumeration<T> e) {
return StreamSupport.stream(
Spliterators.spliteratorUnknownSize(
new Iterator<T>() {
public T next() {
return e.nextElement();
}
public boolean hasNext() {
return e.hasMoreElements();
}
},
Spliterator.ORDERED), false);
}
Use the method with a static import. In contrast to Holger's solution, you can benefit from the different stream operations, which might make the existing code even simpler. Here is an example:
Map<...> map = enumerationAsStream(enumeration)
.filter(Objects::nonNull)
.collect(groupingBy(...));
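A self-contained sketch of how such a helper might be used follows. The grouping by string length and the names EnumerationStreamDemo and groupByLength are purely illustrative; they are not what the elided example above contains.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class EnumerationStreamDemo {

    // Same idea as the helper above: adapt the Enumeration to an Iterator, then to a Stream.
    public static <T> Stream<T> enumerationAsStream(Enumeration<T> e) {
        Iterator<T> it = new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return e.hasMoreElements();
            }

            @Override
            public T next() {
                return e.nextElement();
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    // Illustrative pipeline: drop nulls, then group the remaining words by length.
    public static Map<Integer, List<String>> groupByLength(Enumeration<String> words) {
        return enumerationAsStream(words)
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        Enumeration<String> words =
                Collections.enumeration(Arrays.asList("a", "bb", "cc", null, "d"));
        System.out.println(groupByLength(words));
    }
}
```

The resulting stream is lazy: elements are pulled from the Enumeration only as the terminal operation consumes them.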

Since Java 9 there is a new default method, Enumeration.asIterator(), which makes a pure Java solution simpler:
nets.asIterator().forEachRemaining(iface -> { ... });

In case you don’t like the fact that Collections.list(Enumeration) copies the entire contents into a (temporary) list before the iteration starts, you can help yourself out with a simple utility method:
public static <T> void forEachRemaining(Enumeration<T> e, Consumer<? super T> c) {
while(e.hasMoreElements()) c.accept(e.nextElement());
}
Then you can simply do forEachRemaining(enumeration, lambda-expression); (mind the import static feature)…

You can use the following combination of standard functions:
StreamSupport.stream(Spliterators.spliteratorUnknownSize(CollectionUtils.toIterator(enumeration), Spliterator.IMMUTABLE), parallel)
You may also add more characteristics like NONNULL or DISTINCT.
After applying static imports this will become more readable:
stream(spliteratorUnknownSize(toIterator(enumeration), IMMUTABLE), false)
Now you have a standard Java 8 Stream to be used in any way! You may pass true for parallel processing.
To convert from Enumeration to Iterator use any of:
CollectionUtils.toIterator() from Spring 3.2
IteratorUtils.asIterator() from Apache Commons Collections 3.2
Iterators.forEnumeration() from Google Guava

For Java 8 the simplest transformation of enumeration to stream is:
Collections.list(NetworkInterface.getNetworkInterfaces()).stream()

I know this is an old question, but I wanted to present an alternative to the Collections.list and Stream approaches. Since the question is titled "Iterate an Enumeration", I recognize that sometimes you want to use a lambda expression, but an enhanced for loop may be preferable: the enumerated object may throw an exception, and the for loop is easier to wrap in a larger try-catch block (with the standard functional interfaces, checked exceptions have to be caught inside the lambda). To that end, here is a lambda that creates an Iterable which is usable in a for loop and does not preload the enumeration:
/**
* Creates lazy Iterable for Enumeration
*
* @param <T> Class being iterated
* @param e Enumeration as base for Iterator
* @return Iterable wrapping Enumeration
*/
public static <T> Iterable<T> enumerationIterable(Enumeration<T> e)
{
return () -> new Iterator<T>()
{
@Override
public T next()
{
return e.nextElement();
}
@Override
public boolean hasNext()
{
return e.hasMoreElements();
}
};
}
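The following sketch shows the kind of loop this enables. parseStrict and sumUntilFailure are hypothetical helpers, and IOException merely stands in for whatever checked exception the loop body might throw. One caveat worth stating as an assumption of this design: the Iterable wraps a single Enumeration, so it can only be traversed once.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Iterator;

public class EnumerationLoopDemo {

    // Lazy, single-use Iterable over an Enumeration, as in the answer above.
    public static <T> Iterable<T> enumerationIterable(Enumeration<T> e) {
        return () -> new Iterator<T>() {
            @Override
            public T next() {
                return e.nextElement();
            }

            @Override
            public boolean hasNext() {
                return e.hasMoreElements();
            }
        };
    }

    // Hypothetical operation that throws a checked exception.
    public static int parseStrict(String token) throws IOException {
        try {
            return Integer.parseInt(token);
        } catch (NumberFormatException ex) {
            throw new IOException("not a number: " + token, ex);
        }
    }

    // One try-catch around the whole loop; no exception handling inside a lambda.
    public static int sumUntilFailure(Enumeration<String> tokens) {
        int sum = 0;
        try {
            for (String token : enumerationIterable(tokens)) {
                sum += parseStrict(token);
            }
        } catch (IOException ex) {
            // Stop at the first bad token and keep the partial sum.
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumUntilFailure(
                Collections.enumeration(Arrays.asList("1", "2", "x", "4"))));
    }
}
```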

Which is good practice - Modifying a List in the method, or returning a new List in the method?

Example code:
modifyMyList(myList);
public void modifyMyList(List someList){
someList.add(someObject);
}
or:
List myList = modifyMyList(myList);
public List modifyMyList(List someList){
someList.add(someObject);
return someList;
}
There is also a 3rd option I believe: You can create a new List in modifyMyList method and return this new List...
( 3rd option is here, I was too lazy but someone already added it in the answers: )
List myList = modifyMyList(myList);
public List modifyMyList(List someList){
List returnList = new ArrayList();
returnList.addAll(someList);
returnList.add(someObject);
return Collections.unmodifiableList(returnList);
}
Is there any reason why I should choose one over another? What should be considered in such case?

I have a (self imposed) rule which is "Never mutate a method parameter in a public method". So, in a private method, it's ok to mutate a parameter (I even try to avoid this case too). But when calling a public method, the parameters should never be mutated and should be considered immutable.
I think that mutating method arguments is a bit hacky and can lead to bugs that are harder to see.
I have been known to make exceptions to this rule but I need a really good reason.

Actually there is no functional difference.
You'll come to know the difference when you want the returned list
List someNewList = someInstance.modifyMyList(list);

The second is probably confusing as it implies a new value is being created and returned - and it isn't.
An exception would be if the method was part of a 'fluent' API, where the method was an instance method and was modifying its instance, and then returning the instance to allow method chaining: the Java StringBuilder class is an example of this.
In general, however, I wouldn't use either.
I'd go for your third option: I write a method that creates and returns a new list with the appropriate change. This is a bit artificial in the case of your example, as the example is really just reproducing List.add(), but...
/** Creates a copy of the list, with val appended. */
public static <T> List<T> modifyMyList(List<T> list, T val) {
List<T> xs = new ArrayList<T>(list);
xs.add(val);
return xs;
}
Aside: I wouldn't, as suggested by Saket, return an immutable list. His argument for immutability and parallelism is valid. But most of the time Java programmers expect to be able to modify a collection, except in special circumstances. By making your method return an immutable collection, you limit its reusability to such circumstances. (The caller can always make the list immutable if they want to: they know the returned value is a copy and won't be touched by anything else.) Put another way: Java is not Clojure. Also, if parallelism is a concern, look at Java 8 and streams (the new kind, not I/O streams).
Here's a different example:
/** Returns a copy of a list sans-nulls. */
public static <T> List<T> compact(Iterable<T> it) {
List<T> xs = new ArrayList<T>();
for(T x : it)
if(x!=null) xs.add(x);
return xs;
}
Note that I've genericized the method and made it more widely applicable by taking an Iterable instead of a list. In real code, I'd have two overloaded versions, one taking an Iterable and one an Iterator. (The first would be implemented by calling the second, with the iterable's iterator.) Also, I've made it static as there was no reason for your method to be an instance method (it does not depend on state from the instance).
Sometimes, though, if I'm writing library code, and if it is not clear whether a mutating or non-mutating implementation is more generally useful, I create both. Here's a fuller example:
/** Returns a copy of the elements from an Iterable, as a List, sans-nulls. */
public static <T> List<T> compact(Iterable<T> it) {
return compact(it.iterator());
}
public static <T> List<T> compact(Iterator<T> iter) {
List<T> xs = new ArrayList<T>();
while(iter.hasNext()) {
T x = iter.next();
if(x!=null) xs.add(x);
}
return xs;
}
/** In-place, mutating version of compact(). */
public static <T> void compactIn(Iterable<T> it) {
// Note: for a 'fluent' version of this API, have this return 'it'.
compactIn(it.iterator());
}
public static <T> void compactIn(Iterator<T> iter) {
while(iter.hasNext()) {
T x = iter.next();
if(x==null) iter.remove();
}
}
If this was in a real API I'd check the arguments for null and throw IllegalArgumentException. (NOT NullPointerException - though it is often used for this purpose. NullPointerException happens for other reasons as well, e.g. buggy code. IllegalArgumentException is better for invalid parameters.)
(There'd also be more Javadoc than actual code too!)
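A minimal sketch of the argument check described above, applied to the non-mutating compact(). The class name CompactDemo and the exception message are illustrative choices, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CompactDemo {

    /** Returns a copy of the elements from an Iterable, as a List, sans-nulls. */
    public static <T> List<T> compact(Iterable<T> it) {
        // Reject an invalid argument up front instead of failing later in the loop.
        if (it == null) {
            throw new IllegalArgumentException("it must not be null");
        }
        List<T> xs = new ArrayList<>();
        for (T x : it) {
            if (x != null) {
                xs.add(x);
            }
        }
        return xs;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", null, "b");
        System.out.println(compact(input)); // the input list itself is left untouched
    }
}
```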

The first and second solutions are very similar. The advantage of the second is that it permits chaining. Whether that is good practice is open to debate, as we can see here:
Method Chaining in Java
So the real question is between the first solution, with a mutable list, and the third, with an immutable list. Again, there is no single answer; it is the same debate as between returning a String, which is immutable, and using a StringBuffer, which is mutable but permits better performance.
If you need reliability in your API and you don't have performance issues, use an immutable list (the third solution). Use it if your lists are always small.
If you only need performance, use a mutable list (the first solution).

I will recommend creating a new list in the method and returning an immutable list. That way your code will work even when you are passed in an Immutable list. It is generally a good practice to create immutable objects as we generally move towards functional programming and try to scale across multiple processor architectures.
List myList = modifyMyList(myList);
public List modifyMyList(List someList){
List returnList = new ArrayList();
returnList.addAll(someList);
returnList.add(someObject);
return Collections.unmodifiableList(returnList);
}

As I said in my other answer, I don't think you should mutate the list parameter. But there are times where you also don't want to take a copy of the original list and mutate the copy.
The original list might be large so the copy is expensive
You want the copy to be kept up-to-date with any updates to the original list.
In these scenarios, you could create a MergedList which is a view over two (or perhaps more) lists
import java.util.*;
public class MergedList<T> extends AbstractList<T> {
private final List<T> list1;
private final List<T> list2;
public MergedList(List<T> list1, List<T> list2) {
this.list1 = list1;
this.list2 = list2;
}
@Override
public Iterator<T> iterator() {
return new Iterator<T>() {
Iterator<T> it1 = list1.iterator();
Iterator<T> it2 = list2.iterator();
@Override
public boolean hasNext() {
return it1.hasNext() || it2.hasNext();
}
@Override
public T next() {
return it1.hasNext() ? it1.next() : it2.next();
}
};
}
@Override
public T get(int index) {
int size1 = list1.size();
return index < size1 ? list1.get(index) : list2.get(index - size1);
}
@Override
public int size() {
return list1.size() + list2.size();
}
}
Then you could do
public List<String> modifyMyList(List<String> someList){
return new MergedList<>(someList, List.of("foo", "bar", "baz"));
}
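To make the "view" behaviour concrete, here is a trimmed-down sketch. It keeps only get() and size() and relies on AbstractList for iteration, which is a simplification of the class above; MergedListDemo and withExtras are names invented for the example. It shows that later changes to the original list are visible through the merged list, while the merged list itself rejects modification.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergedListDemo {

    // Simplified variant of MergedList: a read-only view over two lists.
    public static final class MergedList<T> extends AbstractList<T> {
        private final List<T> first;
        private final List<T> second;

        public MergedList(List<T> first, List<T> second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public T get(int index) {
            int size1 = first.size();
            return index < size1 ? first.get(index) : second.get(index - size1);
        }

        @Override
        public int size() {
            return first.size() + second.size();
        }
    }

    public static List<String> withExtras(List<String> base) {
        return new MergedList<>(base, Arrays.asList("foo", "bar"));
    }

    public static void main(String[] args) {
        List<String> base = new ArrayList<>(Arrays.asList("a"));
        List<String> view = withExtras(base);
        System.out.println(view);

        base.add("b");            // the change shows up in the view, no copy needed
        System.out.println(view);

        try {
            view.add("c");        // AbstractList.add is unsupported unless overridden
        } catch (UnsupportedOperationException ex) {
            System.out.println("view is read-only");
        }
    }
}
```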

Both ways will work, because in this case Java works with the reference to the List, but I prefer the second way because this solution works for pass-by-value too, not only for pass-by-reference.

Functionally, both are the same.
However, when you expose your method as an API, the second method may give the impression that it returns a new, modified list rather than the original list that was passed in.
The first method makes it clear (depending, of course, on the method naming convention) that it will modify the original list (the same object).
Also, the second method returns a list, so ideally the caller should check for a null return value even if the passed list is non-null (the method could potentially return null instead of the modified list).
Considering this, I generally prefer the first method over the second.

Automatically merge several collections to one

I have some Guava Functions like Function<String,Set<String>>. Using those with FluentIterable.transform() leads to a FluentIterable<Set<String>>, however I need a FluentIterable<String>. So my idea now would be to subclass FluentIterable<E> and add a new method transform2() which simply merges everything to one collection before returning it.
The original transform method looks like this:
public final <T> FluentIterable<T> transform(Function<? super E, T> function) {
return from(Iterables.transform(iterable, function));
}
I thought of something like this for my subclass and transform2() method:
public abstract class FluentIterable2<E> extends FluentIterable<E>
{
public final <T> FluentIterable<T> transform2(Function<? super E, Collection<T>> function) {
// (PROBLEM 1) Eclipse complains: The field FluentIterable<E>.iterable is not visible
Iterable<Collection<T>> iterables = Iterables.transform(iterable, function);
// (PROBLEM 2) Collection<T> merged = new Collection<T>(); // I need a container / collection - which one?
for(Collection<T> iterable : iterables)
{
// merged.addAll(iterable);
}
// return from(merged);
}
}
Currently I have two problems with my new subclass, marked above with PROBLEM 1 and PROBLEM 2
PROBLEM 1: The iterable field in the original FluentIterable class is private - what can I do about this? Can I create a new private field with the same name in my subclass, will this then be OK? What about methods in my subclass that call super.someMethod() which uses this field? Will they then use the field of the super class, which probably has a different value?
PROBLEM 2: I need some generic collection where I can combine the content of several collections, but Collection is an interface, so I can't instantiate it. So, which class can I use there?
It would be acceptable if the solution only works with sets, though I'd prefer a solution that works with sets and lists.
Thanks for any hint on this!

Does FluentIterable.transformAndConcat(stringToSetFunction) not work for your use case?

Why subclass FluentIterable just to do this? You just need a simple loop:
Set<String> union = Sets.newHashSet();
for (Set<String> set : fluentIterableOfSets) {
union.addAll(set);
}

Use FluentIterable.transformAndConcat(f), where f is a Function mapping an element to some kind of iterable over the element type.
In your case, let's say your Function<String, Set<String>> is called TOKENIZE, and your initial Iterable<String> is called LINES.
Then to get a Set<String> holding all the distinct tokens in LINES, do this:
Iterable<String> LINES = ...;
Function<String, Set<String>> TOKENIZE = ...;
Set<String> TOKENS = FluentIterable.from(LINES)
.transformAndConcat(TOKENIZE)
.toSet();
But consider JB Nizet's answer carefully. Try it both ways and see which works better.
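For readers using Java 8 streams rather than Guava's FluentIterable, the same flattening can be sketched with flatMap. This is an alternative technique, not the Guava API discussed above: it uses java.util.function.Function instead of Guava's Function, and the whitespace tokenizer is a made-up stand-in for TOKENIZE.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class FlattenDemo {

    // Hypothetical tokenizer: splits a line on whitespace into a set of tokens.
    public static final Function<String, Set<String>> TOKENIZE =
            line -> new LinkedHashSet<>(Arrays.asList(line.trim().split("\\s+")));

    // map() yields a stream of sets; flatMap() merges them into one stream of tokens.
    public static Set<String> tokens(Iterable<String> lines) {
        return StreamSupport.stream(lines.spliterator(), false)
                .map(TOKENIZE)
                .flatMap(Set::stream)
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        System.out.println(tokens(Arrays.asList("a b", "b c")));
    }
}
```

Collecting into a LinkedHashSet removes duplicates while keeping first-seen order; Collectors.toSet() would also work if order does not matter.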
