I'm learning how to use streams, and I ran into a problem with this method.
public static String[] inArray(String[] array1, String[] array2) {
    return Arrays.stream(array1)
            .filter(str -> Arrays.stream(array2).anyMatch(s -> s.contains(str)))
            .distinct().sorted().toArray(String[]::new);
}
I'm so confused about String[]::new, could you give me a hint?
String[]::new means size -> new String[size].
When Stream#toArray(IntFunction<A[]> generator) is ready to produce an array, it calls the generator, passing it (via generator.apply) the number of elements, and gets back an array of that size to fill up.
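For instance, here is a small sketch (the variable names are mine; it assumes java.util.function.IntFunction and java.util.stream.Stream are imported) that makes the passed size visible:
IntFunction<String[]> generator = size -> {
    System.out.println("generator.apply called with size " + size);
    return new String[size];
};
String[] result = Stream.of("a", "b", "c").toArray(generator);
// for this simple stream the generator is called once with the exact element count (3),
// and result is a String[3] holding "a", "b", "c"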
I would say the existing answers provide some insight, but none of them yet talks about IntFunction<R>.
To add to them: what it means in the context of Stream.toArray(String[]::new) is that it represents an IntFunction implementation such as:
new IntFunction<String[]>() {
    @Override
    public String[] apply(int value) {
        return new String[value];
    }
}
where the code creates a newly allocated String[] of size value and produces the array of that size as an output.
You are right to be confused, because Java isn't really super clear about types vs. classes.
We know that String[] is a type, as you can declare variables of that type:
jshell> String[] s = new String[]{"Hello", "world"}
s ==> String[2] { "Hello", "world" }
However, String[] actually is treated as a class in Java and not just a type:
jshell> s.getClass()
$2 ==> class [Ljava.lang.String;
That funny-looking [Ljava.lang.String;, representing the type "array of String", shows up in response to the getClass invocation. I agree that it is surprising. But every object in Java has to have a class, and String[] is that class. (In other languages, you might see something like Array<String>, which might be a touch clearer. But then Java has type erasure, so again, things look a little confusing.)
In your particular case, here's what's going on. You need to be careful with types when making arrays from streams. Naively, you might get:
jshell> Arrays.asList("a", "b").stream().toArray()
$5 ==> Object[2] { "a", "b" }
So we want the version of toArray that gives us a String array:
jshell> Arrays.asList("a", "b").stream().toArray((n) -> new String[n])
$7 ==> String[2] { "a", "b" }
That's better! The result type is an array of strings, instead of just an array of objects. Now the (n) -> new String[n] can be replaced with a method reference for construction. Java allows array types in method references! So we can write:
jshell> Arrays.asList("a", "b").stream().toArray(String[]::new)
$8 ==> String[2] { "a", "b" }
Aside: There are some caveats when using array types in method references like this, such as the requirement that the array type must be reifiable, but I think that's a little beyond what you might have been asking. The TL;DR here is that, by design, Java allows array types in (constructor-like) method references with ::new.
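To make that caveat a bit more concrete, here is a minimal sketch; the second line is commented out because the compiler rejects it, since List<String>[] is not a reifiable type:
IntFunction<String[]> ok = String[]::new;                   // String[] is reifiable, so this is allowed
// IntFunction<List<String>[]> notOk = List<String>[]::new; // compile-time error (roughly: generic array creation)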
This is a method reference expression; see JLS §15.13. The syntax for method references is:
MethodReference:
    ExpressionName :: [TypeArguments] Identifier
    Primary :: [TypeArguments] Identifier
    ReferenceType :: [TypeArguments] Identifier
    super :: [TypeArguments] Identifier
    TypeName . super :: [TypeArguments] Identifier
    ClassType :: [TypeArguments] new
    ArrayType :: new
The particular case you are looking at is the last one. In your example, String[] is an ArrayType which means that it consists of a type name followed by one or more [].
"There shouldn't be a class named String[] which is very lame and I could not interpret what it is actually meant for."
See above: it is a type specification not a class name. From a syntactic / linguistic perspective, this usage is analogous to:
Class<?> c = String[].class;
or
if (a instanceof String[])
or even
public void myMethod(String[] arg)
(You wouldn't call those "lame" ... would you?)
Now you could have a valid case for saying that it is syntactically unexpected (especially to a pre-Java 8 programmer) to be able to use the new keyword like this. But this unexpected syntax is a consequence of the strong imperative the designers have to NOT break backwards compatibility when adding new language features to Java. And it is not unintuitive. (At least, I don't think so. When I first saw this construct, it was obvious to me what it meant.)
Now, if they were starting with a clean slate in 2018, a lot of details of the Java language design would be simpler and cleaner. But they don't have the luxury of doing that.
The documentation of Stream#toArray spells it out exactly:
The generator function takes an integer, which is the size of the desired array, and produces an array of the desired size.
for example:
IntFunction<int[]> factory = int[]::new;
// v--- once `apply(3)` is invoked, it delegates to `new int[3]`
int[] array = factory.apply(3);
// ^--- creates an int array of size 3, i.e. [0, 0, 0]
String[]::new is a method reference expression, and it must be assigned or cast to a certain functional interface type at compile time:
A method reference expression is used to refer to the invocation of a method without actually performing the invocation. Certain forms of method reference expression also allow class instance creation (§15.9) or array creation (§15.10) to be treated as if it were a method invocation.
A method reference expression is compatible in an assignment context, invocation context, or casting context with a target type T if T is a functional interface type (§9.8) and the expression is congruent with the function type of the ground target type derived from T.
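In other words (a minimal sketch), the very same expression is accepted or rejected purely based on the target type:
IntFunction<String[]> ok = String[]::new;  // target type is a functional interface: compiles

// Object o = String[]::new;               // does not compile: Object is not a functional interface,
//                                         // so there is no function type to be congruent with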
Edit
As @Eugene mentioned in the comments below, it's worth knowing how and where the stream creates a fixed-size array to collect all the elements.
The following outline shows how the stream calculates the array size:
sequential stream - AbstractSpinedBuffer#count
parallel stream
    stateless OPs with known/fixed size Spliterator - AbstractConcNode#AbstractConcNode
    stateful OPs
        fixed size Spliterator - Spliterator#estimateSize
        unknown size Spliterator - AbstractConcNode#AbstractConcNode
The following outline shows where the stream creates the fixed-size array via the array generator IntFunction:
sequential stream
    stateful/stateless OPs with unknown/fixed size Spliterator - SpinedBuffer#asArray
parallel stream
    stateless OPs with known/fixed size Spliterator - Nodes#flatten
    stateful OPs
        fixed size Spliterator - Nodes#collect
        unknown size Spliterator - Nodes#flatten
String[]::new
This is a method reference equivalent to a lambda for the following method:
public String[] create(int size) {
    return new String[size];
}
Your whole stream operation terminates by converting the stream into an array; that is what the last method, toArray(), does. But an array of what?
Of Strings (thus String[]::new).
The parameter of toArray(...) is a functional interface (namely IntFunction<R>), and String[]::new is the method reference (in this case, a constructor reference) that generates an array of the desired type.
See https://docs.oracle.com/javase/8/docs/api/java/lang/FunctionalInterface.html
And https://docs.oracle.com/javase/tutorial/java/javaOO/methodreferences.html
Adding to the answer of Andrew Tobilko:
"String[]::new means size -> new String[size]"
which, since toArray takes an IntFunction, is similar to:
IntFunction<String[]> generator = new IntFunction<String[]>() {
    @Override
    public String[] apply(int size) {
        return new String[size];
    }
};
To convert your stream to another List, you can use:
.collect(Collectors.toList());
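For example, the method from the question could collect into a List instead of an array; this is just a sketch, and the name inList is mine (it assumes java.util.List, java.util.Arrays, and java.util.stream.Collectors are imported):
public static List<String> inList(String[] array1, String[] array2) {
    return Arrays.stream(array1)
            .filter(str -> Arrays.stream(array2).anyMatch(s -> s.contains(str)))
            .distinct()
            .sorted()
            .collect(Collectors.toList()); // List<String> instead of String[]
}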
Related
Java 9 comes with convenience factory methods for creating immutable lists. Finally, list creation is as simple as:
List<String> list = List.of("foo", "bar");
But there are 12 overloaded versions of this method, 11 with 0 to 10 elements, and one with var args.
static <E> List<E> of(E... elements)
The same is the case with Set and Map.
Since there is a varargs method, what is the point of having the extra 11 methods?
What I think is that varargs create an array, so the other 11 methods can skip the creation of an extra object, and in most cases 0-10 elements will do. Is there any other reason for this?
From the JEP itself -
Description -
These will include varargs overloads, so that there is no fixed limit
on the collection size. However, the collection instances so created
may be tuned for smaller sizes. Special-case APIs (fixed-argument
overloads) for up to ten elements will be provided. While this
introduces some clutter in the API, it avoids array allocation,
initialization, and garbage collection overhead that is incurred by
varargs calls. Significantly, the source code of the call site is the same regardless of whether a fixed-arg or varargs overload is called.
Edit - To add the motivation, as already mentioned in the comments by @CKing too:
Non-Goals -
It is not a goal to support high-performance, scalable collections
with arbitrary numbers of elements. The focus is on small collections.
Motivation -
Creating a small, unmodifiable collection (say, a set) involves constructing it, storing it in a local variable, and invoking add() on it several times, and then wrapping it.
Set<String> set = Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "b", "c")));
The Java 8 Stream API can be used to construct small collections, by combining stream factory methods and collectors.
// Java 8
Set<String> set1 = Collections.unmodifiableSet(Stream.of("a", "b", "c").collect(Collectors.toSet()));
Much of the benefit of collection literals can be gained by providing library APIs for creating small collection instances, at significantly reduced cost and risk compared to changing the language. For example, the code to create a small Set instance might look like this:
// Java 9
Set<String> set2 = Set.of("a", "b", "c");
You may find the following passage of item 42 of Josh Bloch's Effective Java (2nd ed.) enlightening:
Every invocation of a varargs method causes an array allocation and initialization. If you have determined empirically that you can’t afford this cost but you need the flexibility of varargs, there is a pattern that lets you have your cake and eat it too. Suppose you’ve determined that 95 percent of the calls to a method have three or fewer parameters. Then declare five overloadings of the method, one each with zero through three ordinary parameters, and a single varargs method for use when the number of arguments exceeds three [...]
As you suspected, this is a performance enhancement. Varargs methods create an array "under the hood", and having methods that take 0-10 arguments directly avoids this redundant array creation.
You can also look at it the other way around. Since varargs methods can accept arrays, such a method would serve as an alternative means to convert an array to a List.
String[] strArr = new String[]{"1", "2"};
List<String> list = List.of(strArr);
The alternative to this approach is to use Arrays.asList, but any changes made to the List in that case would be reflected in the array, which is not the case with List.of. You can therefore use List.of when you don't want the List and the array to be in sync, as sketched below.
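A small sketch of that difference (variable names are mine):
String[] strArr = new String[]{"1", "2"};

List<String> view = Arrays.asList(strArr); // fixed-size view backed by the array
List<String> copy = List.of(strArr);       // immutable copy of the elements

strArr[0] = "changed";
System.out.println(view.get(0)); // "changed" - the view reflects the array
System.out.println(copy.get(0)); // "1"       - the copy is unaffected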
Note: The justification given in the spec seems like a micro-optimization to me. (This has now been confirmed by the owner of the API himself in the comments to another answer.)
This pattern is used to optimize methods that accept varargs parameters.
If you can determine that most of the time you use only a couple of them, you will probably want to define method overloads for the most commonly used numbers of parameters:
public void foo(int num1);
public void foo(int num1, int num2);
public void foo(int num1, int num2, int num3);
public void foo(int... nums);
This helps you avoid the array creation that calling a varargs method incurs. The same pattern is used in the JDK for performance optimization:
List<String> list = List.of("foo", "bar");
// Delegates call here
static <E> List<E> of(E e1, E e2) {
return new ImmutableCollections.List2<>(e1, e2); // Constructor with 2 parameters, varargs avoided!
}
A more interesting detail is that, starting from 3 parameters, the implementation delegates to a varargs constructor again:
static <E> List<E> of(E e1, E e2, E e3) {
return new ImmutableCollections.ListN<>(e1, e2, e3); // varargs constructor
}
This seems strange for now, but my guess is that it is reserved for future improvements, for example potentially adding dedicated implementations such as List3 (3 params), List7 (7 params), and so on.
As per Java doc: The collections returned by the convenience factory methods are more space efficient than their mutable equivalents.
Before Java 9:
Set<String> set = new HashSet<>(3); // 3 buckets
set.add("Hello");
set.add("World");
set = Collections.unmodifiableSet(set);
In the above implementation of Set, six objects are created: the unmodifiable wrapper; the HashSet, which contains a HashMap; the table of buckets (an array); and two Node instances (one for each element). If a VM takes 12 bytes of overhead per object, then 72 bytes are consumed as overhead, plus 28*2 = 56 bytes for the 2 elements. A large share of the memory is consumed by overhead compared to the data stored in the collection. In Java 9 this overhead is much smaller.
After Java 9:
Set<String> set = Set.of("Hello", "World");
In the above implementation of Set, only one object is created, and it takes much less space to hold the data because of the minimal overhead.
I can't find out anywhere whether there's some kind of collection where I can get items with:
SpecialArray specialArray = new SpecialArray();
specialArray.put("first", someValue);
specialArray.put("second", otherValue);
// and then:
Object obj = specialArray["first"];
// or:
specialArray["second"] = anotherValue;
It's a little like HashMap<String, Object>, but with a HashMap I can only
get a value with map.get(String) and update a value with map.put(String, Object).
No, this syntax is not available in Java. You can get the functionality from Map using a different syntax.
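For example, a plain Map gets you the same effect with method-call syntax; here is a sketch reusing the (hypothetical) value names from the question, assuming java.util.Map and java.util.HashMap are imported:
Map<String, Object> specialArray = new HashMap<>();
specialArray.put("first", someValue);
specialArray.put("second", otherValue);

Object obj = specialArray.get("first");   // instead of specialArray["first"]
specialArray.put("second", anotherValue); // instead of specialArray["second"] = anotherValue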
If this is not enough, use another language.
Subscripting using <array-reference-expression> [ <expression> ] in Java is only available for genuine array objects. As JLS 15.10.3 says:
"The type of the array reference expression must be an array type (call it T[], an array whose components are of type T), or a compile-time error occurs.
That's it. No exceptions, workarounds or clever hacks.
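To make that rule concrete, here is a minimal sketch (the exact wording of the compiler error may vary):
String[] arr = {"a", "b"};
String first = arr[0];        // fine: arr has an array type

List<String> list = Arrays.asList("a", "b");
// String oops = list[0];     // compile-time error: array required, but List<String> found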
I am quite puzzled about the Collectors.toList() and Collectors.toSet() static methods. These two methods do not take in any parameters. So how do they know what types of Collector to return?
For example, if we have this line:
Collectors.toList();
The returned Collector is Collector<Object,?,List<Object>>.
If we have this line:
Collector<Integer,?,List<Integer>> c = Collectors.toList();
Then Collectors.toList() will return a Collector<Integer,?,List<Integer>>. Without taking in any input parameters, how does the toList() method know that it needs to return a Collector<Integer,?,List<Integer>>?
Perhaps a code sample showing how toList() is written would help my understanding.
Thanks in advance.
This behavior comes from target typing in generic type inference.
The Java compiler takes advantage of target typing to infer the type parameters of a generic method invocation. The target type of an expression is the data type that the Java compiler expects depending on where the expression appears.
For example:
// v--- the generic parameter `T` is inferred by the target type
Collector<Integer,?,List<Integer>> c = Collectors.toList();
// v--- with no target type, the type parameter is inferred as `Object`
Collectors.toList();
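Since the question also asked what toList() roughly looks like, here is a sketch of its shape; the real JDK source uses an internal CollectorImpl class instead of Collector.of, so treat this as an approximation:
// needs java.util.ArrayList, java.util.List, java.util.function.Supplier, java.util.stream.Collector
public static <T> Collector<T, ?, List<T>> toList() {
    return Collector.of(
            (Supplier<List<T>>) ArrayList::new,                    // create the container
            List::add,                                             // fold each element into it
            (left, right) -> { left.addAll(right); return left; }  // merge partial results
    );
}
The type parameter T is bound by whatever the target type requires, which is why no method parameter is needed.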
Suppose you wanted to collect a list of even integers into another list.
You can write this:
// assume that integerList contains ints between 1 and 20
List<Integer> evenInts = integerList.stream().filter(x -> x % 2 == 0)
.collect(Collectors.toList());
collect is a terminal operation on the stream, and it is expected to return a type bound by R, which in this case translates to List<Integer>. This is how it is able to collect your elements appropriately.
You're encouraged here to peruse more information about Collector, as it's part of the newer Java 8 Stream API and can be a bit curious to get into once you start.
In Java we may create IntFunction<String[]> from 1D array constructor reference:
// both do the same thing
IntFunction<String[]> createArrayL = size -> new String[size];
IntFunction<String[]> createArrayMR = String[]::new;
Now I wonder why we cannot do this with a 2D array:
BiFunction<Integer, Integer, String[][]> createArray2DL =
(rows, cols) -> new String[rows][cols];
// error:
BiFunction<Integer, Integer, String[][]> createArray2DMR =
String[][]::new;
Of course we may write:
IntFunction<String[][]> createArray2DInvalidL = String[][]::new;
System.out.println(createArray2DInvalidL.apply(3)[0]); // prints null
but this will behave differently than:
new String[3][3]
because row arrays will not be initialized.
So my question is: why doesn't String[][]::new work for 2D arrays? (To me it looks like an inconsistency in the language design.)
Quite an interesting case indeed.
The problem is that String[][]::new is a function with an arity of 1 (it's a constructor of an array of arrays) and can't be treated as a BiFunction (arity of 2), while your example new String[3][3] takes two parameters instead of one.
In this case,
createArray2DInvalidL.apply(3)
is equivalent to calling new String[3][];
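To make that concrete, here is a small sketch (reusing createArray2DInvalidL from the question) that initializes the rows yourself:
String[][] grid = createArray2DInvalidL.apply(3); // same as new String[3][] - each row is null
for (int i = 0; i < grid.length; i++) {
    grid[i] = new String[3];                      // allocate each row explicitly
}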
What you might be looking for is:
IntFunction<String[][]> createArray2D = n -> new String[n][n];
The dimensions don't need to have equal lengths; using n for both here is simply a reasonable assumption.
http://4comprehension.com/multidimensional-arrays-vs-method-references/
There is no inconsistency here. If you write a statement like
IntFunction<ElementType[]> f = ElementType[]::new;
you create a function whose evaluation will return a new array, with each entry capable of holding a reference of ElementType and initialized to null. This doesn't change when you use String[] for ElementType.
But it has also been addressed explicitly in The Java Language Specification, §15.13.3, Run-Time Evaluation of Method References:
If the form is Type[]k :: new (k ≥ 1), then the body of the invocation method has the same effect as an array creation expression of the form new Type [ size ] []k-1, where size is the invocation method’s single parameter. (The notation []k indicates a sequence of k bracket pairs.)
There is no support for a rectangular multi-dimensional array creation method reference, most likely because there is no actual use case that acted as a driving force. The one-dimensional array creation expression can be used together with Stream.toArray(…), allowing a more concise syntax than the equivalent lambda expression, even though there is no special support in the underlying architecture, i.e. int[]::new produces exactly the same compiled code as intArg -> new int[intArg]. There is no similar use case for a two- (or more) dimensional array creation, so there isn't even a similar functional interface for a function consuming two or more int values and producing a reference type result.
Can you please let me know in which version of Java the flower bracket ({}) syntax below was introduced? What is the name of this concept?
Object[] arg = {abc.getAbctNumber()};
Here abc is an object of a Java class and getAbctNumber() is a Java method. I understand that the arg array will be assigned the return value of the getAbctNumber() method.
{} is used to specify an array literal. So in your case you're specifying an array of objects with one element.
There is no such thing as a "flower bracket" in Java. What you are seeing here is an array being populated with the result of a method call.
You are creating an array with this syntax, similar to:
int myarray[] = {1, 2, 3};
which creates an array of three ints. Your array will be created with a single Object element.
This looks like a list initializer (not sure about the terminology, I don't do a lot of Java). In this case arg is an array of type Object and it's being initialized with a single value, which is the result of abc.getAbctNumber().
Consider an initializer with more than one value and it starts to become more clear:
Object[] arg = {
abc.getAbctNumber(),
abc.getSomeOtherNumber(),
abc.getSomethingElse()
};
That would initialize the arg array with three elements, the results of three different methods.
There is nothing called a "flower bracket" (at least, I don't know of one). In your Object[] arg = {abc.getAbctNumber()};, the {} represents an array of one element, that element being the Object returned by the method getAbctNumber().