I was playing with the question Using Java 8's Optional with Stream::flatMap: I wanted to add a method to a custom Optional<T> and then check whether it worked.
More precisely, I wanted to add a stream() method to my CustomOptional<T> that returns an empty stream if no value is present, or a stream with a single element if one is present.
However, I came to the conclusion that Optional<T> is declared as final.
Why is this so? There are loads of classes that are not declared as final, and I personally do not see a reason here to declare Optional<T> final.
As a second question: if the worry is that the methods would be overridden, why not declare all the methods final and leave the class itself non-final?
According to this page of the Java SE 8 API docs, Optional<T> is a value-based class. According to this page of the API docs, value-based classes have to be immutable.
Declaring all the methods in Optional<T> final would prevent them from being overridden, but it would not prevent an extending class from adding fields and methods. Extending the class and adding a field, together with a method that changes that field's value, would make the subclass mutable and hence would allow the creation of a mutable Optional<T>. The following is an example of such a subclass, which could be created if Optional<T> were not declared final.
// Example created by @assylias
public class Sub<T> extends Optional<T> {
    private T t;

    public void set(T t) {
        this.t = t;
    }
}
Declaring Optional<T> final prevents the creation of subclasses like the one above and hence guarantees that Optional<T> is always immutable.
As others have stated, Optional is a value-based class, and since value-based classes must be immutable, the class needs to be final.
But there is a point we missed. One of the main reasons value-based classes are immutable is to guarantee thread safety: making a class immutable makes it thread-safe. Take, for example, String or primitive wrappers like Integer and Float; they are declared final for similar reasons.
Probably, the reason is the same as why String is final; that is, so that all users of the Optional class can be assured that the methods on the instance they receive keep to their contract of always returning the same value.
Though we cannot extend the Optional class, we can create our own wrapper class:
public final class Opt {

    private Opt() {
    }

    public static <T> Stream<T> filledOrEmpty(T t) {
        return Optional.ofNullable(t).isPresent() ? Stream.of(t) : Stream.empty();
    }
}
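For example, a quick usage sketch (assuming the Opt class above is in scope):

Stream<String> present = Opt.filledOrEmpty("hello"); // stream with one element
Stream<String> absent = Opt.filledOrEmpty(null);     // empty stream
System.out.println(present.count()); // prints 1
System.out.println(absent.count());  // prints 0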
Hope this helps!
I've been using lambdas and method references in Java 8 for a while and there is this one thing I do not understand. Here is the example code:
Set<Integer> first = Collections.singleton(1);
Set<Integer> second = Collections.singleton(2);
Set<Integer> third = Collections.singleton(3);
Stream.of(first, second, third)
    .flatMap(Collection::stream)
    .map(String::valueOf)
    .forEach(System.out::println);

Stream.of(first, second, third)
    .flatMap(Set::stream)
    .map(String::valueOf)
    .forEach(System.out::println);
The two stream pipelines do the same thing: they print the three numbers, one per line. The difference is in the second line of each pipeline. It seems you can simply substitute any class name in the inheritance hierarchy, as long as it has the method (the Collection interface declares the default method stream(), which the Set interface does not redefine).
I tried out what happens if the method is redefined again and again, using these classes:
private static class CustomHashSet<E> extends HashSet<E> {
    @Override
    public Stream<E> stream() {
        System.out.println("Changed method!");
        return StreamSupport.stream(spliterator(), false);
    }
}
private static class CustomCustomHashSet<E> extends CustomHashSet<E> {
    @Override
    public Stream<E> stream() {
        System.out.println("Changed method again!");
        return StreamSupport.stream(spliterator(), false);
    }
}
After changing the first, second and third assignments to use these classes, I could replace the method references (CustomCustomHashSet::stream), and, not surprisingly, the debugging messages were printed in all cases, even when I used Collection::stream. It seems you cannot invoke a superclass's overridden method with a method reference.
Is there any runtime difference? What is the better practice, refer to the top level interface/class or use the concrete, known type (Set)?
Thanks!
Edit:
Just to be clear, I know about inheritance and the LSP; my confusion relates to the design of method references in Java 8. My first thought was that changing the class in a method reference would change the behavior, i.e. that it would invoke the super method from the chosen class, but as the tests showed, it makes no difference. Changing the type of the created instances does change the behavior.
Even method references have to respect the OOP principle of method overriding. Otherwise, code like
public static List<String> stringify(List<?> o) {
    return o.stream().map(Object::toString).collect(Collectors.toList());
}
would not work as expected.
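For instance, a small usage sketch (the mixed-element list is just for illustration):

List<String> labels = stringify(Arrays.asList(1, 2.5, "three"));
System.out.println(labels); // prints [1, 2.5, three], using each element's own toString()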
As to which class name to use for the method reference: I prefer to use the most general class or interface that declares the method.
The reason is this: you write your method to process a collection of Set. Later on, you see that your method might also be useful for a collection of Collection, so you change your method signature accordingly. Now, if the code within your method references Set methods, you will have to adjust those method references too:
From
public static <T> void test(Collection<Set<T>> data) {
    data.stream().flatMap(Set::stream).forEach(e -> System.out.println(e));
}
to
public static <T> void test(Collection<Collection<T>> data) {
    data.stream().flatMap(Collection::stream).forEach(e -> System.out.println(e));
}
you need to change the method body too, whereas if you had written your method as
public static <T> void test(Collection<Set<T>> data) {
    data.stream().flatMap(Collection::stream).forEach(e -> System.out.println(e));
}
you will not have to change the method body.
A Set is a Collection. Collection has a stream() method, so Set has that same method too, as do all Set implementations (e.g. HashSet, TreeSet, etc.).
Identifying the method as belonging to any particular supertype makes no difference, as it will always resolve to the actual method declared by the implementation of the object at runtime.
See the Liskov Substitution Principle:
if S is a subtype of T, then objects of type T may be replaced with objects of type S without altering any of the desirable properties of that program
Although it is possible to serialize a lambda in Java 8, it is strongly discouraged; even serializing inner classes is discouraged. The reason given is that lambdas may not deserialize properly on another JRE. However, doesn't this mean that there is a way to safely serialize a lambda?
For example, say I define a class to be something like this:
public class MyClass {
    private String value;
    private Predicate<String> validateValue;

    public MyClass(String value, Predicate<String> validate) {
        this.value = value;
        this.validateValue = validate;
    }

    public void setValue(String value) {
        if (!validateValue.test(value)) throw new IllegalArgumentException();
        this.value = value;
    }

    public void setValidation(Predicate<String> validate) {
        this.validateValue = validate;
    }
}
If I declared an instance of the class like this, I should not serialize it:
MyClass obj = new MyClass("some value", (s) -> !s.isEmpty());
But what if I made an instance of the class like this:
// Could even be a static nested class
public class IsNonEmpty implements Predicate<String>, Serializable {
    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}
MyClass isThisSafeToSerialize = new MyClass("some string", new IsNonEmpty());
Would this now be safe to serialize? My instinct says that yes, it should be safe, since there's no reason that interfaces in java.util.function should be treated any differently from any other random interface. But I'm still wary.
It depends on which kind of safety you want. It's not the case that serialized lambdas cannot be shared between different JREs. They have a well-defined persistent representation, the SerializedLambda. When you study how it works, you'll find that it relies on the presence of the defining class, which will have a special method that reconstructs the lambda.
What makes it unreliable is the dependency on compiler-specific artifacts, e.g. the synthetic target method, which has a generated name, so simple changes like the insertion of another lambda expression, or recompiling the class with a different compiler, can break compatibility with existing serialized lambda expressions.
However, using manually written classes isn't immune to this. Without an explicitly declared serialVersionUID, the default algorithm calculates an id by hashing class artifacts, including private and synthetic ones, which adds a similar compiler dependency. So the minimum to do, if you want a reliable persistent form, is to declare an explicit serialVersionUID.
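For illustration, here is the IsNonEmpty class from the question with an explicit serialVersionUID (a minimal sketch; the constant's value is arbitrary, but it must stay fixed once instances have been serialized):

public class IsNonEmpty implements Predicate<String>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}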
Or you turn to the most robust form possible:
public enum IsNonEmpty implements Predicate<String> {
    INSTANCE;

    @Override
    public boolean test(String s) {
        return !s.isEmpty();
    }
}
Serializing this constant does not store any properties of the actual implementation, besides its class name (and the fact that it is an enum, of course) and a reference to the name of the constant. Upon deserialization, the actual unique instance of that name will be used.
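A hypothetical round trip illustrating this (the file name is arbitrary, and the surrounding method would have to declare IOException and ClassNotFoundException):

try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("pred.ser"))) {
    out.writeObject(IsNonEmpty.INSTANCE);
}
try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("pred.ser"))) {
    Predicate<String> p = (Predicate<String>) in.readObject();
    System.out.println(p == IsNonEmpty.INSTANCE); // true: the unique enum instance is returned
}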
Note that serializable lambda expressions may create security issues, because they open an alternative way of getting hold of an object that allows invoking the target methods. However, this applies to all serializable classes, as all variants shown in your question and this answer allow deliberately deserializing an object that can invoke the encapsulated operation. But with explicitly serializable classes, the author is usually more aware of this fact.
I have many instances of
Foo.a()
but now I want to split up calls to a() based on certain criteria. If possible I would like to keep the Foo.a() calls unchanged. Instead, perhaps Foo could become a factory that manages the flow and FooA and FooB could extend Foo. For example, in Foo:
private static Class<?> foo;
private static Foo o;

static {
    if (certain_criteria) {
        foo = SomeUtil.getClass("FooA");
    } else {
        foo = FooB.class;
    }
    try {
        o = (Foo) foo.newInstance();
    } catch (InstantiationException | IllegalAccessException e) {
        throw new ExceptionInInitializerError(e);
    }
}
...
public static void a() {
    o.a(); // and this should call either FooA.a() or FooB.a(),
           // but a() should be accessed in a static way
}
I can't make a() in Foo non-static, because then I'd have to change the 100+ calls to Foo.a() throughout the project. Is there a way around this? Or a better way to handle the flow?
I also tried to use foo to call a(), but that gives a compiler error because it is of type Class<?>. If I change it to
Class<Foo>
then I get
Type mismatch: cannot convert from Class<FooB> to Class<Foo>
You propose using static method Foo.a() as a facade over selecting and invoking an appropriate implementation, in a configurable manner chosen by class Foo. Your specific idea seems to rely on subclasses of Foo to implement the Strategy pattern for supporting Foo.a().
You are conflating at least two separable pieces to this:
the strategy for implementing Foo.a(), and
the mechanism by which a specific strategy is chosen and instantiated.
In particular, although you may have reason to want to use subclasses of Foo to represent your strategies in the real code, no such reason is apparent in your example code. Schematically, then, you seem to want something like this:
public class Foo {
    private static FooStrategy strategy = FooStrategyFactory.createStrategy();

    public static void a() {
        strategy.doA();
    }
}

interface FooStrategy {
    void doA();
}
You don't need to go all the way there, of course. Your original idea was basically to let Foo itself serve in the place of FooStrategy, and to let a static initializer serve instead of a separate FooStrategyFactory. There's nothing inherently wrong with that; I just pull it apart to more clearly show what role each bit serves.
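One possible shape for that factory, as a sketch (FooStrategyFactory, FooStrategyA and FooStrategyB are illustrative names; the system property stands in for whatever the real criteria are):

final class FooStrategyFactory {
    private FooStrategyFactory() {
    }

    static FooStrategy createStrategy() {
        // stand-in for the real "certain_criteria" check
        return Boolean.getBoolean("foo.useA") ? new FooStrategyA() : new FooStrategyB();
    }
}

class FooStrategyA implements FooStrategy {
    @Override
    public void doA() { System.out.println("A"); }
}

class FooStrategyB implements FooStrategy {
    @Override
    public void doA() { System.out.println("B"); }
}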
You also expressed some specific implementation issues:
If I change it to Class<Foo> then I get
Type mismatch: cannot convert from Class<FooB> to Class<Foo>
The equivalent in my scheme above would be declaring a variable of type Class<FooStrategy>, and attempting to assign to it a Class<FooStrategyA> representing a class that implements FooStrategy. The correct type for a Class object that may represent any class whose instances are assignment-compatible with type FooStrategy is Class<? extends FooStrategy>. That works whether FooStrategy itself is a class or an interface.
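In code, that looks like this (a sketch; FooStrategyA is again an illustrative name, and the reflective calls declare several checked exceptions that the caller must handle):

Class<? extends FooStrategy> type = FooStrategyA.class; // compiles fine
FooStrategy strategy = type.getDeclaredConstructor().newInstance();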
I can't call any methods from Foo on foo. "The method a() is undefined for the type Class"
You seem to have been saying that you could not invoke static methods of class Foo on an object of type Class<? extends Foo>. And indeed, you can't. Objects of class Class have only the methods of class Class. Although you can use them to reflectively invoke methods of the classes they represent, such methods are not accessible directly via the Class instance itself. That issue does not arise directly in the scheme I presented, but it could arise in the factory or strategy implementations.
Moreover, static methods are not virtual. They are bound at compile time, based on the formal type of the reference expressions on which they are invoked. To apply the strategy pattern correctly, the strategy implementation methods must be virtual: non-private and non-static.
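A small demonstration of that last point (Base and Derived are hypothetical classes):

class Base {
    static void a() { System.out.println("Base.a"); }
}

class Derived extends Base {
    static void a() { System.out.println("Derived.a"); }
}

Base b = new Derived();
b.a(); // prints "Base.a": the call is bound at compile time to the declared type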
Consider the following example code
class MyClass {
    public String var = "base";

    public void printVar() {
        System.out.println(var);
    }
}

class MyDerivedClass extends MyClass {
    public String var = "derived";

    public void printVar() {
        System.out.println(var);
    }
}

public class Binding {
    public static void main(String[] args) {
        MyClass base = new MyClass();
        MyClass derived = new MyDerivedClass();
        System.out.println(base.var);
        System.out.println(derived.var);
        base.printVar();
        derived.printVar();
    }
}
it gives the following output
base
base
base
derived
Method calls are resolved at runtime, and the correct overridden method is called, as expected.
Field access, instead, is resolved at compile time, as I later learned.
I was expecting an output as
base
derived
base
derived
because in the derived class the re-definition of var shadows the one in the base class.
Why does the binding of variables happen at compile time and not at runtime? Is this only for performance reasons?
The reason is explained in the Java Language Specification in an example in Section 15.11, quoted below:
...
The last line shows that, indeed, the field that is accessed does not depend on the run-time class of the referenced object; even if s holds a reference to an object of class T, the expression s.x refers to the x field of class S, because the type of the expression s is S. Objects of class T contain two fields named x, one for class T and one for its superclass S.
This lack of dynamic lookup for field accesses allows programs to be run efficiently with straightforward implementations. The power of late binding and overriding is available, but only when instance methods are used...
So yes, performance is a reason. The specification of how a field access expression is evaluated is stated as follows:
If the field is not static:
...
If the field is a non-blank final, then the result is the value of the named member field in type T found in the object referenced by the value of the Primary.
where Primary in your case refers to the variable derived, which is of type MyClass.
Another reason, as @Clashsoft suggested, is that in subclasses, fields are not overridden, they are hidden. So it makes sense to select which field to access based on the declared type, or by using a cast. This is also true for static methods. That is why the field is determined based on the declared type, unlike overriding of instance methods, where it depends on the actual type. The JLS quote above indeed mentions this reason implicitly:
The power of late binding and overriding is available, but only when instance methods are used.
While you might be right about performance, there is another reason why fields are not dynamically dispatched: You wouldn't be able to access the MyClass.var field at all if you had a MyDerivedClass instance.
Generally, I don't know of any statically typed language that actually has dynamic variable resolution. But if you really need it, you can write getters or accessor methods (which should be done in most cases anyway, to avoid public fields):
class MyClass
{
    private String var = "base";

    public String getVar() // or simply 'var()'
    {
        return this.var;
    }
}

class MyDerivedClass extends MyClass {
    private String var = "derived";

    @Override
    public String getVar() {
        return this.var;
    }
}
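With the accessors in place, dynamic dispatch gives the behavior you originally expected (a brief usage sketch):

MyClass derived = new MyDerivedClass();
System.out.println(derived.getVar()); // prints "derived": getVar() is dispatched on the runtime type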
The polymorphic behaviour of the Java language works with methods, not member variables: the language was designed to bind member variables at compile time.
In Java, this is by design. Making fields dynamically resolved would make things run a bit slower, and in practice there is no reason to do so, since you can make your fields private in any class and access them through methods, which are dynamically resolved. So fields are resolved at compile time instead :)
this and super are keywords, aren't they? Then how can I use them for passing arguments to constructors the same way as with a method?
In short, how is it that both can show such distinct behaviours?
You are correct that both this and super are keywords. The Java language specification defines explicitly how they must behave. The short answer is that these keywords behave specially because the specification says they must.
According to the specification, this can be used as a primary expression (only in certain places) or in an explicit constructor invocation.
The keyword this may be used only in the body of an instance method, instance initializer or constructor, or in the initializer of an instance variable of a class. If it appears anywhere else, a compile-time error occurs.
So you can use this as an argument to a function to pass a reference to the current object. However, note that you cannot use super in the same way, since it is not a primary expression:
public class Program
{
    void test(Program p) {}

    void run() { test(super); }

    public static void main(String[] args)
    {
        new Program().run();
    }
}
Result:
Program.java:5: '.' expected
void run() { test(super); }
You can use super.foo, though, because this form is defined in §15.11 to be valid:
FieldAccess:
    Primary . Identifier
    super . Identifier
    ClassName . super . Identifier
The specification also puts restrictions on how super can be used:
The special forms using the keyword super are valid only in an instance method, instance initializer or constructor, or in the initializer of an instance variable of a class; these are exactly the same situations in which the keyword this may be used (§15.8.3).
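So while super by itself cannot be passed around, the super . Identifier form works in exactly those contexts, e.g. (a minimal sketch):

class A {
    void greet() { System.out.println("A"); }
}

class B extends A {
    @Override
    void greet() {
        super.greet(); // valid: invokes A's implementation
        System.out.println("B");
    }
}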
The Java language provides specific handling for these two keywords and they are allowed in limited contexts.
Invoking this(...) will result in bytecode that will invoke the corresponding constructor on the current class, while invoking super(...) will result in bytecode that will invoke the corresponding constructor on the supertype.
Java provides special handling for these because their binding differs from that of normal methods (i.e., you want to avoid dynamic invocation, or you would never manage to reach the constructor of the supertype).
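A minimal sketch of both explicit constructor invocations (Point and Point3D are illustrative classes):

class Point {
    private final int x, y;

    Point() {
        this(0, 0); // invokes this class's two-arg constructor
    }

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class Point3D extends Point {
    private final int z;

    Point3D(int x, int y, int z) {
        super(x, y); // invokes the superclass constructor
        this.z = z;
    }
}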
Every language has to deal with this problem. In C++, for example, you explicitly specify the name of the parent method instead of using super.
I'm not entirely sure what you're asking here, but this is the way the language is designed. The compiler knows that when you do this in a constructor:
super("I'm a string!", 32);
it should find a constructor in the superclass that takes a String and an int as parameters.
From your subclass you can pass the provided variables up to your parent. Examples are better than long explanations, so here's a fairly generic example of extending the Exception class for your own use:
public class MyException extends Exception {
    public MyException()
    {
    }

    public MyException(String message)
    {
        super(message);
    }

    public MyException(String string, Throwable e)
    {
        super(string, e);
    }
}
how is it that these can show such distinct behaviours
The question doesn't make sense. All keywords have distinct behaviours. That's what they're for.
I am not sure what the doubt is here. The keyword this refers to the current instance; also, when called like this(arg), it invokes the corresponding constructor in the current class. Similarly, when super(arg) is called, it invokes the corresponding constructor in the super class. They can be called only from a constructor.
According to Wikipedia, this and super are keywords, which is how they get away with all their magic, I suppose.