Two methods with the same name in Java

I noticed that if I have two methods with the same name, where the first one accepts SomeObject and the second one accepts an object extending SomeObject, then when I call the method with SomeOtherObject, it automatically uses the one that only accepts SomeObject. If I cast SomeOtherObject to SomeObject, the method that accepts SomeObject is used, even if the object is an instanceof SomeOtherObject. This means the method is selected when compiling. Why?

That's how method overload resolution in Java works: the method is selected at compile time.
For all of the ugly details, see the Java Language Specification §15.12.
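As a minimal sketch of the behaviour described in the question (the class OverloadDemo and the method name handle are made up for illustration; SomeObject and SomeOtherObject are the asker's names), the overload is picked from the static type of the argument expression, not from the runtime type of the instance:

```java
class OverloadDemo {
    static class SomeObject {}
    static class SomeOtherObject extends SomeObject {}

    // Two overloads; the compiler picks one per call site.
    static String handle(SomeObject o) { return "handle(SomeObject)"; }
    static String handle(SomeOtherObject o) { return "handle(SomeOtherObject)"; }

    public static void main(String[] args) {
        SomeOtherObject other = new SomeOtherObject();
        SomeObject viewedAsBase = other; // same instance, broader static type

        System.out.println(handle(other));              // handle(SomeOtherObject)
        System.out.println(handle(viewedAsBase));       // handle(SomeObject)
        System.out.println(handle((SomeObject) other)); // handle(SomeObject)
        System.out.println(viewedAsBase instanceof SomeOtherObject); // true
    }
}
```

The last two calls pass the very same instance, yet the SomeObject overload is chosen because that is the compile-time type of the expression.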

This means the method is selected when compiling.
Yes you are correct. That is what it means.
Why?
I can think of four reasons why they designed Java this way:
This is consistent with the way that other statically typed OO languages that support overloading work. It is what people who come (or came) from the C++ world expect. (This was particularly important in the early days of Java, though not so much now.) It is worth noting that C# handles overloading the same way.
It is efficient. Resolving method overloads at runtime (based on actual argument types) would make overloaded method calls expensive.
It gives more predictable (and therefore easier to understand) behaviour.
It avoids the Brittle Base Class problem, where adding a new overloaded method in a base class causes unexpected problems in existing derived classes.
References:
http://blogs.msdn.com/b/ericlippert/archive/2004/01/07/virtual-methods-and-brittle-base-classes.aspx

Yes, the function to be executed is decided at compile time! The compiler has no idea of the actual runtime type of the object. It only knows the declared type of the reference that points to the object given as an argument to the function.
For more details, you can look into the section "Choosing the Most Specific Method" in the Java Language Specification.
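To make the "most specific method" idea concrete, here is a small sketch (MostSpecificDemo and describe are hypothetical names). When several overloads are applicable, the compiler chooses the one whose parameter type is the most specific for the declared type of the argument:

```java
class MostSpecificDemo {
    static String describe(Object o) { return "Object"; }
    static String describe(CharSequence cs) { return "CharSequence"; }
    static String describe(String s) { return "String"; }

    public static void main(String[] args) {
        String text = "hi";
        CharSequence seq = text; // same object, declared as CharSequence
        Object obj = text;       // same object, declared as Object

        System.out.println(describe(text)); // String
        System.out.println(describe(seq));  // CharSequence
        System.out.println(describe(obj));  // Object
        // StringBuilder is a CharSequence but not a String.
        System.out.println(describe(new StringBuilder("hi"))); // CharSequence
    }
}
```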

Related

Java. How to prevent assigning a method reference to functional interface type argument?

Beginning with Java SE 8, if the formal parameter of a method is a functional interface, the argument can be either an object implementing that interface or a reference to some method. It means that the argument can also be a reference to a method that is not logically related to the interface's purpose. Is it possible to force the argument to be only an object implementing the interface, but not to be a method reference? Although it is possible to make the interface non-functional by adding a second abstract method, that additional method nevertheless should be implemented. Is there another way?
Is it possible to force the argument to be only an object implementing the interface, but not to be a method reference?
There is not.
Although it is possible to make the interface non-functional by adding a second abstract method, that additional method nevertheless should be implemented. Is there another way?
Indeed, that's the downside of doing that. You can't provide a default implementation, as an interface that has exactly 1 non-defaulted method is considered a FunctionalInterface, and you can't decree it to be not so.
What you can do, however, is turn that interface into an abstract class; abstract classes aren't eligible for being supplied in lambda/method-reference form.
More generally, don't fight java features. If someone uses a method ref, they know what they are doing. Or they don't, but if they are just stumbling about without a clue, trust me, you can't stop an idiot from ruining a code base by designing good APIs and adding every linter rule you manage to scrounge together. Idiots are far too inventive to be stopped by mere mortals.
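A rough sketch of the abstract-class workaround mentioned above, with made-up names (LambdaTargetDemo, NameCheck, StrictNameCheck, runLoose, runStrict). It only shows that a method reference is accepted for the interface but not for the abstract class; callers of the abstract-class version have to write a subclass:

```java
class LambdaTargetDemo {
    // Functional interface: any lambda or method reference of matching shape fits.
    interface NameCheck {
        boolean accept(String name);
    }

    // Abstract class with one abstract method: not a valid lambda target.
    abstract static class StrictNameCheck {
        abstract boolean accept(String name);
    }

    static boolean runLoose(NameCheck check, String name) { return check.accept(name); }
    static boolean runStrict(StrictNameCheck check, String name) { return check.accept(name); }

    public static void main(String[] args) {
        // Compiles: an unrelated method reference happens to fit the interface.
        System.out.println(runLoose(String::isEmpty, "")); // true

        // Would not compile if uncommented, because StrictNameCheck
        // is not a functional interface:
        // runStrict(String::isEmpty, "");

        // Callers must supply a real subclass (named or anonymous).
        StrictNameCheck nonEmpty = new StrictNameCheck() {
            @Override
            boolean accept(String name) { return !name.isEmpty(); }
        };
        System.out.println(runStrict(nonEmpty, "Ada")); // true
    }
}
```

Note that an anonymous subclass can still delegate to any method it likes, so this restricts the syntax callers may use rather than what their code ends up doing.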

Can I insert instructions in constructors before calling this() / super() and before initialising any final fields?

Preface
I have been experimenting with ByteBuddy and ASM, but I am still a beginner in ASM and between beginner and advanced in ByteBuddy. This question is about ByteBuddy and about JVM bytecode limitations in general.
Situation
I had the idea of creating global mocks for testing by instrumenting constructors in such a way that instructions like these are inserted at the beginning of each constructor:
if (GlobalMockRegistry.isMock(getClass()))
return;
FYI, the GlobalMockRegistry basically wraps a Set<Class<?>>, and if that set contains a certain class, then isMock(Class<?> clazz) returns true. The advantage of that concept is that I can (de)activate global mocking for each class at runtime, because if multiple tests run in the same JVM process, one test might need a certain global mock while the next one might not.
What the if(...) return; instructions above want to achieve is that if mocking is active, the constructor should not do anything:
no this() or super() calls, → update: impossible
no field initialisations, → update: possible
no other side effects. → update: might be possible, see my update below
The result would be an object with uninitialised fields that did not create any (possibly expensive) side effects such as resource allocation (database connection, file creation, you name it). Why would I want that? Could I not just create an instance with Objenesis and be happy? Not if I want a global mock, i.e. mock objects I cannot inject because they are created somewhere inside methods or field initialisers I do not have control over. Please do not worry about what method calls on such an object would do if its instance fields are not properly initialised. Just assume I have instrumented the methods to return stub results, too. I know how to do that already; in the context of this question, the only problem is the constructors.
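For readers who want something concrete, the registry described above might look roughly like the following. This is only a guess at its shape: the question gives no more than the class name, the wrapped Set<Class<?>> and isMock; the methods activate and deactivate and the choice of a concurrent set are assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class GlobalMockRegistry {
    // Concurrent set so tests on different threads can toggle mocks (an assumption).
    private static final Set<Class<?>> MOCKED = ConcurrentHashMap.newKeySet();

    private GlobalMockRegistry() {}

    // Hypothetical helper: switch global mocking on for one class.
    static void activate(Class<?> clazz) { MOCKED.add(clazz); }

    // Hypothetical helper: switch global mocking off again.
    static void deactivate(Class<?> clazz) { MOCKED.remove(clazz); }

    // Named in the question: true if the class is currently mocked.
    static boolean isMock(Class<?> clazz) { return MOCKED.contains(clazz); }
}
```

A test would call GlobalMockRegistry.activate(SomeClass.class) in its setup and deactivate(...) in its teardown, so that the next test in the same JVM sees the real class again.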
Questions / problems
Now if I try to simulate the desired result in Java source code, I meet the following limitations:
I cannot insert any code before this() or super(). I could mitigate that by also instrumenting the super class hierarchy with the same if(...) return;, but would like to know if I could in theory use ASM to insert my code before this() or super() using a method visitor. Or would the byte code of the instrumented class somehow be verified during loading or retransformation and then rejected because the byte code is "illegal"? I would like to know before I start learning ASM because I want to avoid wasting time for an idea which is not feasible.
If the class contains final instance fields, I also cannot enter a return before all of those fields have been initialised in the constructor. That might happen at the very end of a complex constructor which performs lots of side effects before actually initialising the last field. So the question is similar to the previous one: Can I use ASM to insert my if(...) return; before any fields (including final ones) are initialised and produce a valid class which I could not produce using javac and will not be rejected when loaded or retransformed?
BTW, if it is relevant, we are talking about Java 8+, i.e. at the time of writing this that would be Java versions 8 to 14.
If anything about this question is unclear, please do not hesitate to ask follow-up questions, so I can improve it.
Update after discussing Antimony's answer
I think this approach could work: it calls the constructor chain but avoids any side effects, resulting in a newly initialised instance with all fields empty (null, 0, false):
In order to avoid calling this.getClass(), I need to hard-code the mock target's class name directly into all constructors up the parent chain. I.e. if two "global mock" target classes have the same parent class(es), multiple of the following if blocks would be woven into each corresponding parent class, one for each hard-coded child class name.
In order to avoid any side effects from objects being created or methods being called, I need to call a super constructor myself, using null/zero/false values for each argument. That would not matter because the next parent class up the chain would have a similar code block so that the arguments given do not matter anyway.
// Avoid accessing 'this.getClass()'
if (GlobalMockRegistry.isMock(Sub.class)) {
// Identify and call any parent class constructor, ideally a default constructor.
// If none exists, call another one using default values like null, 0, false.
// In the class derived from Object, just call 'Object.<init>'.
super(null, 0, false);
return;
}
// Here follows the original byte code, i.e. the normal super/this call and
// everything else the original constructor does.
Note to myself: Antimony's answer explains "uninitialised this" very nicely. Another related answer can be found here.
Next update after evaluating my new idea
I managed to validate my new idea with a proof of concept. As my JVM byte code knowledge is too limited and I am not used to the way of thinking it requires (stack frames, local variable tables, "reverse" logic of first pushing/popping variables, then applying an operation on them, not being able to easily debug), I just implemented it in Javassist instead of ASM, which in comparison was a breeze after failing miserably with ASM after hours of trial & error.
I can take it from here and I want to thank user Antimony for his very instructive answer + comments. I do know that theoretically the same solution could be implemented using ASM, but it would be exceedingly difficult in comparison because its API is too low level for the task at hand. ByteBuddy's API is too high level, Javassist was just right for me in order to get quick results (and easily maintainable Java code) in this case.
Yes and no. Java bytecode is much less restrictive than Java (source) in this regard. You can put any bytecode you want before the constructor call, as long as you don't actually access the uninitialized object. (The only operations allowed on an uninitialized this value are calling a constructor, setting private fields declared in the same class, and comparing it against null).
Bytecode is also more flexible in where and how you make the constructor call. For example, you can call one of two different constructors in an if statement, or you can wrap the super constructor call in a "try block", both things that are impossible at the Java language level.
Apart from not accessing the uninitialized this value, the only restriction* is that the object has to be definitely initialized along any path that returns from the constructor call. This means the only way to avoid initializing the object is to throw an exception. While being much laxer than Java itself, the rules for Java bytecode were still very deliberately constructed so it is impossible to observe uninitialized objects. In general, Java bytecode is still required to be memory safe and type safe, just with a much looser type system than Java itself. Historically, Java applets were designed to run untrusted code in the JVM, so any method of bypassing these restrictions was a security vulnerability.
* The above is talking about traditional bytecode verification, as that is what I am most familiar with. I believe stackmap verification behaves similarly though, barring implementation bugs in some versions of Java.
P.S. Technically, Java can have code execute before the constructor call. If you pass arguments to the constructor, those expressions are evaluated first, and hence the ability to place bytecode before the constructor call is required in order to compile Java code. Likewise, the ability to set private fields declared in the same class is used to set synthetic variables that arise from the compilation of nested classes.
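The P.S. can be observed from plain Java source. In this small sketch (all names are invented for the example), the argument expression passed to super(...) runs before the parent constructor does:

```java
import java.util.ArrayList;
import java.util.List;

class PreSuperDemo {
    // Records the order in which things happen.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        Parent(String label) { LOG.add("Parent.<init>(" + label + ")"); }
    }

    static class Child extends Parent {
        Child() {
            // The argument is evaluated before Parent's constructor is invoked.
            super(computeLabel());
            LOG.add("Child.<init> body");
        }

        private static String computeLabel() {
            LOG.add("computeLabel()");
            return "x";
        }
    }

    public static void main(String[] args) {
        new Child();
        System.out.println(LOG);
        // [computeLabel(), Parent.<init>(x), Child.<init> body]
    }
}
```

The helper has to be static here because the Java language forbids touching this before the super call, which matches the bytecode-level rule about the uninitialized this value.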
If the class contains final instance fields, I also cannot enter a return before all of those fields have been initialised in the constructor.
This, however, is eminently possible. The only restriction is that you call some constructor or superconstructor on the uninitialized this value. (Since all constructors recursively have this restriction, this will ultimately result in java.lang.Object's constructor being called). However, the JVM doesn't care what happens after that. In particular, it only cares that the fields have some well typed value, even if it is the default value (null for objects, 0 for ints, etc.) So there is no need to execute the field initializers to give them a meaningful value.
Is there any other way to get the type to be instantiated other than this.getClass() from a super class constructor?
Not as far as I am aware. There's no special opcode for magically getting the Class associated with a given value. Foo.class is just syntactic sugar which is handled by the Java compiler.

Difference between Java Interfaces and Python Mixin?

I have been reading about Python mixins and learned that they add some features (methods) to a class. Similarly, Java interfaces also provide methods to a class.
The only difference I could see is that Java interfaces declare abstract methods, whereas Python mixins carry an implementation.
Are there any other differences?
Well, the 'abstract methods' part is quite important.
Java is statically typed. By specifying the interfaces in the type definition, you use them to construct the signature of the new type. After the type definition, you have promised that this new type (or some sub-class) will eventually implement all the functions that were defined in the various interfaces you specified.
Therefore, an interface DOES NOT really add any methods to a class, since it doesn't provide a method implementation (default methods, available since Java 8, aside). It just adds to the signature/promise of the class.
Python, however, is dynamically typed. The 'signature' of the type doesn't really matter, since it simply checks at run time whether the method you wish to call is actually present.
Therefore, in Python the mixin is indeed about adding methods and functionality to a class. It is not at all concerned with the type signature.
In summary:
Java Interfaces -> Functions are NOT added, signature IS extended.
Python mixins -> Functions ARE added, signature doesn't matter.
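A small Java sketch of the "promise" idea, using invented names (InterfaceDemo, Greeter, English). The abstract method name() is purely a promise that the implementing class must fulfil. As an aside that goes beyond the answer above, Java 8 and later also allow default methods, which do carry an implementation and so behave somewhat more like a mixin:

```java
class InterfaceDemo {
    interface Greeter {
        // Abstract: extends the type's signature, adds no behaviour.
        String name();

        // Java 8+ default method: an implementation every implementor inherits.
        default String greet() { return "Hello, " + name(); }
    }

    static class English implements Greeter {
        // The class must keep the promise made by the interface.
        @Override
        public String name() { return "Ada"; }
    }

    public static void main(String[] args) {
        Greeter g = new English();
        System.out.println(g.greet()); // Hello, Ada
    }
}
```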

Virtual Mechanism in C++ and Java [duplicate]

In Java:
class Base {
public Base() { System.out.println("Base::Base()"); virt(); }
void virt() { System.out.println("Base::virt()"); }
}
class Derived extends Base {
public Derived() { System.out.println("Derived::Derived()"); virt(); }
void virt() { System.out.println("Derived::virt()"); }
}
public class Main {
public static void main(String[] args) {
new Derived();
}
}
This will output
Base::Base()
Derived::virt()
Derived::Derived()
Derived::virt()
However, in C++ the result is different:
Base::Base()
Base::virt() // ← Not Derived::virt()
Derived::Derived()
Derived::virt()
(See http://www.parashift.com/c++-faq-lite/calling-virtuals-from-ctors.html for C++ code)
What causes such a difference between Java and C++? Is it the time when vtable is initialized?
EDIT: I do understand Java and C++ mechanisms. What I want to know is the insights behind this design decision.
Both approaches clearly have disadvantages:
In Java, the call goes to a method which cannot use this properly because its members haven’t been initialised yet.
In C++, an unintuitive method (i.e. not the one in the derived class) is called if you don’t know how C++ constructs classes.
Why each language does what it does is an open question, but both probably claim to be the “safer” option: C++’s way prevents the use of uninitialised members; Java’s approach allows polymorphic semantics (to some extent) inside a class’s constructor (which is a perfectly valid use case).
Well you have already linked to the FAQ's discussion, but that’s mainly problem-oriented, not going into the rationales, the why.
In short, it’s for type safety.
This is one of the few cases where C++ beats Java and C# on type safety. ;-)
When you create a class A, in C++ you can let each A constructor initialize the new instance so that all common assumptions about its state, called the class invariant, hold. For example, part of a class invariant can be that a pointer member points to some dynamically allocated memory. When each publicly available method preserves the class invariant, then it’s guaranteed to hold also on entry to each method, which greatly simplifies things – at least for a well-chosen class invariant!
No further checking is then necessary in each method.
In contrast, using two-phase initialization such as in Microsoft's MFC and ATL libraries you can never be quite sure whether everything has been properly initialized when a method (non-static member function) is called. This is very similar to Java and C#, except that in those languages the lack of class invariant guarantees comes from these languages merely enabling but not actively supporting the concept of a class invariant. In short, Java and C# virtual methods called from a base class constructor can be called down on a derived instance that has not yet been initialized, where the (derived) class invariant has not yet been established!
So, this C++ language support for class invariants is really great, helping do away with a lot of checking and a lot of frustrating perplexing bugs.
However, it makes it a bit difficult to do derived-class-specific initialization in a base class constructor, e.g. doing general things in a topmost GUI Widget class’ constructor.
The FAQ item “Okay, but is there a way to simulate that behavior as if dynamic binding worked on the this object within my base class's constructor?” goes a little into that.
For a fuller treatment of the most common case, see also my blog article “How to avoid post-construction by using Parts Factories”.
Regardless of how it's implemented, it's a difference in what the language definition says should happen. Java allows you to call functions on a derived object that hasn't been fully initialized (it has been zero-initialized, but its constructor has not run). C++ doesn't allow that; until the derived class's constructor has run, there is no derived class.
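The Java half of that statement can be seen in a short sketch (class, field and method names are made up). The base constructor calls an overridden method, which runs on a Derived instance whose field initializer has not executed yet, so it sees the zero-initialized value:

```java
class CtorDispatchDemo {
    static class Base {
        final String seenDuringConstruction;

        Base() {
            // Virtual call: dispatches to Derived.describe() for a Derived instance.
            seenDuringConstruction = describe();
        }

        String describe() { return "Base"; }
    }

    static class Derived extends Base {
        // Assigned only after Base() has returned.
        private String label = "derived";

        @Override
        String describe() { return "label=" + label; }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenDuringConstruction); // label=null
        System.out.println(d.describe());             // label=derived
    }
}
```

The field is deliberately not final: a final field initialized with a constant expression would be inlined by the compiler and would hide the effect.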
Hopefully this will help:
When your line new Derived() executes, the first thing that happens is the memory allocation. The program will allocate a chunk of memory big enough to hold the members of both Base and Derived. At this point, there is no object. It's just uninitialized memory.
When Base's constructor has completed, the memory will contain an object of type Base, and the class invariant for Base should hold. There is still no Derived object in that memory.
During the construction of Base, the Base object is in a partially-constructed state, but the language rules trust you enough to let you call your own member functions on a partially-constructed object. The Derived object isn't partially constructed. It doesn't exist.
Your call to the virtual function ends up calling the base class's version because at that point in time, Base is the most derived type of the object. If it were to call Derived::virt, it would be invoking a member function of Derived with a this-pointer that is not of type Derived, breaking type safety.
Logically, a class is something that gets constructed, has functions called on it, and then gets destroyed. You can't call member functions on an object that hasn't been constructed, and you can't call member functions on an object after it's been destroyed. This is fairly fundamental to OOP; the C++ language rules are just helping you avoid doing things that break this model.
In Java, method invocation is based on the object's runtime type, which is why it behaves like that (I don't know much about C++).
Here your object is of type Derived, so the JVM invokes the method on the Derived object.
If I understand the virtual concept correctly, the equivalent in Java is abstract; your code right now is not really virtual code in Java terms.
Happy to update my answer if something is wrong.
Actually I want to know what's the insight behind this design decision
It may be that in Java, every type derives from Object, every Object is some kind of leaf type, and there's a single JVM in which all objects are constructed.
In C++, many types aren't virtual at all. Furthermore, in C++ the base class and the subclass can be compiled to machine code separately, so the base class does what it does without knowing whether it's a superclass of something else.
Constructors are not polymorphic in case of both C++ and Java languages, whereas a method could be polymorphic in both languages. This means, when a polymorphic method appears inside a constructor, the designers would be left with two choices.
Either strictly conform to the semantics of a non-polymorphic constructor and thus consider any polymorphic method invoked within a constructor as non-polymorphic. This is how C++ does it§.
Or compromise the strict semantics of a non-polymorphic constructor and adhere to the strict semantics of a polymorphic method. Thus polymorphic methods called from constructors are always polymorphic. This is how Java does it.
Since neither strategy offers any real benefit over the other, and yet the Java way of doing it reduces a lot of overhead (no need to differentiate polymorphism based on the context of constructors), and since Java was designed after C++, I would presume that the designers of Java opted for the second option, seeing the benefit of less implementation overhead.
Added on 21-Dec-2016
§Lest the statement “method invoked within a constructor as non-polymorphic...This is how C++ does” might be confusing without careful scrutiny of the context, I’m adding a formalization to precisely qualify what I meant.
If class C has a direct definition of some virtual function F and its ctor has an invocation to F, then any (indirect) invocation of C’s ctor on an instance of child class T will not influence the choice of F; and in fact, C::F will always be invoked from C’s ctor. In this sense, invocation of virtual F is less-polymorphic (compared to say, Java which will choose F based on T)
Further, it is important to note that if C inherits the definition of F from some parent P and has not overridden F, then C’s ctor will invoke P::F, and even this, IMHO, can be determined statically.

What is the internal identification of a Java method?

As we know, in Java, a method name alone is not sufficient to distinguish different methods.
I think (I may be wrong) that distinguishing a method requires the following info:
(className, methodName, methodParameters)
Further,
how to identify a method more efficiently internally?
I heard of "method id". Does it mean there is a mapping between the above triple and an integer, so JVM use only method id after parsing?
If so, is it resided in symbol table?
Thanks!
It's a CONSTANT_NameAndType_info Structure pointing at a method descriptor.
It pretty much consists of the method name, the parameter types, and (somewhat surprisingly) the return type.
I do not understand very well what you are trying to do but I think there are some possible answers nonetheless:
You may be interested in the JNI Method Descriptors, one of the various string formats used internally by the JVM (and by JNI libraries) for identifying Java elements.
It is difficult to know what you are talking about. The "method id" can be a reference to a java.lang.reflect.Method object, or can be the method descriptor mentioned above, or any other thing. Where did you read about it?
I doubt there is such a table inside the JVM. I mean, I doubt there is a global table, because almost always you retrieve a method from a class, even when dealing with it inside the JVM, so it is reasonable to believe the method is stored in the class. It is like when we use reflection to retrieve a method:
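// requires: import java.lang.reflect.Method;
Class<?> clazz = String.class;
Method method = clazz.getDeclaredMethod("charAt", int.class); // may throw NoSuchMethodException
System.out.println(method.getName()); // prints "charAt"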
Note that I ask the class String for the method, instead of asking some util class to give me the method charAt, which receives an int and is from the class String.
In other words, your identification tuple is almost correct - it just does not have a class:
(methodName, methodParameters)
and, instead of retrieving the method from the JVM passing the class and then the method name and then the parameter types, you retrieve the method directly from the class, giving the class the method name and the parameter types. A subtle difference, for sure, but I think it is what you are wondering about.
This is evident even in the JNI descriptors I mentioned above. For example, the method
long f(int i, Class c);
is represented by the following descriptor:
"(ILjava/lang/Class;)J"
Note that there is no reference to the class of the method.
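The descriptor string above can be reproduced from Java with java.lang.invoke.MethodType. This is just one convenient way to generate it (DescriptorDemo and descriptorOfF are invented names); note that the result carries the parameter types and the return type, but neither the method name nor the declaring class:

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Builds the descriptor for: long f(int i, Class c)
    static String descriptorOfF() {
        return MethodType.methodType(long.class, int.class, Class.class)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptorOfF()); // (ILjava/lang/Class;)J
    }
}
```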
The excellent documentation on the class file format (already pointed out by @Lawence) may give you some insights. I recommend that you read it fully.
1) How to identify a method more efficiently internally?
Internally to what? There are many places where a method might need to be "identified" "internally". In the bytecode compiler, the JIT compiler, the classloader / linker, the classfile representation, reflection API, a debugger and so on. They each have different efficiency concerns.
2) I heard of "method id". Does it mean there is a mapping between the above triple and an integer, so JVM use only method id after parsing?
A method id is used in the classfile representation, and could be used by anything based on that, including the class loader / linker, the JIT compiler and the debugger.
The JVM doesn't parse Java code.
3) If so, is it resided in symbol table?
It might do. It depends on what you mean by "the symbol table". Bear in mind that there are lots of places where method identification is required, throughout the lifecycle of a class. For instance, the Java reflection APIs require method information to implement methods such as getDeclaredMethod(...) and various methods of Method.
Java always differentiates its language elements by their fully qualified names.
Suppose you have a method myMethod(int a, int b) in class MyClass, which lies in the package com.mypackage; then Java will identify the method by the name com.mypackage.MyClass.myMethod(int a, int b).
Just to give you some more insight, it also takes the class loader into consideration when there is a need to resolve two identical elements.
It does consider which class loader was used to load the particular class containing the method to which you are referring. There are four types of class loaders in Java. You can read the documentation for the java.lang.Thread class for this.
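As a hedged illustration of the class loader point (LoaderDemo and isBootstrapLoaded are made-up names), every Class object knows which loader defined it, and a thread carries a context class loader. Core classes such as String come from the bootstrap loader, which getClassLoader() reports as null:

```java
class LoaderDemo {
    // True if the class was defined by the bootstrap loader (reported as null).
    static boolean isBootstrapLoaded(Class<?> clazz) {
        return clazz.getClassLoader() == null;
    }

    public static void main(String[] args) {
        System.out.println(isBootstrapLoaded(String.class)); // true
        // The loaders below depend on how the program is launched,
        // so no particular output is claimed for them.
        System.out.println(LoaderDemo.class.getClassLoader());
        System.out.println(Thread.currentThread().getContextClassLoader());
    }
}
```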
