In C++ when a virtual function is called from within a constructor it doesn't behave like a virtual function.
I think everyone who encountered this behavior for the first time was surprised but on second thought it made sense:
As long as the derived constructor has not been executed the object is not yet a derived instance.
So how can a derived function be called? The preconditions haven't had the chance to be set up. Example:
class base {
public:
base()
{
std::cout << "foo is " << foo() << std::endl;
}
virtual int foo() { return 42; }
};
class derived : public base {
int* ptr_;
public:
derived(int i) : ptr_(new int(i*i)) { }
// The following cannot be called before derived::derived due to how C++ behaves,
// if it was possible... Kaboom!
virtual int foo() { return *ptr_; }
};
It's exactly the same for Java and .NET yet they chose to go the other way, and is possibly the only reason for the principle of least surprise?
Which do you think is the correct choice?
There's a fundamental difference in how the languages define an object's life time. In Java and .Net the object members are zero/null initialized before any constructor is run and is at this point that the object life time begins. So when you enter the constructor you've already got an initialized object.
In C++ the object life time only begins when the constructor finishes (although member variables and base classes are fully constructed before it starts). This explains the behaviour when virtual functions are called and also why the destructor isn't run if there's an exception in the constructor's body.
The problem with the Java/.Net definition of object lifetime is that it's harder to make sure the object always meets its invariant without having to put in special cases for when the object is initialized but the constructor hasn't run. The problem with the C++ definition is that you have this odd period where the object is in limbo and not fully constructed.
Both ways can lead to unexpected results. Your best bet is to not call a virtual function in your constructor at all.
The C++ way I think makes more sense, but leads to expectation problems when someone reviews your code. If you are aware of this situation, you should purposely not put your code in this situation for later debugging's sake.
Virtual functions in constructors, why do languages differ?
Because there's no one good behaviour. I find the C++ behaviour makes more sense (since base class c-tors are called first, it stands to reason that they should call base class virtual functions--after all, the derived class c-tor hasn't run yet, so it may not have set up the right preconditions for the derived class virtual function).
But sometimes, where I want to use the virtual functions to initialize state (so it doesn't matter that they're being called with the state uninitialized) the C#/Java behaviour is nicer.
I think C++ offers the best semantics in terms of having the 'most correct' behavior ... however it is more work for the compiler and the code is definitiely non-intuitive to someone reading it later.
With the .NET approach the function must be very limited not to rely on any derived object state.
Delphi makes good use of virtual constructors in the VCL GUI framework:
type
TComponent = class
public
constructor Create(AOwner: TComponent); virtual; // virtual constructor
end;
TMyEdit = class(TComponent)
public
constructor Create(AOwner: TComponent); override; // override virtual constructor
end;
TMyButton = class(TComponent)
public
constructor Create(AOwner: TComponent); override; // override virtual constructor
end;
TComponentClass = class of TComponent;
function CreateAComponent(ComponentClass: TComponentClass; AOwner: TComponent): TComponent;
begin
Result := ComponentClass.Create(AOwner);
end;
var
MyEdit: TMyEdit;
MyButton: TMyButton;
begin
MyEdit := CreateAComponent(TMyEdit, Form) as TMyEdit;
MyButton := CreateAComponent(TMyButton, Form) as TMyButton;
end;
I have found the C++ behavior very annoying. You cannot write virtual functions to, for instance, return the desired size of the object, and have the default constructor initialize each item. For instance it would be nice to do:
BaseClass() {
for (int i=0; i<virtualSize(); i++)
initialize_stuff_for_index(i);
}
Then again the advantage of C++ behavior is that it discourages constuctors like the above from being written.
I don't think the problem of calling methods that assume the constructor has been finished is a good excuse for C++. If this really was a problem then the constructor would not be allowed to call any methods, since the same problem can apply to methods for the base class.
Another point against C++ is that the behavior is much less efficient. Although the constructor knows directly what it calls, the vtab pointer has to be changed for every single class from base to final, because the constructor might call other methods that will call virtual functions. From my experience this wastes far more time than is saved by making virtual functions calls in the constructor more efficient.
Far more annoying is that this is also true of destructors. If you write a virtual cleanup() function, and the base class destructor does cleanup(), it certainly does not do what you expect.
This and the fact that C++ calls destructors on static objects on exit have really pissed me off for a long time.
Related
Are there currently (Java 6) things you can do in Java bytecode that you can't do from within the Java language?
I know both are Turing complete, so read "can do" as "can do significantly faster/better, or just in a different way".
I'm thinking of extra bytecodes like invokedynamic, which can't be generated using Java, except that specific one is for a future version.
After working with Java byte code for quite a while and doing some additional research on this matter, here is a summary of my findings:
Execute code in a constructor before calling a super constructor or auxiliary constructor
In the Java programming language (JPL), a constructor's first statement must be an invocation of a super constructor or another constructor of the same class. This is not true for Java byte code (JBC). Within byte code, it is absolutely legitimate to execute any code before a constructor, as long as:
Another compatible constructor is called at some time after this code block.
This call is not within a conditional statement.
Before this constructor call, no field of the constructed instance is read and none of its methods is invoked. This implies the next item.
Set instance fields before calling a super constructor or auxiliary constructor
As mentioned before, it is perfectly legal to set a field value of an instance before calling another constructor. There even exists a legacy hack which makes it able to exploit this "feature" in Java versions before 6:
class Foo {
public String s;
public Foo() {
System.out.println(s);
}
}
class Bar extends Foo {
public Bar() {
this(s = "Hello World!");
}
private Bar(String helper) {
super();
}
}
This way, a field could be set before the super constructor is invoked which is however not longer possible. In JBC, this behavior can still be implemented.
Branch a super constructor call
In Java, it is not possible to define a constructor call like
class Foo {
Foo() { }
Foo(Void v) { }
}
class Bar() {
if(System.currentTimeMillis() % 2 == 0) {
super();
} else {
super(null);
}
}
Until Java 7u23, the HotSpot VM's verifier did however miss this check which is why it was possible. This was used by several code generation tools as a sort of a hack but it is not longer legal to implement a class like this.
The latter was merely a bug in this compiler version. In newer compiler versions, this is again possible.
Define a class without any constructor
The Java compiler will always implement at least one constructor for any class. In Java byte code, this is not required. This allows the creation of classes that cannot be constructed even when using reflection. However, using sun.misc.Unsafe still allows for the creation of such instances.
Define methods with identical signature but with different return type
In the JPL, a method is identified as unique by its name and its raw parameter types. In JBC, the raw return type is additionally considered.
Define fields that do not differ by name but only by type
A class file can contain several fields of the same name as long as they declare a different field type. The JVM always refers to a field as a tuple of name and type.
Throw undeclared checked exceptions without catching them
The Java runtime and the Java byte code are not aware of the concept of checked exceptions. It is only the Java compiler that verifies that checked exceptions are always either caught or declared if they are thrown.
Use dynamic method invocation outside of lambda expressions
The so-called dynamic method invocation can be used for anything, not only for Java's lambda expressions. Using this feature allows for example to switch out execution logic at runtime. Many dynamic programming languages that boil down to JBC improved their performance by using this instruction. In Java byte code, you could also emulate lambda expressions in Java 7 where the compiler did not yet allow for any use of dynamic method invocation while the JVM already understood the instruction.
Use identifiers that are not normally considered legal
Ever fancied using spaces and a line break in your method's name? Create your own JBC and good luck for code review. The only illegal characters for identifiers are ., ;, [ and /. Additionally, methods that are not named <init> or <clinit> cannot contain < and >.
Reassign final parameters or the this reference
final parameters do not exist in JBC and can consequently be reassigned. Any parameter, including the this reference is only stored in a simple array within the JVM what allows to reassign the this reference at index 0 within a single method frame.
Reassign final fields
As long as a final field is assigned within a constructor, it is legal to reassign this value or even not assign a value at all. Therefore, the following two constructors are legal:
class Foo {
final int bar;
Foo() { } // bar == 0
Foo(Void v) { // bar == 2
bar = 1;
bar = 2;
}
}
For static final fields, it is even allowed to reassign the fields outside of
the class initializer.
Treat constructors and the class initializer as if they were methods
This is more of a conceptional feature but constructors are not treated any differently within JBC than normal methods. It is only the JVM's verifier that assures that constructors call another legal constructor. Other than that, it is merely a Java naming convention that constructors must be called <init> and that the class initializer is called <clinit>. Besides this difference, the representation of methods and constructors is identical. As Holger pointed out in a comment, you can even define constructors with return types other than void or a class initializer with arguments, even though it is not possible to call these methods.
Create asymmetric records*.
When creating a record
record Foo(Object bar) { }
javac will generate a class file with a single field named bar, an accessor method named bar() and a constructor taking a single Object. Additionally, a record attribute for bar is added. By manually generating a record, it is possible to create, a different constructor shape, to skip the field and to implement the accessor differently. At the same time, it is still possible to make the reflection API believe that the class represents an actual record.
Call any super method (until Java 1.1)
However, this is only possible for Java versions 1 and 1.1. In JBC, methods are always dispatched on an explicit target type. This means that for
class Foo {
void baz() { System.out.println("Foo"); }
}
class Bar extends Foo {
#Override
void baz() { System.out.println("Bar"); }
}
class Qux extends Bar {
#Override
void baz() { System.out.println("Qux"); }
}
it was possible to implement Qux#baz to invoke Foo#baz while jumping over Bar#baz. While it is still possible to define an explicit invocation to call another super method implementation than that of the direct super class, this does no longer have any effect in Java versions after 1.1. In Java 1.1, this behavior was controlled by setting the ACC_SUPER flag which would enable the same behavior that only calls the direct super class's implementation.
Define a non-virtual call of a method that is declared in the same class
In Java, it is not possible to define a class
class Foo {
void foo() {
bar();
}
void bar() { }
}
class Bar extends Foo {
#Override void bar() {
throw new RuntimeException();
}
}
The above code will always result in a RuntimeException when foo is invoked on an instance of Bar. It is not possible to define the Foo::foo method to invoke its own bar method which is defined in Foo. As bar is a non-private instance method, the call is always virtual. With byte code, one can however define the invocation to use the INVOKESPECIAL opcode which directly links the bar method call in Foo::foo to Foo's version. This opcode is normally used to implement super method invocations but you can reuse the opcode to implement the described behavior.
Fine-grain type annotations
In Java, annotations are applied according to their #Target that the annotations declares. Using byte code manipulation, it is possible to define annotations independently of this control. Also, it is for example possible to annotate a parameter type without annotating the parameter even if the #Target annotation applies to both elements.
Define any attribute for a type or its members
Within the Java language, it is only possible to define annotations for fields, methods or classes. In JBC, you can basically embed any information into the Java classes. In order to make use of this information, you can however no longer rely on the Java class loading mechanism but you need to extract the meta information by yourself.
Overflow and implicitly assign byte, short, char and boolean values
The latter primitive types are not normally known in JBC but are only defined for array types or for field and method descriptors. Within byte code instructions, all of the named types take the space 32 bit which allows to represent them as int. Officially, only the int, float, long and double types exist within byte code which all need explicit conversion by the rule of the JVM's verifier.
Not release a monitor
A synchronized block is actually made up of two statements, one to acquire and one to release a monitor. In JBC, you can acquire one without releasing it.
Note: In recent implementations of HotSpot, this instead leads to an IllegalMonitorStateException at the end of a method or to an implicit release if the method is terminated by an exception itself.
Add more than one return statement to a type initializer
In Java, even a trivial type initializer such as
class Foo {
static {
return;
}
}
is illegal. In byte code, the type initializer is treated just as any other method, i.e. return statements can be defined anywhere.
Create irreducible loops
The Java compiler converts loops to goto statements in Java byte code. Such statements can be used to create irreducible loops, which the Java compiler never does.
Define a recursive catch block
In Java byte code, you can define a block:
try {
throw new Exception();
} catch (Exception e) {
<goto on exception>
throw Exception();
}
A similar statement is created implicitly when using a synchronized block in Java where any exception while releasing a monitor returns to the instruction for releasing this monitor. Normally, no exception should occur on such an instruction but if it would (e.g. the deprecated ThreadDeath), the monitor would still be released.
Call any default method
The Java compiler requires several conditions to be fulfilled in order to allow a default method's invocation:
The method must be the most specific one (must not be overridden by a sub interface that is implemented by any type, including super types).
The default method's interface type must be implemented directly by the class that is calling the default method. However, if interface B extends interface A but does not override a method in A, the method can still be invoked.
For Java byte code, only the second condition counts. The first one is however irrelevant.
Invoke a super method on an instance that is not this
The Java compiler only allows to invoke a super (or interface default) method on instances of this. In byte code, it is however also possible to invoke the super method on an instance of the same type similar to the following:
class Foo {
void m(Foo f) {
f.super.toString(); // calls Object::toString
}
public String toString() {
return "foo";
}
}
Access synthetic members
In Java byte code, it is possible to access synthetic members directly. For example, consider how in the following example the outer instance of another Bar instance is accessed:
class Foo {
class Bar {
void bar(Bar bar) {
Foo foo = bar.Foo.this;
}
}
}
This is generally true for any synthetic field, class or method.
Define out-of-sync generic type information
While the Java runtime does not process generic types (after the Java compiler applies type erasure), this information is still attcheched to a compiled class as meta information and made accessible via the reflection API.
The verifier does not check the consistency of these meta data String-encoded values. It is therefore possible to define information on generic types that does not match the erasure. As a concequence, the following assertings can be true:
Method method = ...
assertTrue(method.getParameterTypes() != method.getGenericParameterTypes());
Field field = ...
assertTrue(field.getFieldType() == String.class);
assertTrue(field.getGenericFieldType() == Integer.class);
Also, the signature can be defined as invalid such that a runtime exception is thrown. This exception is thrown when the information is accessed for the first time as it is evaluated lazily. (Similar to annotation values with an error.)
Append parameter meta information only for certain methods
The Java compiler allows for embedding parameter name and modifier information when compiling a class with the parameter flag enabled. In the Java class file format, this information is however stored per-method what makes it possible to only embed such method information for certain methods.
Mess things up and hard-crash your JVM
As an example, in Java byte code, you can define to invoke any method on any type. Usually, the verifier will complain if a type does not known of such a method. However, if you invoke an unknown method on an array, I found a bug in some JVM version where the verifier will miss this and your JVM will finish off once the instruction is invoked. This is hardly a feature though, but it is technically something that is not possible with javac compiled Java. Java has some sort of double validation. The first validation is applied by the Java compiler, the second one by the JVM when a class is loaded. By skipping the compiler, you might find a weak spot in the verifier's validation. This is rather a general statement than a feature, though.
Annotate a constructor's receiver type when there is no outer class
Since Java 8, non-static methods and constructors of inner classes can declare a receiver type and annotate these types. Constructors of top-level classes cannot annotate their receiver type as they most not declare one.
class Foo {
class Bar {
Bar(#TypeAnnotation Foo Foo.this) { }
}
Foo() { } // Must not declare a receiver type
}
Since Foo.class.getDeclaredConstructor().getAnnotatedReceiverType() does however return an AnnotatedType representing Foo, it is possible to include type annotations for Foo's constructor directly in the class file where these annotations are later read by the reflection API.
Use unused / legacy byte code instructions
Since others named it, I will include it as well. Java was formerly making use of subroutines by the JSR and RET statements. JBC even knew its own type of a return address for this purpose. However, the use of subroutines did overcomplicate static code analysis which is why these instructions are not longer used. Instead, the Java compiler will duplicate code it compiles. However, this basically creates identical logic which is why I do not really consider it to achieve something different. Similarly, you could for example add the NOOP byte code instruction which is not used by the Java compiler either but this would not really allow you to achieve something new either. As pointed out in the context, these mentioned "feature instructions" are now removed from the set of legal opcodes which does render them even less of a feature.
As far as I know there are no major features in the bytecodes supported by Java 6 that are not also accessible from Java source code. The main reason for this is obviously that the Java bytecode was designed with the Java language in mind.
There are some features that are not produced by modern Java compilers, however:
The ACC_SUPER flag:
This is a flag that can be set on a class and specifies how a specific corner case of the invokespecial bytecode is handled for this class. It is set by all modern Java compilers (where "modern" is >= Java 1.1, if I remember correctly) and only ancient Java compilers produced class files where this was un-set. This flag exists only for backwards-compatibility reasons. Note that starting with Java 7u51, ACC_SUPER is ignored completely due to security reasons.
The jsr/ret bytecodes.
These bytecodes were used to implement sub-routines (mostly for implementing finally blocks). They are no longer produced since Java 6. The reason for their deprecation is that they complicate static verification a lot for no great gain (i.e. code that uses can almost always be re-implemented with normal jumps with very little overhead).
Having two methods in a class that only differ in return type.
The Java language specification does not allow two methods in the same class when they differ only in their return type (i.e. same name, same argument list, ...). The JVM specification however, has no such restriction, so a class file can contain two such methods, there's just no way to produce such a class file using the normal Java compiler. There's a nice example/explanation in this answer.
Here are some features that can be done in Java bytecode but not in Java source code:
Throwing a checked exception from a method without declaring that the method throws it. The checked and unchecked exceptions are a thing which is checked only by the Java compiler, not the JVM. Because of this for example Scala can throw checked exceptions from methods without declaring them. Though with Java generics there is a workaround called sneaky throw.
Having two methods in a class that only differ in return type, as already mentioned in Joachim's answer: The Java language specification does not allow two methods in the same class when they differ only in their return type (i.e. same name, same argument list, ...). The JVM specification however, has no such restriction, so a class file can contain two such methods, there's just no way to produce such a class file using the normal Java compiler. There's a nice example/explanation in this answer.
GOTO can be used with labels to create your own control structures (other than for while etc)
You can override the this local variable inside a method
Combining both of these you can create create tail call optimised bytecode (I do this in JCompilo)
As a related point you can get parameter name for methods if compiled with debug (Paranamer does this by reading the bytecode
Maybe section 7A in this document is of interest, although it's about bytecode pitfalls rather than bytecode features.
In Java language the first statement in a constructor must be a call to the super class constructor. Bytecode does not have this limitation, instead the rule is that the super class constructor or another constructor in the same class must be called for the object before accessing the members. This should allow more freedom such as:
Create an instance of another object, store it in a local variable (or stack) and pass it as a parameter to super class constructor while still keeping the reference in that variable for other use.
Call different other constructors based on a condition. This should be possible: How to call a different constructor conditionally in Java?
I have not tested these, so please correct me if I'm wrong.
Something you can do with byte code, rather than plain Java code, is generate code which can loaded and run without a compiler. Many systems have JRE rather than JDK and if you want to generate code dynamically it may be better, if not easier, to generate byte code instead of Java code has to be compiled before it can be used.
I wrote a bytecode optimizer when I was a I-Play, (it was designed to reduce the code size for J2ME applications). One feature I added was the ability to use inline bytecode (similar to inline assembly language in C++). I managed to reduce the size of a function that was part of a library method by using the DUP instruction, since I need the value twice. I also had zero byte instructions (if you are calling a method that takes a char and you want to pass an int, that you know does not need to be cast I added int2char(var) to replace char(var) and it would remove the i2c instruction to reduce the size of the code. I also made it do float a = 2.3; float b = 3.4; float c = a + b; and that would be converted to fixed point (faster, and also some J2ME did not support floating point).
In Java, if you attempt to override a public method with a protected method (or any other reduction in access), you get an error: "attempting to assign weaker access privileges". If you do it with JVM bytecode, the verifier is fine with it, and you can call these methods via the parent class as if they were public.
A sample code segment in java where the parent class fun() is overridden by the child class fun():
class class1
{
//members
void fun()
{
//some code
}
}
class2 extends class1
{
//members
void fun()
{
//some code
}
}
class overriding
{
public static void main(String args[])
{
class2 obj = new class2();
obj.fun();
}
}
In the above code, as we know that the binding of the actual code associated with the function call(obj.fun()) is done during run-time by the interpreter. During compilation the compiler is unable to bind the actual code with the call.
My question is:
Without instantiation(after creating an object of a class) does an ideal compiler have no way to know which function is to be invoked with response to a function call? Or it is done the way(as it is in java, for example) that dynamic binding has an edge over static binding in any programming paradigm?
My question is generalized.With regard to the code above,does the compiler has no way at all to know actually which fun() is called in the obj.fun() statement or technically run time binding is invented because it is impossible to bind at compile time?
Java is not only interpreter. Modern JVMs include a very sophisticated JIT-compiler which is capable to devirtualize many calls including the case you are speaking about. When your program starts, it will be executed by interpreter which will do the virtual call. But if your main method is executed many times, it will be compiled to native code by JIT compiler. This particular case is very simple, so JIT compiler will be able to replace virtual call with direct call and even inline the fun body into the main method. HotSpot JIT compiler can even optimize the cases where exact target method is unknown, but in the most of the cases the same method was called during the profiling phase (first several thousand calls). It will inline the most common case and keep the virtual call for the less common.
A compiler must obey the language spec, and language specs can differ even between languages as similar as C# and java.
If I'm not mistaken, C# defaults to binding methods to the declared type of an object, and if you want the "real" run-time binding then you need to declare methods virtual. Java specifies that C#'s "virtual" should always be the behaviour for instance (non-static) methods. Such differences in language spec cause differences in what a compiler can/cannot or should/should not do when compiling.
In this light, it's a bit unclear what your actual question is ...
In java the method you are calling is decided at compile time, not by the type of the object but by the type of the variable (unless typecast). obj is a variable of type class2 so you are calling the fun method of that class. You can read this question for more info. Java overloading method selection
I'm learning Scala at the moment and I came across this statement in Odersky's Programming Scala 2nd edition:
one way in which Scala is more object-orientated than Java is that classes in Scala cannot have static members.
I'm not sufficiently experienced in either Java or Scala to understand that comparison. Why does having static members make a language less OO?
Odersky's statement is valid and significant, but some people don't understand what he meant.
Let's say that in Java you have a class Foo with method f:
class Foo {
int f() { /* does something great */ }
}
You can write a method that takes a Foo and invokes f on it:
void g(Foo foo) { foo.f(); }
Perhaps there is a class SubFoo that extends Foo; g works on that too. There can be a whole set of classes, related by inheritance or by an interface, which share the fact that they can be used with g.
Now let's make that f method static:
class Foo {
static int f() { /* does something great */ }
}
Can we use this new Foo with g, perhaps like so?
g(Foo); // No, this is nonsense.
Darn. OK, let's change the signature of g so that we can pass Foo to it and have it invoke f.
Ooops -- we can't. We can't pass around a reference to Foo because Foo is not an instance of some class. Some people commenting here are confused by the fact that there is a Class object corresponding to Foo, but as Sotirios tried to explain, that Class object does not have an f method and Foo is not an instance of that class. Foo is not an instance of anything; it is not an object at all. The Class object for Foo is an instance of class Class that has information about Foo (think of it as Foo's internal Wikipedia page), and is completely irrelevant to the discussion. The Wikipedia page for "tiger" is not a tiger.
In Java, "primitives" like 3 and 'x' are not objects. They are objects in Scala. For performance your program will use JVM primitives for 3 and 'x' wherever possible during execution, but at the level you code in they really are objects. The fact that they are not objects in Java has rather unfortunate consequences for anyone trying to write code that handles all data types -- you have to have special logic and additional methods to cover primitives. If you've ever seen or written that kind of code, you know that it's awful. Odersky's statement is not "purism"; far from it.
In Scala there is no piece of runtime data that is not an object, and there is no thing you can invoke methods on that is not an object. In Java neither of these statements in true; Java is a partially object-oriented language. In Java there are things which are not objects and there are methods which aren't on objects.
Newcomers to Scala often think of object Foo as some weird replacement for Java statics, but that's something you need to get past quickly. Instead think of Java's static methods as a non-OO wart and Scala's object Foo { ... } as something along these lines:
class SomeHiddenClass { ... }
val Foo = new SomeHiddenClass // the only instance of it
Here Foo is a value, not a type, and it really is an object. It can be passed to a method. It can extend some other class. For example:
abstract class AbFoo { def f:Int }
object Foo extends AbFoo { def f = 2 }
Now, finally, you can say
g(Foo)
It is true that a "companion object" for a class is a good place to put non-instance methods and data for the class. But that companion object is an object, so the usual rules and capabilities apply.
The fact that in Java you put such methods on non-objects -- limiting how they can be used -- is a liability, not a feature. It is certainly not OO.
I am not sure I completely buy that argument, but here is one possible reasoning.
To an object-oriented purist, everything should be an object, and all state should be encapsulated by objects. Any static member of a class is by definition state which exists outside of an object, because you can use it and manipulate it without instantiating an object. Thus, the lack of static class members makes for a more pure object-oriented language.
Well, with static members like methods you don't have any objects to create and nevertheless you can call such static methods. You only need the static classname in order to set the namespace for these methods, for example:
long timeNow = System.currentTimeMillis(); // no object creation
This rather gives a feeling like in procedural languages.
static members belongs to the Class not to the object while the main concept of oop lies among the relation between the individual objects of dirrefer Class.
A static method in Java is one that operates on the class itself, and doesn't need an Object to be created first. For example, this line:
int c = Integer.parseInt("5");
Integer.parseInt() is static because I didn't have to go Integer i = new Integer(); before using it; this isn't operating on any particular object that I've created, since it's always going to be the same, and is more like a typical procedural function call instead of an object-oriented method. It's more object-oriented if I have to create an object for every call and we encapsulate everything that way instead of allowing static to use methods as faux-procedural-functions.
There are several competing definitions of what exactly object-orientation means. However, there is one thing they all can agree on: dynamic dispatch is a fundamental part of the definition of OO.
Static methods are static (duh), not dynamic, ergo they are by definition not object-oriented.
And logically, a language that has a feature which isn't object-oriented is in some sense "less OO" than a language which doesn't have said feature.
In a recent question, someone asked about static methods and one of the answers stated that you generally call them with something like:
MyClassName.myStaticMethod();
The comments on that also stated that you could also call it via an object with:
MyClassName myVar;
myVar.myStaticMethod();
but that it was considered bad form.
Now it seems to me that doing this can actually make my life easier so I don't have to worry about what's static or not (a).
Is there some problem with calling static functions via an object? Obviously you wouldn't want to create a brand new object just to call it:
Integer xyzzy;
int plugh = xyzzy.parseInt ("42", 10);
But, if you already have an object of the desired type, is there a problem in using it?
(a) Obviously, I can't call a non-static method with:
MyClassName.myNonStaticMethod();
but that's not the issue I'm asking about here.
In my opinion, the real use case that makes this so unreadable is something like this. What does the code below print?
//in a main method somewhere
Super instance = new Sub();
instance.method();
//...
public class Super {
public static void method() {
System.out.println("Super");
}
}
public class Sub extends Super {
public static void method() {
System.out.println("Sub");
}
}
The answer is that it prints "Super", because static methods are not virtual. Even though instance is-a Sub, the compiler can only go on the type of the variable which is Super. However, by calling the method via instance rather than Super, you are subtly implying that it will be virtual.
In fact, a developer reading the instance.method() would have to look at the declaration of the method (its signature) to know which method it actually being called. You mention
it seems to me that doing this can actually make my life easier so I don't have to worry about what's static or not
But in the case above context is actually very important!
I can fairly confidently say it was a mistake for the language designers to allow this in the first place. Stay away.
The bad form comment comes from the Coding Conventions for Java
See http://www.oracle.com/technetwork/java/codeconventions-137265.html#587
The reason for it, if you think about it, is that the method, being static, does not belong to any particular object. Because it belongs to the class, why would you want to elevate a particular object to such a special status that it appears to own a method?
In your particular example, you can use an existing integer through which to call parseInt (that is, it is legal in Java) but that puts the reader's focus on that particular integer object. It can be confusing to readers, and therefore the guidance is to avoid this style.
Regarding this making life easier for you the programmer, turn it around and ask what makes life easier on the reader? There are two kinds of methods: instance and static. When you see a the expression C.m and you know C is a class, you know m must be a static method. When you see x.m (where x is an instance) you can't tell, but it looks like an instance method and so most everyone reserves this syntax for instance methods only.
It's a matter of communication. Calling a method on an instance implies you're acting on/with that particular instance, not on/with the instance's class.
It might get super confusing when you have an object that inherits from another object, overriding its static method. Especially if you're referring to the object as a type of its ancestor class. It wouldn't be obvious as to which method you're running.
I know that calling a virtual method from a base class constructor can be dangerous since the child class might not be in a valid state. (at least in C#)
My question is what if the virtual method is the one who initializes the state of the object ? Is it good practice or should it be a two step process, first to create the object and then to load the state ?
First option: (using the constructor to initialize the state)
public class BaseObject {
public BaseObject(XElement definition) {
this.LoadState(definition);
}
protected abstract LoadState(XElement definition);
}
Second option: (using a two step process)
public class BaseObject {
public void LoadState(XElement definition) {
this.LoadStateCore(definition);
}
protected abstract LoadStateCore(XElement definition);
}
In the first method the consumer of the code can create and initialize the object with one statement:
// The base class will call the virtual method to load the state.
ChildObject o = new ChildObject(definition)
In the second method the consumer will have to create the object and then load the state:
ChildObject o = new ChildObject();
o.LoadState(definition);
(This answer applies to C# and Java. I believe C++ works differently on this matter.)
Calling a virtual method in a constructor is indeed dangerous, but sometimes it can end up with the cleanest code.
I would try to avoid it where possible, but without bending the design hugely. (For instance, the "initialize later" option prohibits immutability.) If you do use a virtual method in the constructor, document it very strongly. So long as everyone involved is aware of what it's doing, it shouldn't cause too many problems. I would try to limit the visibility though, as you've done in your first example.
EDIT: One thing which is important here is that there's a difference between C# and Java in order of initialization. If you have a class such as:
public class Child : Parent
{
private int foo = 10;
protected override void ShowFoo()
{
Console.WriteLine(foo);
}
}
where the Parent constructor calls ShowFoo, in C# it will display 10. The equivalent program in Java would display 0.
In C++, calling a virtual method in a base class constructor will simply call the method as if the derived class doesn't exist yet (because it doesn't). So that means that the call is resolved at compile time to whatever method it should call in the base class (or classes it derived from).
Tested with GCC, it allows you to call a pure virtual function from a constructor, but it gives a warning, and results in a link time error. It appears that this behavior is undefined by the standard:
"Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (class.virtual) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined."
With C++ the virtual methods are routed through the vtable for the class that is being constructed. So in your example it would generate a pure virtual method exception since whilst BaseObject is being constructed there simply is no LoadStateCore method to invoke.
If the function is not abstract, but simply does nothing then you will often get the programmer scratching their head for a while trying to remember why it is that the function doesn't actually get called.
For this reason you simply can't do it this way in C++ ...
For C++ the base constructor is called before the derived constructor, which means that the virtual table (which holds the addresses of the derived class's overridden virtual functions) does not yet exist. For this reason, it is considered a VERY dangerous thing to do (especially if the functions are pure virtual in the base class...this will cause a pure-virtual exception).
There are two ways around this:
Do a two-step process of construction + initialization
Move the virtual functions to an internal class that you can more closely control (can make use of the above approach, see example for details)
An example of (1) is:
class base
{
public:
base()
{
// only initialize base's members
}
virtual ~base()
{
// only release base's members
}
virtual bool initialize(/* whatever goes here */) = 0;
};
class derived : public base
{
public:
derived ()
{
// only initialize derived 's members
}
virtual ~derived ()
{
// only release derived 's members
}
virtual bool initialize(/* whatever goes here */)
{
// do your further initialization here
// return success/failure
}
};
An example of (2) is:
class accessible
{
private:
class accessible_impl
{
protected:
accessible_impl()
{
// only initialize accessible_impl's members
}
public:
static accessible_impl* create_impl(/* params for this factory func */);
virtual ~accessible_impl()
{
// only release accessible_impl's members
}
virtual bool initialize(/* whatever goes here */) = 0;
};
accessible_impl* m_impl;
public:
accessible()
{
m_impl = accessible_impl::create_impl(/* params to determine the exact type needed */);
if (m_impl)
{
m_impl->initialize(/* ... */); // add any initialization checking you need
}
}
virtual ~accessible()
{
if (m_impl)
{
delete m_impl;
}
}
/* Other functionality of accessible, which may or may not use the impl class */
};
Approach (2) uses the Factory pattern to provide the appropriate implementation for the accessible class (which will provide the same interface as your base class). One of the main benefits here is that you get initialization during construction of accessible that is able to make use of virtual members of accessible_impl safely.
For C++, section 12.7, paragraph 3 of the Standard covers this case.
To summarize, this is legal. It will resolve to the correct function to the type of the constructor being run. Therefore, adapting your example to C++ syntax, you'd be calling BaseObject::LoadState(). You can't get to ChildObject::LoadState(), and trying to do so by specifying the class as well as the function results in undefined behavior.
Constructors of abstract classes are covered in section 10.4, paragraph 6. In brief, they may call member functions, but calling a pure virtual function in the constructor is undefined behavior. Don't do that.
If you have a class as shown in your post, which takes an XElement in the constructor, then the only place that XElement could have come from is the derived class. So why not just load the state in the derived class which already has the XElement.
Either your example is missing some fundamental information which changes the situation, or there's simply no need to chain back up to the derived class with the information from the base class, because it has just told you that exact information.
i.e.
public class BaseClass
{
public BaseClass(XElement defintion)
{
// base class loads state here
}
}
public class DerivedClass : BaseClass
{
public DerivedClass (XElement defintion)
: base(definition)
{
// derived class loads state here
}
}
Then your code's really simple, and you don't have any of the virtual method call problems.
For C++, read Scott Meyer's corresponding article :
Never Call Virtual Functions during Construction or Destruction
ps: pay attention to this exception in the article:
The problem would almost certainly
become apparent before runtime,
because the logTransaction function is
pure virtual in Transaction. Unless it
had been defined (unlikely, but
possible) the program wouldn't link: the linker would be unable to find the necessary implementation of Transaction::logTransaction.
Usually you can get around these issues by having a greedier base constructor. In your example, you're passing an XElement to LoadState. If you allow the state to be directly set in your base constructor, then your child class can parse the XElement prior to calling your constructor.
public abstract class BaseObject {
public BaseObject(int state1, string state2, /* blah, blah */) {
this.State1 = state1;
this.State2 = state2;
/* blah, blah */
}
}
public class ChildObject : BaseObject {
public ChildObject(XElement definition) :
base(int.Parse(definition["state1"]), definition["state2"], /* blah, blah */) {
}
}
If the child class needs to do a good bit of work, it can offload to a static method.
In C++ it is perfectly safe to call virtual functions from within the base-class - as long as they are non-pure - with some restrictions. However, you shouldn't do it. Better initialize objects using non-virtual functions, which are explicitly marked as being such initialization functions using comments and an appropriate name (like initialize). If it is even declared as pure-virtual in the class calling it, the behavior is undefined.
The version that's called is the one of the class calling it from within the constructor, and not some overrider in some derived class. This hasn't got much to-do with virtual function tables, but more with the fact that the override of that function might belong to a class that's not yet initialized. So this is forbidden.
In C# and Java, that's not a problem, because there is no such thing as a default-initialization that's done just before entering the constructor's body. In C#, the only things that are done outside the body is calling base-class or sibling constructors i believe. In C++, however, initializations done to members of derived classes by the overrider of that function would be undone when constructing those members while processing the constructor initializer list just before entering the constructors body of the derived class.
Edit: Because of a comment, i think a bit of clarification is needed. Here's an (contrived) example, let's assume it would be allowed to call virtuals, and the call would result in an activation of the final overrider:
struct base {
base() { init(); }
virtual void init() = 0;
};
struct derived : base {
derived() {
// we would expect str to be "called it", but actually the
// default constructor of it initialized it to an empty string
}
virtual void init() {
// note, str not yet constructed, but we can't know this, because
// we could have called from derived's constructors body too
str = "called it";
}
private:
string str;
};
That problem can indeed be solved by changing the C++ Standard and allowing it - adjusting the definition of constructors, object lifetime and whatnot. Rules would have to be made to define what str = ...; means for a not-yet constructed object. And note how the effect of it then depends on who called init. The feature we get does not justify the problems we have to solve then. So C++ simply forbids dynamic dispatch while the object is being constructed.