Why all java methods are implicitly overridable? - java

In C++, I have to explicitly specify 'virtual' keyword to make a member function 'overridable', as there involves an overhead of creating virtual tables and vpointers, when a member function is made overridable (so every member function is implicitly not overridable for performance reasons).
It also allows a member function to be hidden (if not overridden) when a subclass provides a separate implementation with the same name and signature.
The same technique is used in C# as well. I am wondering why Java waved away from this behavior and made every method overridable by default and provided the ability to disable overriding behavior on explicit use of 'final' keyword.

The better question might be "Why does C# have non-virtual methods?" Or at the very least, why aren't they virtual by default with the option to flag them as non-virtual?
In C++, there is the idea (as Brian so nicely pointed out) that if you don't want it, you don't pay for it. The problem is that if you do want it, this usually means you end up paying through the nose for it. In most Java implementations, they are designed explicitly for lots of virtual calls; the vtable implementations tend to be fast, scarcely more expensive than non-virtual calls, meaning the primary advantage of non-virtual functions is lost. Furthermore, JIT compilers can inline virtual functions at runtime. As such, for efficiency reasons, there is very little reason actually to use non-virtual functions.
Thus, it largely comes down to the principle of least surprise. It tells us that all methods to behave the same way, not half of them being virtual and half of them being non-virtual. Since we need to have at least some virtual methods to achieve this polymorphism thing, it makes sense to have them all be virtual. Furthermore, having two methods with the same signature is just asking to shoot yourself in the foot.
Polymorphism also dictates that the object itself should have control over what it does. It's behavior should not be determinate on whether the client thinks it's a FooParent or a FooChild.
EDIT: So I'm being called on my assertions. This next paragraph is conjecture on my part, not a statement of fact.
An interesting side effect of all this is that Java programmers tend to use interfaces very heavily. Since the virtual method optimizations make the cost of interfaces essentially non-existent, they allow you to use a List (for example) instead of an ArrayList, and switch it out for a LinkedList at some later date with a simple one-line change and no additional penalty.
EDIT: I'll also pony up a couple sources. While not the original sources, they do come from Sun explaining some of the workings on HotSpot.
Inlining
VTable

Taken from here (#34)
There’s no virtual keyword in Java
because all non-static methods always
use dynamic binding. In Java, the
programmer doesn’t have to decide
whether to use dynamic binding. The
reason virtual exists in C++ is so you
can leave it off for a slight increase
in efficiency when you’re tuning for
performance (or, put another way, "If
you don’t use it, you don’t pay for
it"), which often results in confusion
and unpleasant surprises. The final
keyword provides some latitude for
efficiency tuning – it tells the
compiler that this method cannot be
overridden, and thus that it may be
statically bound (and made inline,
thus using the equivalent of a C++
non-virtual call). These optimizations
are up to the compiler.
A bit circular, perhaps.

So Java's rationale is probably something like this: the whole point of an object-oriented language is that things can be extended. So in terms of pure design, it really makes little sense to treat extensible as the "special case".
Remember that Java has the luxury of compiling at runtime. So some of the performance arguments in C++ compilation go out the window. In C++, if a class might be overridden, then the compiler has to take extra steps. In Java, there's no mystery about it: at any given moment in time, the JVM knows whether or not a particular method/class has been overridden or not, and that's essentially what counts.
Note that the final keyword is essentially about program design, not optimisation. The JVM doesn't need this information to see whether or not a class/method has been overridden!!

If the question is about to ask what is the better approach between java and C++/C# then it was already discussed in opposite direction in another thread, and many resource available on the net
Why C# implements methods as non-virtual by default?
http://www.artima.com/intv/nonvirtual.html
Recent introduction of #Override annotation and its wide adoption in new code, suggest that the exact answer to the question "Why all java methods are implicitly overridable?" is indeed because the designer made a mistake. (And they already fixed it)
Oh ! I'm going to get negative vote for this.

Java tries to move closer to a more dynamic language definition, where everything is an object and everything is a virtual method. It also wants to avoid ambiguity and hard to understand constructs, which it's designers viewed as a flaw in C++, therefore no operator overloading, and in this case no ability to have two public method signatures on one class hierarchy invoking different methods depending on the type of the variable referencing it.
C# is more concerned about the stability of subclasses and making sure that the subclasses behave predictably. C++ is concerned about performance.
Three different design priorities, leading to different choices.

I would say that in Java cost of virtual method is low compared to whole VM costs. In C++ it is significant cost, compared to assembly-like C background. Nobody would decide to make all methods called through pointer by default as result of C to C++ migration. It's too big change.

Related

Can a language ever have compile-time checking but the characteristics of dynamic typing?

Upon reading the following:
A lot of people define static typing and dynamic typing with respect
to the point at which the variable types are checked. Using this
analogy, static typed languages are those in which type checking is
done at compile-time, whereas dynamic typed languages are those in
which type checking is done at run-time.
This analogy leads to the analogy we used above to define static and
dynamic typing. I believe it is simpler to understand static and
dynamic typing in terms of the need for the explicit declaration of
variables, rather than as compile-time and run-time type checking.
Source
I was thinking that the two ways we define static and dynamic typing: compile-time checking and explicit type declaration are a bit like apples and oranges. A characteristic in all statically typed languages (from my knowledge) is the reference variables have a defined type. Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Note: Not exactly type inference in a language like Java, because the variables are still assigned a type, just implicitly. This theoretical language wouldn't have reference types, so there would be no casting. I'm trying to avoid the use of "static typing" vs "dynamic typing" because of the confusion.
There could be, but should there be?
Imagine in hypothetical-pseudo-C++:
class Object
{
public:
virtual Object invoke(const char *name, std::list<Object> args);
virtual Object get_attr(const char *name);
virtual const Object &set_attr(const char *name, const Object &src);
};
And that you have a language that arranges:
to make Object class the root base class of all classes
syntactic sugar to turn blah.frabjugate() into blah.invoke("frabjugate") and
blah.x = 10 into blah.set_attr("x", 10)
Add to this something combining attributes of boost::variant and boost::any and you have a pretty good start. All the dynamicism (both good and runtime bugs bad) of Python with the eloquence and rigidity (yay!) of C++ or Java. With added run-time bloat and efficiency of hash-table lookups vs. call/jmp machine instructions.
In languages like Python, when you call blah.do_it() it has to do potentially multiple hash table lookups of the string "do_it" to find out if your instance blah or its class has a callable thing called "doit" every time it is called. This is the most extreme late-binding that could be imaged:
flarg.do_it() # replaces flarg.do_it()
flarg.do_it() # calls a different flarg.do_it()
You could have your hypothetical language give some control over when the binding occurs. C++-like standard methods are crudely static bound to the apparent reference type, not the real instance type. C++ virtual methods are late-bound to the object instance type. Python-like attributes and methods are extremely late bound to the current version of the object instance.
I think you could definitely program in a strong static typed language in a dynamic style, just as you could build an interpreter in a language like C++ or Java. Some syntax hooks could make it look a little more seamless. But maybe you could do the same in reverse: maybe a Python decorator that automatically checks argument types, or a MetaClass that does it at compile time? [no, I don't think this is possible...]
I think you should view it as a union of features. but you'd get both the best and the worst of both worlds...
Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Actually mostly language have support for both, so yes. The difference is which form is preferred/easier and generally used. Java prefers static types but also supports dynamic casts and reflection.
This theoretical language wouldn't have reference types, so there would be no casting.
You have to consider that language also need to perform reasonably well so you have to consider how they will be implemented. You could have a super type but this makes optimisation very hard and you code will most likely either run slowly or use much more resources.
The more popular languages tend to make pragmatic implementation choices. They are not purely one type or another and are willing to borrow styles even if they don't handle them as cleanly as a "pure" language.
what exactly do they allow the compiler or programmer to do that dynamic types can't?
It is generally accepted that the quicker you find a bug, the cheaper it is to fix. When you first start programming, the cost of maintenance isn't high in your mind, but once you have much more experience you will realise that a successful project costs far more to maintain than it did to develop and fixing long standing bugs can be really costly.
static languages have two advantages
you pick up bugs sooner rather than later. The sooner the better. With dynamic languages you might never discover a bug if the code is never run.
the cost of maintenance is easier. Static languages make clearer the assumption made when the code was first written and are more likely to detect issues if you don't have enough test coverage (btw, you never have enough test coverage)
No you cannot. The difference here boils down to early binding versus late binding. Early binding means matching everything up on the binary level upfront, fixing it in code. The result is rigid, type-safe and fast code. Late binding means there is some kind of runtime interpretation involved. This results in flexiblility (potentially unsafe) at the cost of performance.
The two approaches are different on a technical level (compilation versus interpretation) and the programmer would have to choose which is desired when, which would defeat the benefit of having both in the first place.
In languages that use a (common) language runtime however you do get some of what you are asking for through reflection. But it is organized differently and still type-safe. It is not the implicit kind of binding you refer to but requires a bit of work and awareness from the programmer.
As far as what is possible with static types that is impossible with dynamic types: nothing. They are both Turing complete
The value of static types is finding bugs early. In Python, something as simple as a misspelled name isn't caught until you run the program, and even then only if the line of code with the misspelling is run.
class NuclearReactor():
def turn_power_off(self):
...
def shut_down_cleanly(self):
self.turn_power_of()

Do private functions use more or less computer resources than public ones?

Computer resources being RAM, possessing power, and disk space. I am just curious, even though it is more or less by a tiny itty-bitty amount.
It could, in theory, be a hair faster in some cases. In practice, they're equally fast.
Non-static, non-public methods are invoked using the invokevirtual bytecode op. This opcode requires the JVM to dynamically look up the actual's method resolution: if you have a call that's statically compiled to AbstractList::contains, should that resolve to ArrayList::contains, or LinkedList::contains, etc? What's more, the compiler can't just reuse the result of this compilation for next time; what if the next time that myList.contains(val) gets called, it's on a different implementation? So, the compiler has to do at least some amount of checking, roughly per-invocation, for non-private methods.
Private methods can't be overridden, and they're invoked using invokespecial. This opcode is used for various kind of method calls that you can resolve just once, and then never change: constructors, call to super methods, etc. For instance, if I'm in ArrayList::add and I call super.add(value) (which doesn't happen there, but let's pretend it did), then the compiler can know for sure that this refers to AbstractList::add, since a class's super class can't ever change.
So, in very rough terms, an invokevirtual call requires resolving the method and then invoking it, while an invokespecial call doesn't require resolving the method (after the first time it's called -- you have to resolve everything at least once!).
This is covered in the JVM spec, section 5.4.3:
Resolution of the symbolic reference of one occurrence of an invokedynamic instruction does not imply that the same symbolic reference is considered resolved for any other invokedynamic instruction.
For all other instructions above, resolution of the symbolic reference of one occurrence of an instruction does imply that the same symbolic reference is considered resolved for any other non-invokedynamic instruction.
(empahsis in original)
Okay, now for the "but you won't notice the difference" part. The JVM is heavily optimized for virtual calls. It can do things like detecting that a certain site always sees an ArrayList specifically, and so "staticify" the List::add call to actually be ArrayList::add. To do this, it needs to verify that the incoming object really is the expected ArrayList, but that's very cheap; and if some earlier method call has already done that work in this method, it doesn't need to happen again. This is called a monomorphic call site: even though the code is technically polymorphic, in practice the list only has one form.
The JVM optimizes monomorphic call sites, and even bimorphic call sites (for instance, the list is always an ArrayList or a LinkedList, never anything else). Once it sees three forms, it has to use a full polymorphic dispatch, which is slower. But then again, at that point you're comparing apples to oranges: a non-private, polymorphic call to a private call that's monomorphic by definition. It's more fair to compare the two kinds of monomorphic calls (virtual and private), and in that case you'll probably find that the difference is minuscule, if it's even detectible.
I just did a quick JMH benchmark to compare (a) accessing a field directly, (b) accessing it via a public getter and (c) accessing it via a private getter. All three took the same amount of time. Of course, uber-micro benchmarks are very hard to get right, because the JIT can do such wonderful things with optimizations. Then again, that's kind of the point: The JIT does such wonderful things with optimizations that public and private methods are just as fast.
Do private functions use more or less computer resources than public ones?
No. The JVM uses the same resources regardless of the access modifier on individual fields or methods.
But, there is a far better reason to prefer private (or protected) beside resource utilization; namely encapsulation. Also, I highly recommend you read The Developer Insight Series: Part 1 - Write Dumb Code.
I am just curious, even though it is more or less by a tiny itty-bitty amount.
While it is good to be curious ... if you start taking this kind of thing into account when you are programming, then:
you are liable to waste a lot of time looking for micro-optimizations that are not needed,
your code is liable to be unmaintainable because you are sacrificing good design principles, and
you even risk making your code less efficient* than it would be if you didn't optimize.
* - It it can go like this. 1) You spend a lot of time tweaking your code to run fast on your test platform. 2) When you run on the production platform, you find that the hardware gives you different performance characteristics. 3) You upgrade the Java installation, and the new JVM's JIT compiler optimizes your code differently, or it has a bunch of new optimizations that are inhibited by your tweaks. 4) When you run your code on real-world workloads, you discover that the assumption that were the basis for your tweaking are invalid.

Scala closures compared to Java innerclasses -> final VS var

I've first asked this question about the use of final with anonymous inner classes in Java:
Why do we use final keyword with anonymous inner classes?
I'm actually reading the Scala book of Martin Odersky. It seems Scala simplifies a lot of Java code, but for Scala closures I could notice a significant difference.
While in Java we "simulate" closures with an anonymous inner class, capturing a final variable (which will be copied to live on the heap instead of the stack) , it seems in Scala we can create a closure which can capture a val, but also a var, and thus update it in the closure call!
So it is like we can use a Java anonymous innerclass without the final keyword!
I've not finished reading the book, but for now i didn't find enough information on this language design choice.
Can someone tell me why Martin Odersky, who really seems to take care of function's side effects, choose closures to be able to capture both val and var, instead of only val?
What are the benefits and drawbacks of Java and Scala implementations?
Thanks
Related question:
With Scala closures, when do captured variables start to live on the JVM heap?
An object can be seen a bag of closures that share access to the same environment and that environment is usually mutable. So why make closures generated from anonymous functions less powerful?
Also, other languages with mutable variables and anonymous functions work the same way. Principle of lease astonishment. Java is actually WEIRD in not allowing mutable variables to be captured by inner classes.
And sometimes they're just darn useful. For example a self modifying thunk to create your own variant of lazy or future processing.
Downsides? They have all the downsides of shared mutable state.
Here are some benefits and drawbacks.
There is a principle in language design that the cost of something should be apparent to the programmer. (I first saw this in Holt's Design and Definition of the Turing Language, but I forget the name he gave it.) Two things that look the same should cost the same. Two local vars should have similar costs. This is in Java's favour. Two local vars in Java are implemented the same and so cost the same regardless of whether one of them is mentioned in an inner class. Another point in Java's favour is that in most cases the captured variable is indeed final, so there is little cost to the programmer to be prevented from capturing nonfinal local vars. Also the insistence on final simplifies the compiler, since it means that all local variables are stack variables.
There is another principle of language design that says be orthogonal. If a val can be captured why not a var. As long as there is a sensible interpretation, why put in restrictions. When languages are not orthogonal enough, they seems perverse. On the other-hand, languages that have too much orthogonality may have complex (and hence buggy and/or late and/or few) implementations. For example Algol 68 had orthogonality in spades, but implementing it was not easy, meaning few implementations and little uptake. Pascal (designed at about the same time) had all sorts of inelegant restrictions that made writing compilers easier. The result was lots of implementations and lots of uptake.

How are java interfaces implemented internally? (vtables?)

C++ has multiple inheritance. The implementation of multiple inheritance at the assembly level can be quite complicated, but there are good descriptions online on how this is normally done (vtables, pointer fixups, thunks, etc).
Java doesn't have multiple implementation inheritance, but it does have multiple interface inheritance, so I don't think a straight forward implementation with a single vtable per class can implement that. How does java implement interfaces internally?
I realize that contrary to C++, Java is Jit compiled, so different pieces of code might be optimized differently, and different JVMs might do things differently. So, is there some general strategy that many JVMs follow on this, or does anyone know the implementation in a specific JVM?
Also JVMs often devirtualize and inline method calls in which case there are no vtables or equivalent involved at all, so it might not make sense to ask about actual assembly sequences that implement virtual/interface method calls, but I assume that most JVMs still keep some kind of general representation of classes around to use if they haven't been able to devirtualize everything. Is this assumption wrong? Does this representation look in any way like a C++ vtable? If so do interfaces have separate vtables and how are these linked with class vtables? If so can object instances have multiple vtable pointers (to class/interface vtables) like object instances in C++ can? Do references of a class type and an interface type to the same object always have the same binary value or can these differ like in C++ where they require pointer fixups?
(for reference: this question asks something similar about the CLR, and there appears to be a good explanation in this msdn article though that may be outdated by now. I haven't been able to find anything similar for Java.)
Edit:
I mean 'implements' in the sense of "How does the GCC compiler implement integer addition / function calls / etc", not in the sense of "Java class ArrayList implements the List interface".
I am aware of how this works at the JVM bytecode level, what I want to know is what kind of code and datastructures are generated by the JVM after it is done loading the class files and compiling the bytecode.
The key feature of the HotSpot JVM is inline caching.
This doesn't actually mean that the target method is inlined, but means that an assumption
is put into the JIT code that every future call to the virtual or interface method will target
the very same implementation (i.e. that the call site is monomorphic). In this case, a
check is compiled into the machine code whether the assumption actually holds (i.e. whether
the type of the target object is the same as it was last time), and then transfer control
directly to the target method - with no virtual tables involved at all. If the assertion fails, an attempt may be made to convert this to a megamorphic call site (i.e. with multiple possible types); if this also fails (or if it is the first call), a regular long-winded lookup is performed, using vtables (for virtual methods) and itables (for interfaces).
Edit: The Hotspot Wiki has more details on the vtable and itable stubs. In the polymorphic case, it still puts an inline cache version into the call site. However, the code actually is a stub that performs a lookup in a vtable, or an itable. There is one vtable stub for each vtable offset (0, 1, 2, ...). Interface calls add a linear search over an array of itables before looking into the itable (if found) at the given offset.

Why should virtual functions not be used excessively?

I just read that we should not use virtual function excessively. People felt that less virtual functions tends to have fewer bugs and reduces maintenance.
What kind of bugs and disadvantages can appear due to virtual functions?
I'm interested in context of C++ or Java.
One reason I can think of is virtual function may be slower than normal functions due to v-table lookup.
You've posted some blanket statements that I would think most pragmatic programmers would shrug off as being misinformed or misinterpreted. But, there do exist anti-virtual zealots, and their code can be just as bad for performance and maintenance.
In Java, everything is virtual by default. Saying you shouldn't use virtual functions excessively is pretty strong.
In C++, you must declare a function virtual, but it's perfectly acceptable to use them when appropriate.
I just read that we should not use virtual function excessively.
It's hard to define "excessively"... certainly "use virtual functions when appropriate" is good advice.
People felt that less virtual functions tends to have fewer bugs and reduces maintenance.
I'm not able to get what kind of bugs and disadvantages can appear due to virtual functions.
Poorly designed code is hard to maintain. Period.
If you're a library maintainer, debugging code buried in a tall class hierarchy, it can be difficult to trace where code is actually being executed, without the benefit of a powerful IDE, it's often hard to tell just which class overrides the behavior. It can lead to a lot of jumping around between files tracing inheritance trees.
So, there are some rules of thumb, all with exceptions:
Keep your hierarchies shallow. Tall trees make for confusing classes.
In c++, if your class has virtual functions, use a virtual destructor (if not, it's probably a bug)
As with any hierarchy, keep to a 'is-a' relationship between derived and base classes.
You have to be aware, that a virtual function may not be called at all... so don't add implicit expectations.
There's a hard-to-argue case to be made that virtual functions are slower. It's dynamically bound, so it's often the case. Whether it matters in most of the cases that its cited is certainly debatable. Profile and optimize instead :)
In C++, don't use virtual when it's not needed. There's semantic meaning involved in marking a function virtual - don't abuse it. Let the reader know that "yes, this may be overridden!".
Prefer pure virtual interfaces to a hierarchy that mixes implementation. It's cleaner and much easier to understand.
The reality of the situation is that virtual functions are incredibly useful, and these shades of doubt are unlikely coming from balanced sources - virtual functions have been widely used for a very long time. More newer languages are adopting them as the default than otherwise.
Virtual functions are slightly slower than regular functions. But that difference is so small as to not make a difference in all but the most extreme circumstances.
I think the best reason to eschew virtual functions is to protect against interface misuse.
It's a good idea to write classes to be open for extension, but there's such a thing as too open. By carefully planning which functions are virtual, you can control (and protect) how a class can be extended.
The bugs and maintenance problems appear when a class is extended such that it breaks the contract of the base class. Here's an example:
class Widget
{
private WidgetThing _thing;
public virtual void Initialize()
{
_thing = new WidgetThing();
}
}
class DoubleWidget : Widget
{
private WidgetThing _double;
public override void Initialize()
{
// Whoops! Forgot to call base.Initalize()
_double = new WidgetThing();
}
}
Here, DoubleWidget broke the parent class because Widget._thing is null. There's a fairly standard way to fix this:
class Widget
{
private WidgetThing _thing;
public void Initialize()
{
_thing = new WidgetThing();
OnInitialize();
}
protected virtual void OnInitialize() { }
}
class DoubleWidget : Widget
{
private WidgetThing _double;
protected override void OnInitialize()
{
_double = new WidgetThing();
}
}
Now Widget won't run into a NullReferenceException later.
Every dependency increases complexity of the code, and makes it more difficult to maintain. When you define your function as virtual, you create dependency of your class on some other code, that might not even exist at the moment.
For example, in C, you can easily find what foo() does - there's just one foo(). In C++ without virtual functions, it's slightly more complicated: you need to explore your class and its base classes to find which foo() we need. But at least you can do it deterministically in advance, not in runtime. With virtual functions, we can't tell which foo() is executed, since it can be defined in one the subclasses.
(Another thing is the performance issue that you mentioned, due to v-table).
I suspect you misunderstood the statement.
Excessively is a very subjective term, I think that in this case it meant "when you don't need it", not that you should avoid it when it can be useful.
In my experience, some students, when they learn about virtual functions and get burned the first time by forgetting to make a function virtual, think that it is prudent to simply make every function virtual.
Since virtual functions do incur a cost on every method invocation (which in C++ cannot usually be avoided because of separate compilation), you are essentially paying now for every method call and also preventing inlining. Many instructors discourage students from doing this, though the term "excessive" is a very poor choice.
In Java, a "virtual" behavior (dynamic dispatching) is the default. However, The JVM can optimize things on the fly, and could theoretically eliminate some of the virtual calls when the target identity is clear. In additional, final methods or methods in final classes can often be resolved to a single target as well at compile time.
In C++: --
Virtual functions have a slight performance penalty. Normally it is too small to make any difference but in a tight loop it might be significant.
A virtual function increases the size of each object by one pointer. Again this is typically insignificant, but if you create millions of small objects it could be a factor.
Classes with virtual functions are generally meant to be inherited from. The derived classes may replace some, all or none of the virtual functions. This can create additional complexity and complexity is the programmers mortal enemy. For example, a derived class may poorly implement a virtual function. This may break a part of the base class that relies on the virtual function.
Now let me be clear: I am not saying "don't use virtual functions". They are a vital and important part of C++. Just be aware of the potential for complexity.
We recently had a perfect example of how misuse of virtual functions introduces bugs.
There is a shared library that features a message handler:
class CMessageHandler {
public:
virtual void OnException( std::exception& e );
///other irrelevant stuff
};
the intent is that you can inherit from that class and use it for custom error handling:
class YourMessageHandler : public CMessageHandler {
public:
virtual void OnException( std::exception& e ) { //custom reaction here }
};
The error handling mechanism uses a CMessageHandler* pointer, so it doesn't care of the actual type of the object. The function is virtual, so whenever an overloaded version exists the latter is called.
Cool, right? Yes, it was until the developers of the shared library changed the base class:
class CMessageHandler {
public:
virtual void OnException( const std::exception& e ); //<-- notice const here
///other irrelevant stuff
};
... and the overloads just stopped working.
You see what happened? After the base class was changed the overloads stopped to be the overloads from C++ point of view - they became new, other, unrelated functions.
The base class had the default implementation not marked as pure virtual, so the derived classes were not forced to overload the default implementation. And finally the functon was only called in case of error handling which isn't used every here and there. So the bug was silently introduced and lived unnoticed for a quite long period.
The only way to eliminate it once and for all was to do a search on all the codebase and edit all the relevant pieces of code.
I dont know where you read that, but imho this is not about performance at all.
Maybe its more about "prefer composition about inheritance" and problems which can occur if your classes/methods are not final (im talking mostly java here) but not really designed for reuse. There are many things which can go really wrong:
Maybe you use virtual methods in your
constructor - once theyre overridden,
your base class calls the overridden
method, which may use ressources
initialized in the subclass
constructor - which runs later (NPE rises).
Imagine an add and an addAll method
in a list class. addAll calls add
many times and both are virtual.
Somebody may override them to count
how many items have been added at
all. If you dont document that addAll
calls add, the developer may (and
will) override both add and addAll
(and add some counter++ stuff to
them). But now, if you youse addAll,
each item is count twice (add and
addAll) which leads to incorrect
results and hard to find bugs.
To sum this up, if you dont design your class for being extended (provide hooks, document some of the important implementation things), you shouldnt allow inheritance at all because this can lead to mean bugs. Also its easy to remove a final modifier (and maybe redesign it for reuseability) from one of your classes if needed, but its impossible to make a non-final class (where subclassing lead to errors) final because others may have subclassed it already.
Maybe it was really about performance, then im at least off topic. But if it wasnt, there you have some good reasons not to make your classes extendable if you dont really need it.
More information about stuff like that in Blochs Effective Java (this particular post was written a few days after I read item 16 ("prefer composition over inheritance") and 17 ("design and document for inheritance or else prohibit it") - amazing book.
I worked sporadically as a consultant on the same C++ system over a period of about 7 years, checking on the work of about 4-5 programmers. Every time I went back the system had gotten worse and worse. At some point somebody decided to remove all the virtual functions and replace them with a very obtuse factory/RTTI-based system that essentially did everything the virtual functions were already doing but worse, more expensively, thousands of lines more code, lots of work, lots of testing, ... Completely and utterly pointless, and clearly fear-of-the-unknown-driven.
They had also hand-written dozens of copy constructors, with errors, when the compiler would have produced them automatically, error-free, with about three exceptions where a hand-written version was required.
Moral: don't fight the language. It gives you things: use them.
The virtual table gets created for each class, having virtual functions or deriving from a class containing virtual functions. This consumes more than usual space.
The compiler needs to silently insert extra code for ensuring that the late binding takes place instead of the early binding. This consumes more than usual time.
In Java, there is no virtual keyword, but all methods (functions) are virtual, except the ones marked as final, static methods and private instance methods. Using virtual functions is not a bad practice at all, but because generally they cannot be resolved in compile-time, and compiler can't perform optimizations on them, they tend to be a little slower. The JVM has to figure out at run-time, which is the exact method that needs to be called. Note that this is not a big problem by any means, and you should consider it only if your goal is to create a very high-performance application.
For example, one of the biggest optimizations in Apache Spark 2 (which runs on JVM) was to reduce number of virtual function dispatches, to gain a better performance.

Categories