Virtual Mechanism in C++ and Java [duplicate] - java

In Java:
class Base {
    public Base() { System.out.println("Base::Base()"); virt(); }
    void virt() { System.out.println("Base::virt()"); }
}

class Derived extends Base {
    public Derived() { System.out.println("Derived::Derived()"); virt(); }
    void virt() { System.out.println("Derived::virt()"); }
}

public class Main {
    public static void main(String[] args) {
        new Derived();
    }
}
This will output
Base::Base()
Derived::virt()
Derived::Derived()
Derived::virt()
However, in C++ the result is different:
Base::Base()
Base::virt() // ← Not Derived::virt()
Derived::Derived()
Derived::virt()
(See http://www.parashift.com/c++-faq-lite/calling-virtuals-from-ctors.html for C++ code)
What causes such a difference between Java and C++? Is it due to when the vtable is initialized?
EDIT: I do understand Java and C++ mechanisms. What I want to know is the insights behind this design decision.

Both approaches clearly have disadvantages:
In Java, the call goes to a method which cannot use this properly because its members haven't been initialised yet.
In C++, an unintuitive method (i.e. not the one in the derived class) is called if you don't know how C++ constructs classes.
Why each language does what it does is an open question, but both could probably claim to be the "safer" option: C++'s way prevents the use of uninitialised members; Java's approach allows polymorphic semantics (to some extent) inside a class's constructor (which is a perfectly valid use case).
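To make the Java pitfall concrete, here is a minimal sketch (class and field names are made up): the overridden method runs before the derived class's field initializers, so it sees default values.

class Base {
    Base() {
        describe();                         // dynamically dispatched to Derived.describe()
    }
    void describe() {
        System.out.println("Base");
    }
}

class Derived extends Base {
    private String name = "derived";        // initializer runs only after Base() returns

    @Override
    void describe() {
        System.out.println("name = " + name);   // prints "name = null" during construction
    }
}

// new Derived() prints "name = null": the override runs on a not-yet-initialised Derived.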

Well, you have already linked to the FAQ's discussion, but that is mainly problem-oriented and does not go into the rationale, the why.
In short, it’s for type safety.
This is one of the few cases where C++ beats Java and C# on type safety. ;-)
When you create a class A, in C++ you can let each A constructor initialize the new instance so that all common assumptions about its state, called the class invariant, hold. For example, part of a class invariant can be that a pointer member points to some dynamically allocated memory. When each publicly available method preserves the class invariant, then it’s guaranteed to hold also on entry to each method, which greatly simplifies things – at least for a well-chosen class invariant!
No further checking is then necessary in each method.
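To illustrate the class-invariant idea itself, here is a minimal sketch (in Java, the language of the rest of this page; the class is made up): the constructor establishes the invariant once, and every public method can then rely on it without re-checking.

final class NonEmptyBuffer {
    private final int[] data;   // invariant: data != null && data.length > 0

    NonEmptyBuffer(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.data = new int[size];   // invariant established here, once
    }

    int first() {
        return data[0];              // no null/empty check needed: the invariant holds
    }
}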
In contrast, with two-phase initialization, as used in Microsoft's MFC and ATL libraries, you can never be quite sure whether everything has been properly initialized when a method (non-static member function) is called. This is very similar to Java and C#, except that in those languages the lack of class-invariant guarantees comes from the languages merely enabling, but not actively supporting, the concept of a class invariant. In short, Java and C# virtual methods called from a base class constructor can be called on a derived instance that has not yet been initialized, where the (derived) class invariant has not yet been established!
So, this C++ language support for class invariants is really great, helping do away with a lot of checking and a lot of frustrating perplexing bugs.
However, it makes it a bit difficult to do derived-class-specific initialization in a base class constructor, e.g. doing general things in a topmost GUI Widget class's constructor.
The FAQ item “Okay, but is there a way to simulate that behavior as if dynamic binding worked on the this object within my base class's constructor?” goes a little into that.
For a more full treatment of the most common case, see also my blog article “How to avoid post-construction by using Parts Factories”.

Regardless of how it's implemented, it's a difference in what the language definition says should happen. Java allows you to call functions on a derived object that hasn't been fully initialized (it has been zero-initialized, but its constructor has not run). C++ doesn't allow that; until the derived class's constructor has run, there is no derived class.

Hopefully this will help:
When your line new Derived() executes, the first thing that happens is the memory allocation. The program will allocate a chunk of memory big enough to hold both the members of Base and Derived. At this point, there is no object. It's just uninitialized memory.
When Base's constructor has completed, the memory will contain an object of type Base, and the class invariant for Base should hold. There is still no Derived object in that memory.
During the construction of Base, the Base object is in a partially-constructed state, but the language rules trust you enough to let you call your own member functions on a partially-constructed object. The Derived object isn't partially constructed. It doesn't exist.
Your call to the virtual function ends up calling the base class's version because at that point in time, Base is the most derived type of the object. If it were to call Derived::virt, it would be invoking a member function of Derived with a this pointer that is not of type Derived, breaking type safety.
Logically, a class is something that gets constructed, has functions called on it, and then gets destroyed. You can't call member functions on an object that hasn't been constructed, and you can't call member functions on an object after it's been destroyed. This is fairly fundamental to OOP, the C++ language rules are just helping you avoid doing things that break this model.

In Java, method invocation is dispatched on the runtime type of the object, which is why it behaves like that (I don't know much about C++).
Here your object is of type Derived, so the JVM invokes the method on the Derived object.
If I understand the virtual concept correctly, the closest equivalent in Java is abstract; your code right now is not really "virtual" code in Java terms.
Happy to update my answer if something is wrong.

Actually I want to know what the insight behind this design decision is.
It may be that in Java, every type derives from Object, every Object is some kind of leaf type, and there's a single JVM in which all objects are constructed.
In C++, many types aren't virtual at all. Furthermore, in C++ the base class and the subclass can be compiled to machine code separately: so the base class does what it does without knowing whether it's a superclass of something else.

Constructors are not polymorphic in either C++ or Java, whereas a method can be polymorphic in both languages. This means that when a polymorphic method appears inside a constructor, the designers were left with two choices:
Either strictly conform to the semantics of the non-polymorphic constructor, and thus treat any polymorphic method invoked within a constructor as non-polymorphic. This is how C++ does it§.
Or compromise the strict semantics of the non-polymorphic constructor and adhere to the strict semantics of a polymorphic method, so that polymorphic methods called from constructors are always polymorphic. This is how Java does it.
Since neither strategy offers any real benefit over the other, and the Java way of doing it reduces a lot of overhead (no need to special-case polymorphism in the context of constructors), and since Java was designed after C++, I would presume the designers of Java opted for the second option, seeing the benefit of less implementation overhead.
Added on 21-Dec-2016
§Since the statement "method invoked within a constructor as non-polymorphic... This is how C++ does it" might be confusing without careful scrutiny of the context, I'm adding a formalization to qualify precisely what I meant.
If class C has a direct definition of some virtual function F and its ctor has an invocation of F, then any (indirect) invocation of C's ctor on an instance of a child class T will not influence the choice of F; in fact, C::F will always be invoked from C's ctor. In this sense, the invocation of virtual F is less polymorphic (compared to, say, Java, which will choose F based on T).
Further, it is important to note that if C inherits the definition of F from some parent P and has not overridden F, then C's ctor will invoke P::F, and even this, IMHO, can be determined statically.

Related

Could we have Polymorphism without forcing classes to implement an interface?

Assume that we have an interface called Animal that has two methods called move() and makeSound().
This means we can send the messages move() and makeSound() to a variable of type Animal, and we can only assign objects of classes that implement Animal to a variable of type Animal.
Now my question is, could Java have not forced classes that want to use polymorphism to implement an interface?
For example, why didn't Java implement polymorphism like the following:
We would just create an Animal interface, and then we would be able to assign whatever object we want to a variable of type Animal as long as that object has the methods move() and makeSound(), for example:
Animal animal1;
/* The Java compiler will check if Dog has the methods move() and makeSound();
   if yes, then compile, if no, then show a compilation error */
animal1 = new Dog();
animal1.move();
animal1.makeSound();
Note: I took Java as an example, but I am talking in general about all OOP languages. Also, I know that we can have Polymorphism using a subclass that inherits from a superclass (but this is basically the same idea as using an interface).
There are a number of different ways to get polymorphism. The one you are most familiar with is inclusion polymorphism (also known as subtype polymorphism), where the programmer explicitly says "X is-a Y" via some sort of extends clause. You see this in Java and C#; both give you the choice of having such an is-a for both representation and API (extends), or only for API (implements).
There is also parametric polymorphism, which you have probably seen as generics: defining a family of types Foo<T> with a single declaration. You see this in Java/C#/Scala (generics), C++ (templates), Haskell (type classes), etc.
Some languages have "duck typing", where, rather than looking at the declaration ("X is-a Y"), they are willing to determine typing structurally. If a contract says "to be an Iterator, you have to have hasNext() and next() methods", then under this interpretation, any class that provides these two methods is an Iterator, regardless of whether it said so or not. This comports with the case you describe; this was a choice open to the Java designers.
Languages with pattern matching or runtime reflection can exhibit a form of ad-hoc polymorphism (also known as data-driven polymorphism), where you can define polymorphic behavior over unrelated types, such as:
int length(Object o) {
    return switch (o) {
        case String s -> s.length();
        case Object[] os -> os.length;
        case Collection c -> c.size();
        ...
    };
}
Here, length is polymorphic over an ad-hoc set of types.
It is also possible to have an explicit declaration of "X is-a Y" without putting this in the declaration of X. Haskell's type classes do this, where, rather than X declaring "I'm a Y", there's a separate declaration of an instance that explicitly says "X is a Y (and here is how to map X functionality to Y functionality if it is not obvious to the compiler.)" Such instances are often called witnesses; it is a witness to the Y-hood of X. Clojure's protocols are similar, and Scala's implicit parameters play a similar role ("find me a witness to CanCopyFrom[A,B], or fail at compile time").
The point of all this is that there are many ways to get polymorphism, and some languages pick their favorite, others support more than one, etc.
If your question is why did Java choose explicit subtyping rather than duck typing, the answer is fairly clear: Java was a language designed for building large systems (as was C++) out of components, and components want strong checking at their boundaries. A loosey-goosey match because the two sides happen to have methods with the same name is a less reliable means of establishing programmer intent than an explicit declaration. Additionally, one of the core design principles of the Java language is "reading code is more important than writing code." It may be more work to declare "implements Iterator" (but not a lot more), but it makes it much more clear to readers what your design intent was.
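A classic illustration of that reliability argument, as a small sketch (the interfaces and class are made up): two same-named methods with unrelated meanings, which a purely structural match would happily conflate, but which nominal typing keeps apart because the class states which contract it intends to honour.

interface Shape {
    void draw();   // render to the screen
}

interface Cowboy {
    void draw();   // pull out a revolver
}

// Under Java's nominal typing, Circle is a Shape only because it says so:
class Circle implements Shape {
    @Override
    public void draw() {
        System.out.println("rendering a circle");
    }
}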
So, this is a tradeoff of what we might now call "ceremony" for greater reliability and more clear capture of design intent.
The approach you're describing is called "structural subtyping", and it is not only possible, but actually in use; for example, it is used by Go and TypeScript.
Per the Go Programming Language Specification:
A variable of interface type can store a value of any type with a method set that is any superset of the interface. […]
A type implements any interface comprising any subset of its methods and may therefore implement several distinct interfaces. For instance, all types implement the empty interface:
interface{}
[link]
Per the TypeScript documentation:
Type compatibility in TypeScript is based on structural subtyping. Structural typing is a way of relating types based solely on their members. This is in contrast with nominal typing. Consider the following code:
interface Named {
    name: string;
}

class Person {
    name: string;
}

let p: Named;
// OK, because of structural typing
p = new Person();
In nominally-typed languages like C# or Java, the equivalent code would be an error because the Person class does not explicitly describe itself as being an implementer of the Named interface.
[link]
Note: I took Java as an example, but I am talking in general about all OOP languages.
I'm not sure it's possible to talk "in general about all OOP languages", because there are so many, and they work in many different ways. Your question makes sense for Java, but it wouldn't make sense for Go or TypeScript (since as you see, it has exactly the feature you'd be claiming it doesn't), nor for non-statically-typed OO as in Python or JavaScript (since they don't have the notion of "a variable of type Animal").
ETA: In a follow-up comment, you write:
Since it was possible for Java to not force classes to [explicitly] implement an interface, then why did Java force classes to [explicitly] implement an interface?
I can't say for certain; the first edition of the Java Language Specification [link] explicitly called this out, but didn't indicate the rationale:
It is not sufficient that the class happen to implement all the abstract methods of the interface; the class or one of its superclasses must actually be declared to implement the interface, or else the class is not considered to implement the interface. [p. 183]
However, I think the main reason was probably that interfaces are intended to have a meaning, which often goes beyond what's explicitly indicated in the method signatures. For example:
java.util.List, in addition to specifying various methods of its own, also specifies the behavior of equals and hashCode, instructing implementations to override the implementations provided by java.lang.Object and implement the specified behavior. If it were possible to "accidentally" implement java.util.List, then that instruction would be meaningless, because implementations might not even "know" that they were implementations.
java.io.Serializable has no methods at all; it's just a "marker" interface to tell the Java Serialization API that this class is OK with being serialized and deserialized. In Go, such an interface would be meaningless, because every type would automatically implement it.
Some other (IMHO less-significant) possible reasons:
Java method signatures are a bit more complicated than Go method signatures, in that they can also declare exceptions, and in that Java allows method overloading (multiple methods with the same name but different signatures). These features make it more likely that a class accidentally fails to implement an interface it's supposed to. When that happens, Java's current approach means that you get a single error-message in the place where you define the class, instead of hundreds of error-messages throughout your program in every place where you've written Animal animal = new Cat().
Interfaces are allowed to have static fields, which classes inherit (rather than needing to implement). I'm not sure how this would work if classes didn't explicitly indicate which interfaces they implement.
The current approach allows the subtyping relationship to be determined completely at compile-time; by contrast, if something like Animal animal = (Animal) obj; or if (obj instanceof Animal) were allowed, then the runtime would need to analyze obj's runtime-type on the fly to determine if it conforms to the Animal interface. (This also means that adding a method to the Animal interface could potentially cause runtime failures rather than compile-time failures.)
Even just within the compiler, the current approach may simplify some things by letting the compiler verify in one place that the class satisfies the interface, and then just use that fact everywhere that an implicit or explicit conversion appears. (This is related to my comment above about clearer error-messages.)
. . . but, again, this is just me speculating. I think I'm probably in the right ballpark, but a lot of things go into language design, and there could easily have been major considerations that would never occur to me.
Yes, this behavior is implemented with a structural type system. As expressed in a different answer, Go is one language which supports structural typing.
By checking the language comparison by type system wiki, you can find other languages which support a structural type system.
Java uses a nominal type system, which requires types to be explicitly defined.
Asking why Java uses a nominal type system would be like asking why it is statically typed. There are pros and cons to both, and language designers choose which strategies fit the philosophy of the language.

Why is "final" not allowed in Java 8 interface methods?

One of the most useful features of Java 8 is the new default methods on interfaces. There are essentially two reasons (there may be others) why they have been introduced:
Providing actual default implementations. Example: Iterator.remove()
Allowing for JDK API evolution. Example: Iterable.forEach()
From an API designer's perspective, I would have liked to be able to use other modifiers on interface methods, e.g. final. This would be useful when adding convenience methods, preventing "accidental" overrides in implementing classes:
interface Sender {
    // Convenience method to send an empty message
    default final void send() {
        send(null);
    }

    // Implementations should only implement this method
    void send(String message);
}
The above is already common practice if Sender were a class:
abstract class Sender {
    // Convenience method to send an empty message
    final void send() {
        send(null);
    }

    // Implementations should only implement this method
    abstract void send(String message);
}
Now, default and final are obviously contradicting keywords, but the default keyword itself would not have been strictly required, so I'm assuming that this contradiction is deliberate, to reflect the subtle differences between "class methods with body" (just methods) and "interface methods with body" (default methods), i.e. differences which I have not yet understood.
At some point in time, support for modifiers like static and final on interface methods had not yet been fully explored; citing Brian Goetz:
The other part is how far we're going to go to support class-building
tools in interfaces, such as final methods, private methods, protected
methods, static methods, etc. The answer is: we don't know yet
Since that time in late 2011, obviously, support for static methods in interfaces was added. Clearly, this added a lot of value to the JDK libraries themselves, such as with Comparator.comparing().
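As an aside, here is a small usage sketch of such a static interface method (the list contents are made up): Comparator.comparing builds a comparator from a key-extractor function.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ComparingDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("banana", "Apple", "cherry"));

        // Comparator.comparing is a static interface method (Java 8+); here the key
        // extractor maps each string to its lower-case form for a case-insensitive sort.
        names.sort(Comparator.comparing(String::toLowerCase));
        System.out.println(names);   // [Apple, banana, cherry]
    }
}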
Question:
What is the reason final (and also static final) never made it to Java 8 interfaces?
This question is, to some degree, related to What is the reason why “synchronized” is not allowed in Java 8 interface methods?
The key thing to understand about default methods is that the primary design goal is interface evolution, not "turn interfaces into (mediocre) traits". While there's some overlap between the two, and we tried to be accommodating to the latter where it didn't get in the way of the former, these questions are best understood when viewed in this light. (Note too that class methods are going to be different from interface methods, no matter what the intent, by virtue of the fact that interface methods can be multiply inherited.)
The basic idea of a default method is: it is an interface method with a default implementation, and a derived class can provide a more specific implementation. And because the design center was interface evolution, it was a critical design goal that default methods be able to be added to interfaces after the fact in a source-compatible and binary-compatible manner.
The too-simple answer to "why not final default methods" is that then the body would not simply be the default implementation, it would be the only implementation. While that's a little too simple an answer, it gives us a clue that the question is already heading in a questionable direction.
Another reason why final interface methods are questionable is that they create impossible problems for implementors. For example, suppose you have:
interface A {
    default void foo() { ... }
}

interface B {
}

class C implements A, B {
}
Here, everything is good; C inherits foo() from A. Now suppose B is changed to have a foo method, with a default:
interface B {
    default void foo() { ... }
}
Now, when we go to recompile C, the compiler will tell us that it doesn't know what behavior to inherit for foo(), so C has to override it (and could choose to delegate to A.super.foo() if it wanted to retain the same behavior.) But what if B had made its default final, and A is not under the control of the author of C? Now C is irretrievably broken; it can't compile without overriding foo(), but it can't override foo() if it was final in B.
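For contrast, a minimal sketch of how C can resolve the clash when B's foo() is not final (the printed messages are made up):

interface A {
    default void foo() { System.out.println("A.foo"); }
}

interface B {
    default void foo() { System.out.println("B.foo"); }
}

class C implements A, B {
    @Override
    public void foo() {
        A.super.foo();   // explicitly choose A's default to resolve the conflict
    }
}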
This is just one example, but the point is that finality for methods is really a tool that makes more sense in the world of single-inheritance classes (generally which couple state to behavior), than to interfaces which merely contribute behavior and can be multiply inherited. It's too hard to reason about "what other interfaces might be mixed into the eventual implementor", and allowing an interface method to be final would likely cause these problems (and they would blow up not on the person who wrote the interface, but on the poor user who tries to implement it.)
Another reason to disallow them is that they wouldn't mean what you think they mean. A default implementation is only considered if the class (or its superclasses) don't provide a declaration (concrete or abstract) of the method. If a default method were final, but a superclass already implemented the method, the default would be ignored, which is probably not what the default author was expecting when declaring it final. (This inheritance behavior is a reflection of the design center for default methods -- interface evolution. It should be possible to add a default method (or a default implementation to an existing interface method) to existing interfaces that already have implementations, without changing the behavior of existing classes that implement the interface, guaranteeing that classes that already worked before default methods were added will work the same way in the presence of default methods.)
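A small sketch of the "superclass wins over the default" behaviour described above (all names are made up for illustration):

interface Greeter {
    default String greet() { return "hello from the interface"; }
}

class BaseGreeter {
    public String greet() { return "hello from the superclass"; }
}

// The inherited class method wins; the interface default is never consulted.
class MyGreeter extends BaseGreeter implements Greeter { }

// new MyGreeter().greet() returns "hello from the superclass".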
In the lambda mailing list there are plenty of discussions about it. One of those that seems to contain a lot of discussion about all that stuff is the following: On Varied interface method visibility (was Final defenders).
In this discussion, Talden, the author of the original question asks something very similar to your question:
The decision to make all interface members public was indeed an
unfortunate decision. That any use of interface in internal design
exposes implementation private details is a big one.
It's a tough one to fix without adding some obscure or compatibility
breaking nuances to the language. A compatibility break of that
magnitude and potential subtlety would seen unconscionable so a
solution has to exist that doesn't break existing code.
Could reintroducing the 'package' keyword as an access-specifier be
viable. It's absence of a specifier in an interface would imply
public-access and the absence of a specifier in a class implies
package-access. Which specifiers make sense in an interface is unclear
- especially if, to minimise the knowledge burden on developers, we have to ensure that access-specifiers mean the same thing in both
class and interface if they're present.
In the absence of default methods I'd have speculated that the
specifier of a member in an interface has to be at least as visible as
the interface itself (so the interface can actually be implemented in
all visible contexts) - with default methods that's not so certain.
Has there been any clear communication as to whether this is even a
possible in-scope discussion? If not, should it be held elsewhere.
Eventually Brian Goetz's answer was:
Yes, this is already being explored.
However, let me set some realistic expectations -- language / VM
features have a long lead time, even trivial-seeming ones like this.
The time for proposing new language feature ideas for Java SE 8 has
pretty much passed.
So, most likely it was never implemented because it was never part of the scope. It was never proposed in time to be considered.
In another heated discussion about final defender methods on the subject, Brian said again:
And you have gotten exactly what you wished for. That's exactly what
this feature adds -- multiple inheritance of behavior. Of course we
understand that people will use them as traits. And we've worked hard
to ensure that the the model of inheritance they offer is simple and
clean enough that people can get good results doing so in a broad
variety of situations. We have, at the same time, chosen not to push
them beyond the boundary of what works simply and cleanly, and that
leads to "aw, you didn't go far enough" reactions in some case. But
really, most of this thread seems to be grumbling that the glass is
merely 98% full. I'll take that 98% and get on with it!
So this reinforces my theory that it simply was not part of the scope or part of their design. What they did was to provide enough functionality to deal with the issues of API evolution.
It will be hard to find and identify "THE" answer, for the reasons mentioned in the comments from @EJP: there are roughly 2 (+/- 2) people in the world who can give the definitive answer at all. And in doubt, the answer might just be something like "Supporting final default methods did not seem to be worth the effort of restructuring the internal call resolution mechanisms". This is speculation, of course, but it is at least backed by subtle evidence, like this statement (by one of those two people) in the OpenJDK mailing list:
"I suppose if "final default" methods were allowed, they might need rewriting from internal invokespecial to user-visible invokeinterface."
and trivial facts like the fact that a method is simply not considered to be a (really) final method when it is a default method, as currently implemented in the Method::is_final_method method in the OpenJDK.
Further really "authoritative" information is indeed hard to find, even with extensive web searches and by reading commit logs. I thought that it might be related to potential ambiguities during the resolution of interface method calls with the invokeinterface instruction versus class method calls, corresponding to the invokevirtual instruction: for the invokevirtual instruction, there may be a simple vtable lookup, because the method must either be inherited from a superclass or implemented by the class directly. In contrast to that, an invokeinterface call must examine the respective call site to find out which interface this call actually refers to (this is explained in more detail in the InterfaceCalls page of the HotSpot Wiki). However, final methods either do not get inserted into the vtable at all, or replace existing entries in the vtable (see klassVtable.cpp, line 333), and similarly, default methods replace existing entries in the vtable (see klassVtable.cpp, line 202). So the actual reason (and thus, the answer) must be hidden deeper inside the (rather complex) method call resolution mechanisms, but maybe these references will nevertheless be considered helpful, be it only for others who manage to derive the actual answer from them.
I wouldn't think it is necessary to specify final on a convenience interface method. I can agree, though, that it may be helpful, but seemingly the costs have outweighed the benefits.
What you are supposed to do, either way, is to write proper Javadoc for the default method, stating exactly what the method is and is not allowed to do. That way the classes implementing the interface "are not allowed" to change the implementation, though there are no guarantees.
Anyone could write a Collection that adheres to the interface and then does things in its methods that are absolutely counter-intuitive; there is no way to shield yourself from that, other than writing extensive unit tests.
We add the default keyword to a method inside an interface when we know that the class implementing the interface may or may not override our implementation. But what if we want to add a method that we don't want any implementing class to override? Well, two options were available to us:
Add a default final method.
Add a static method.
Now, Java says that if we have a class implementing two or more interfaces that have a default method with exactly the same name and signature, i.e. the methods are duplicates, then we need to provide an implementation of that method in our class. Now, in the case of final default methods, we couldn't provide an implementation and we'd be stuck. And that's why the final keyword isn't allowed in interfaces.
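A minimal sketch of the static-method alternative mentioned above (the interface and method names are made up): a static interface method is called via the interface name and cannot be overridden by implementing classes, so it gives the "don't override this" guarantee that final would have provided.

interface MessageSender {
    void send(String message);

    // Static interface methods are resolved against the interface itself
    // and cannot be overridden by implementing classes.
    static void sendEmpty(MessageSender sender) {
        sender.send("");
    }
}

// Usage: MessageSender.sendEmpty(mySender);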

Two methods with the same name in java

I noticed that if I have two methods with the same name, where the first one accepts SomeObject and the second one accepts an object extending SomeObject, then when I call the method with SomeOtherObject it automatically uses the one that only accepts SomeObject. If I cast SomeOtherObject to SomeObject, the method that accepts SomeObject is used, even if the object is an instanceof SomeOtherObject. This means the method is selected when compiling. Why?
That's how method overload resolution in Java works: the method is selected at compile time.
For all of the ugly details, see the Java Language Specification §15.12.
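A minimal sketch of that compile-time selection (class and method names are made up): the overload is chosen from the static (declared) type of the argument, not from its runtime type.

class SomeObject { }
class SomeOtherObject extends SomeObject { }

class OverloadDemo {
    static void handle(SomeObject o)      { System.out.println("SomeObject overload"); }
    static void handle(SomeOtherObject o) { System.out.println("SomeOtherObject overload"); }

    public static void main(String[] args) {
        SomeObject asBase = new SomeOtherObject();   // runtime type: SomeOtherObject
        handle(asBase);                    // "SomeObject overload" – chosen from the static type
        handle((SomeOtherObject) asBase);  // "SomeOtherObject overload" – the cast changes the static type
    }
}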
This means the method is selected when compiling.
Yes you are correct. That is what it means.
Why?
I can think of four reasons why they designed Java this way:
This is consistent with the way that other statically typed OO languages that support overloading work. It is what people who come / came from the C++ world expect. (This was particularly important in the early days of Java ... though not so much now.). It is worth noting that C# handles overloading the same way.
It is efficient. Resolving method overloads at runtime (based on actual argument types) would make overloaded method calls expensive.
It gives more predictable (and therefore more easy to understand) behaviour.
It avoids the Brittle Base Class problem, where adding a new overloaded method in a base class causes unexpected problems in existing derived classes.
References:
http://blogs.msdn.com/b/ericlippert/archive/2004/01/07/virtual-methods-and-brittle-base-classes.aspx
Yes, the function to be executed is decided at compile time! The compiler has no idea of the actual runtime type of the object; it only knows the type of the reference that points to the object given as an argument to the function.
For more details you can look at "Choosing the Most Specific Method" in the Java Language Specification.

Java implicit "this" parameter in method?

In the Java programming language, do method invocations on an object work by implicitly passing a reference to the object to act on, essentially working like static methods?
Details on how method invocation works can be found in the Java SE 7 JVM specification, section 3.7. For an instance method the this reference is passed as the first parameter. This reference is also used to select which method to invoke, since it might be overridden in a subclass, so it is a bit more complicated than a static method.
In short, no. That is how C++ was originally written, back when it was just a system of macros, but that was only because nothing existed (in C) like classes or static functions.
Java simply calls methods on objects. It has a shared piece of code that is the method, so in that sense it's conceptually static, but there is a flags field recording the modifiers of a method; static is one of those bits, and it is not set for normal (instance) methods.
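A conceptual sketch of the implicit receiver (not how the JVM literally compiles it; the class and method names are made up):

class Counter {
    private int count;

    // Instance method: the receiver is passed implicitly and is available as 'this'.
    void increment() {
        this.count++;
    }

    // Roughly equivalent static method with the receiver made explicit.
    static void incrementExplicit(Counter self) {
        self.count++;
    }
}

// c.increment() behaves much like Counter.incrementExplicit(c), except that
// instance methods also participate in dynamic dispatch (overriding).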

Why java.lang.Object is not abstract? [duplicate]

Possible Duplicate:
Java: Rationale of the Object class not being declared abstract
Why is the Object class, which is base class of 'em all in Java, not abstract?
I've had this question for a really really long time and it is asked here purely out of curiosity, that's all. Nothing in my code or anybody's code is breaking because it is not abstract, but I was wondering why they made it concrete?
Why would anyone want an "instance" (and not its presence, a.k.a. a reference) of this Object class? One case is poor synchronization code which uses an instance of Object for locking (at least I used it this way once... my bad).
Is there any practical use of an "instance" of an Object class? And how does its instantiation fit in OOP? What would have happened if they had marked it abstract (of course after providing implementations to its methods)?
Without the designers of java.lang.Object telling us, we have to base our answers on opinion. There are a few questions which can be asked that may help clear it up.
Would any of the methods of Object benefit from being abstract?
It could be argued that some of the methods would benefit from this. Take hashCode() and equals() for instance, there would probably have been a lot less frustration around the complexities of these two if they had both been made abstract. This would require developers to figure out how they should be implementing them, making it more obvious that they should be consistent (see Effective Java). However, I'm more of the opinion that hashCode(), equals() and clone() belong on separate, opt-in abstractions (i.e. interfaces). The other methods, wait(), notify(), finalize(), etc. are sufficiently complicated and/or are native, so it's best they're already implemented, and would not benefit from being abstracted.
So I'd guess the answer would be no, none of the methods of Object would benefit from being abstract.
Would it be a benefit to mark the Object class as abstract?
Assuming all the methods are implemented, the only effect of marking Object abstract is that it cannot be constructed (i.e. new Object() is a compile error). Would this have a benefit? I'm of the opinion that the term "object" is itself abstract (can you find anything around you which can be totally described as "an object"?), so it would fit with the object-oriented paradigm. It is however, on the purist side. It could be argued that forcing developers to pick a name for any concrete subclass, even empty ones, will result in code which better expresses their intent. I think, to be totally correct in terms of the paradigm, Object should be marked abstract, but when it comes down to it, there's no real benefit, it's a matter of design preference (pragmatism vs. purity).
Is the practice of using a plain Object for synchronisation a good enough reason for it to be concrete?
Many of the other answers talk about constructing a plain object to use in the synchronized() operation. While this may have been a common and accepted practice, I don't believe it would be a good enough reason to prevent Object being abstract if the designers wanted it to be. Other answers have mentioned how we would have to declare a single, empty subclass of Object any time we wanted to synchronise on a certain object, but this doesn't stand up - an empty subclass could have been provided in the SDK (java.lang.Lock or whatever), which could be constructed any time we wanted to synchronise. Doing this would have the added benefit of creating a stronger statement of intent.
Are there any other factors which could have been adversely affected by making Object abstract?
There are several areas, separate from a pure design standpoint, which may have influenced the choice. Unfortunately, I do not know enough about them to expand on them. However, it would not surprise me if any of these had an impact on the decision:
Performance
Security
Simplicity of implementation of the JVM
Could there be other reasons?
It's been mentioned that it may be in relation to reflection. However, reflection was introduced after Object was designed. So whether it affects reflection or not is moot - it's not the reason. The same for generics.
There's also the unforgettable point that java.lang.Object was designed by humans: they may have made a mistake, they may not have considered the question. There is no language without flaws, and this may be one of them, but if it is, it's hardly a big one. And I think I can safely say, without lack of ambition, that I'm very unlikely to be involved in designing a key part of such a widely used technology, especially one that's lasted 15(?) years and still going strong, so this shouldn't be considered a criticism.
Having said that, I would have made it abstract ;-p
Summary
Basically, as far as I see it, the answer to both questions "Why is java.lang.Object concrete?" or (if it were so) "Why is java.lang.Object abstract?" is... "Why not?".
Plain instances of java.lang.Object are typically used in locking/synchronization scenarios, and that's accepted practice.
Also - what would be the reason for it to be abstract? Because it's not fully functional in its own right as an instance? Could it really do with some abstract members? I don't think so. So the argument for making it abstract in the first place is non-existent. So it isn't.
Take the classic hierarchy of animals, where you have an abstract class Animal; the reasoning for making the Animal class abstract is that an instance of Animal is effectively an 'invalid' animal, for lack of a better word (even if all its methods provide a base implementation). With Object, that is simply not the case. There is no overwhelming case for making it abstract in the first place.
From everything I've read, it seems that Object does not need to be concrete, and in fact should have been abstract.
Not only is there no need for it to be concrete, but after some more reading I am convinced that Object not being abstract is in conflict with the basic inheritance model - we should not be allowing abstract subclasses of a concrete class, since subclasses should only add functionality.
Clearly this is not the case in Java, where we have abstract subclasses of Object.
I can think of several cases where instances of Object are useful:
Locking and synchronization, like you and other commenters mention. It is probably a code smell, but I have seen Object instances used this way all the time.
As Null Objects, because equals will always return false, except on the instance itself.
In test code, especially when testing collection classes. Sometimes it's easiest to fill a collection or array with dummy objects rather than nulls.
As the base instance for anonymous classes. For example:
Object o = new Object() {...code here...}
I think it probably should have been declared abstract, but once it is done and released it is very hard to undo without causing a lot of pain - see Java Language Spec 13.4.1:
"If a class that was not abstract is changed to be declared abstract, then preexisting binaries that attempt to create new instances of that class will throw either an InstantiationError at link time, or (if a reflective method is used) an InstantiationException at run time; such a change is therefore not recommended for widely distributed classes."
From time to time you need a plain Object that has no state of its own. Although such objects seem useless at first sight, they still have utility since each one has a different identity. This is useful in several scenarios, the most important of which is locking: you want to coordinate two threads. In Java you do that by using an object that will be used as a lock. The object need not have any state; its mere existence is enough for it to become a lock:
class MyThread extends Thread {
    private Object lock;

    public MyThread(Object l) { lock = l; }

    public void run() {
        doSomething();
        synchronized (lock) {
            doSomethingElse();
        }
    }
}

Object lock = new Object();
new MyThread(lock).start();
new MyThread(lock).start();
In this example we used a lock to prevent the two threads from concurrently executing doSomethingElse().
If Object were abstract and we needed a lock, we'd have to subclass it without adding any methods or fields, just so that we could instantiate a lock.
Coming to think about it, here's a dual question to yours: suppose Object were abstract, would it define any abstract methods? I guess the answer is no. In such circumstances there is not much value in defining the class as abstract.
I don't understand why most seem to believe that making a fully functional class, which implements all of its methods in a useful way, abstract would be a good idea.
I would rather ask: why make it abstract? Does it do something it shouldn't? Is it missing some functionality it should have? Both those questions can be answered with no; it is a fully working class on its own, and making it abstract just leads to people implementing empty classes.
public class UseableObject extends AbstractObject{}
UseableObject inherits from the abstract Object and, surprise, it can be instantiated; it does not add any functionality, and its only reason to exist is to allow access to the methods exposed by Object.
Also, I have to disagree with the use in "poor" synchronisation. Using private Objects to synchronize access is safer than using synchronized(this), and safer as well as easier to use than the Lock classes from java.util.concurrent.
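A small sketch of the private-lock idiom mentioned above (the class is made up for illustration):

class SafeCounter {
    private final Object lock = new Object();   // private lock: outside code cannot synchronize on it
    private int count;

    void increment() {
        synchronized (lock) {   // safer than synchronized(this), which exposes the lock to callers
            count++;
        }
    }
}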
Seems to me there's a simple question of practicality here. Making a class abstract takes away the programmer's ability to do something, namely, to instantiate it. There is nothing you can do with an abstract class that you cannot do with a concrete class. (Well, you can declare abstract functions in it, but in this case we have no need to have abstract functions.) So by making it concrete, you make it more flexible.
Of course if there was some active harm that was done by making it concrete, that "flexibility" would be a drawback. But I can't think of any active harm done by making Object instantiable. (Is "instantiable" a word? Whatever.) We could debate whether any given use that someone has made of a raw Object instance is a good idea. But even if you could convince me that every use that I have ever seen of a raw Object instance was a bad idea, that still wouldn't prove that there might not be good uses out there. So if it doesn't hurt anything, and it might help, even if we can't think of a way that it would actually help at the moment, why prohibit it?
I think all of the answers so far forget what it was like with Java 1.0. In Java 1.0, you could not make an anonymous class, so if you just wanted an object for some purpose (synchronization or a null placeholder) you would have to declare a class for that purpose, and then a whole bunch of code would carry these extra classes around. Much more straightforward to just allow direct instantiation of Object.
Sure, if you were designing Java today you might say that everyone should do:
Object NULL_OBJECT = new Object(){};
But that was not an option in 1.0.
I suspect the designers did not know in which ways an Object might be used in the future, and therefore didn't want to limit programmers by forcing them to create an additional class where it was not necessary, e.g. for things like mutexes, keys etc.
It also means that it can be instantiated in an array. In the pre-1.5 days, this would allow you to have generic data structures. This could still be true on some platforms (I'm thinking J2ME, but I'm not sure)
Reasons why Object needs to be concrete.
reflection
see Object.getClass()
generic use (pre Java 5)
comparison/output
see Object.toString(), Object.equals(), Object.hashCode(), etc.
synchronization
see Object.wait(), Object.notify(), etc.
Even though a couple of areas have been replaced/deprecated, there was still a need for a concrete parent class to provide these features to every Java class.
The Object class is used in reflection so code can call methods on instances of indeterminate type, i.e. Object.class.getDeclaredMethods(). If Object were abstract, then code that wanted to participate would have to implement all abstract methods before client code could use reflection on it.
According to Sun, An abstract class is a class that is declared abstract—it may or may not include abstract methods. Abstract classes cannot be instantiated, but they can be subclassed. This also means you can't call methods or access public fields of an abstract class.
Example of an abstract root class:
abstract public class AbstractBaseClass
{
    public Class clazz;

    public AbstractBaseClass(Class clazz)
    {
        super();
        this.clazz = clazz;
    }
}
A child of our AbstractBaseClass:
public class ReflectedClass extends AbstractBaseClass
{
    public ReflectedClass()
    {
        super(this);
    }

    public static void main(String[] args)
    {
        ReflectedClass me = new ReflectedClass();
    }
}
This will not compile because it's invalid to reference this in a constructor unless it's to call another constructor in the same class. I can get it to compile if I change it to:
public ReflectedClass()
{
    super(ReflectedClass.class);
}
but that only works because ReflectedClass has a parent ("Object") which is 1) concrete and 2) has a field to store the type for its children.
An example more typical of reflection would be in a non-static member function:
public void foo()
{
    Class localClass = AbstractBaseClass.clazz;
}
This fails unless you change the field 'clazz' to be static. For the class field of Object this wouldn't work because it is supposed to be instance specific. It would make no sense for Object to have a static class field.
Now, I did try the following change and it works but is a bit misleading. It still requires the base class to be extended to work.
public void genericPrint(AbstractBaseClass c)
{
    Class localClass = c.clazz;
    System.out.println("Class is: " + localClass);
}

public static void main(String[] args)
{
    ReflectedClass me = new ReflectedClass();
    ReflectedClass meTwo = new ReflectedClass();
    me.genericPrint(meTwo);
}
Pre-Java5 generics (like with arrays) would have been impossible
Object[] array = new Object[100];
array[0] = me;
array[1] = meTwo;
Instances need to be constructed to serve as placeholders until the actual objects are received.
I suspect the short answer is that the collection classes lost type information in the days before Java generics. If a collection is not generic, then it must return a concrete Object (and be downcast at runtime to whatever type it was previously).
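A small sketch of what that pre-generics style looked like (the list contents are made up): a raw collection can only hand back Object, so the caller downcasts, and a wrong cast fails at runtime rather than at compile time.

import java.util.ArrayList;
import java.util.List;

class RawListDemo {
    public static void main(String[] args) {
        List names = new ArrayList();        // raw type: elements are stored as plain Object
        names.add("Alice");

        // The collection returns Object; the caller must downcast.
        String first = (String) names.get(0);
        System.out.println(first);
    }
}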
Since making a concrete class into an abstract class would break binary compatibility (as noted upthread), the concrete Object class was kept. I would like to point out that in no case was it created for the sole purpose of synchronization; dummy classes work just as well.
The design flaw is not including generics from the beginning. A lot of design criticism is aimed at that decision and its consequences. [oh, and the array subtyping rule.]
It's not abstract because whenever we create a new class it extends the Object class; if Object were abstract, you would need to implement all of its methods, which is overhead... The methods are already implemented in that class...
