Why are ADTs good and inheritance bad? - java

I am a long time OO programmer and a functional programming newbie. From my little exposure, algebraic data types look like a special case of inheritance to me, where you have only a one-level hierarchy and the superclass cannot be extended outside the module.
So my (potentially dumb) question is: if ADTs are just that, a special case of inheritance (again, this assumption may be wrong; please correct me in that case), then why does inheritance get all the criticism and ADTs get all the praise?
Thank you.

I think that ADTs are complementary to inheritance. Both of them allow you to create extensible code, but the way the extensibility works is different:
ADTs make it easy to add new functionality for working with existing types:
You can easily add a new function that works with an ADT, which has a fixed set of cases.
On the other hand, adding a new case requires modifying all functions.
Inheritance makes it easy to add new types when you have fixed functionality:
You can easily create an inherited class and implement the fixed set of virtual functions.
On the other hand, adding a new virtual function requires modifying all inherited classes. (The sketch below makes the duality concrete.)
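A minimal sketch of both sides in Scala (the Shape names are hypothetical, chosen only for illustration):

sealed trait Shape
case class Circle(r: Double) extends Shape
case class Square(s: Double) extends Shape

// ADT style: a new function over the fixed set of cases is a single
// pattern match; no existing code has to change.
def area(shape: Shape): Double = shape match {
  case Circle(r) => math.Pi * r * r
  case Square(s) => s * s
}

// OO style: the set of operations is fixed in the base trait, and a
// new type is a single new class; again no existing code changes.
trait OoShape { def area: Double }
class OoCircle(r: Double) extends OoShape { def area = math.Pi * r * r }
class OoSquare(s: Double) extends OoShape { def area = s * s }

Adding a perimeter function touches one place in the ADT version but every class in the OO version; adding a Triangle is exactly the opposite.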
Both the object-oriented world and the functional world have developed ways to allow the other kind of extensibility. In Haskell, you can use typeclasses; in ML/OCaml, people would use a dictionary of functions or maybe (?) functors to get inheritance-style extensibility. On the other hand, in OOP, people use the Visitor pattern, which is essentially a way to get something like ADTs.
The usual programming patterns are different in OOP and FP, so when you're programming in a functional language, you're writing the code in a way that requires the functional-style extensibility more often (and similarly in OOP). In practice, I think it is great to have a language that allows you to use both of the styles depending on the problem you're trying to solve.

Tomas Petricek has got the fundamentals exactly right; you might also want to look at Phil Wadler's writing on the "expression problem".
There are two other reasons some of us prefer algebraic data types over inheritance:
Using algebraic data types, the compiler can (and does) tell you if you have forgotten a case or if a case is redundant. This ability is especially useful when there are many more operations on things than there are kinds of thing. (E.g., many more functions than algebraic datatypes, or many more methods than OO constructors.) In an object-oriented language, if you leave a method out of a subclass, the compiler can't tell whether that's a mistake or whether you intended to inherit the superclass method unchanged.
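For example, with a sealed type the Scala compiler checks pattern matches for exhaustiveness (a hypothetical sketch):

sealed trait Color
case object Red extends Color
case object Green extends Color
case object Blue extends Color

def describe(c: Color): String = c match {
  case Red   => "warm"
  case Green => "cool"
  // Compiler warning: match may not be exhaustive; it would fail on
  // input Blue. An OO compiler stays silent when a subclass simply
  // inherits a method you actually meant to override.
}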
This one is more subjective: many people have noted that if inheritance is used properly and aggressively, the implementation of an algorithm can easily be smeared out over half a dozen classes, and even with a nice class browser it can be hard to follow the logic of the program (data flow and control flow). Without a nice class browser, you have no chance. If you want to see a good example, try implementing bignums in Smalltalk, with automatic failover to bignums on overflow. It's a great abstraction, but the language makes the implementation difficult to follow. Using functions on algebraic data types, the logic of your algorithm is usually all in one place, or if it is split up, it's split up into functions which have contracts that are easy to understand.
P.S. What are you reading? I don't know of any responsible person who says "ADTs good; OO bad."

In my experience, what people usually consider "bad" about inheritance as implemented by most OO languages is not the idea of inheritance itself but the idea of subclasses modifying the behavior of methods defined in the superclass (method overriding), specifically in the presence of mutable state. It's really the last part that's the kicker. Most OO languages treat objects as "encapsulating state," which amounts to allowing rampant mutation of state inside of objects. So problems arise when, for example, a superclass expects a certain method to modify a private variable, but a subclass overrides the method to do something completely different. This can introduce subtle bugs which the compiler is powerless to prevent.
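A sketch of that failure mode in Scala (hypothetical names; it is essentially the instrumented-HashSet example from Effective Java):

// The superclass maintains an invariant through mutation: addCount is
// the number of elements ever passed to add.
class CountingSet {
  private var elems: Set[Int] = Set.empty
  var addCount = 0
  def add(x: Int): Unit = { addCount += 1; elems += x }
  def addAll(xs: Seq[Int]): Unit = xs.foreach(add) // delegates to add
}

// The override also bumps the counter, unaware that the superclass's
// addAll already calls add: every bulk insert is now counted twice,
// and the compiler has no way to flag the broken invariant.
class MiscountingSet extends CountingSet {
  override def addAll(xs: Seq[Int]): Unit = {
    addCount += xs.size
    super.addAll(xs)
  }
}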
Note that in Haskell's implementation of subclass polymorphism, mutable state is disallowed, so you don't have such issues.
Also, see this objection to the concept of subtyping.

I am a long time OO programmer and a functional programming newbie. From my little exposure, algebraic data types look like a special case of inheritance to me, where you have only a one-level hierarchy and the superclass cannot be extended outside the module.
You are describing closed sum types, the most common form of algebraic data types, as seen in F# and Haskell. Basically, everyone agrees that they are a useful feature to have in the type system, primarily because pattern matching makes it easy to dissect them by shape as well as by content and also because they permit exhaustiveness and redundancy checking.
However, there are other forms of algebraic datatypes. An important limitation of the conventional form is that they are closed, meaning that a previously-defined closed sum type cannot be extended with new type constructors (part of a more general problem known as "the expression problem"). OCaml's polymorphic variants allow both open and closed sum types and, in particular, the inference of sum types. In contrast, Haskell and F# cannot infer sum types. Polymorphic variants solve the expression problem and they are extremely useful. In fact, some languages are built entirely on extensible algebraic data types rather than closed sum types.
In the extreme, you also have languages like Mathematica where "everything is an expression". Thus the only type in the type system forms a trivial "singleton" algebra. This is "extensible" in the sense that it is infinite and, again, it culminates in a completely different style of programming.
So my (potentially dumb) question is: if ADTs are just that, a special case of inheritance (again, this assumption may be wrong; please correct me in that case), then why does inheritance get all the criticism and ADTs get all the praise?
I believe you are referring specifically to implementation inheritance (i.e. overriding functionality from a parent class) as opposed to interface inheritance (i.e. implementing a consistent interface). This is an important distinction. Implementation inheritance is often hated whereas interface inheritance is often loved (e.g. in F# which has a limited form of ADTs).
You really want both ADTs and interface inheritance. Languages like OCaml and F# offer both.
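Scala, for what it's worth, also offers both, and they compose; a minimal hypothetical sketch:

// Interface inheritance: a contract that any number of types can implement.
trait Pretty { def pretty: String }

// ADT: a closed set of cases whose members also honour the interface.
sealed trait Expr extends Pretty
case class Num(n: Int) extends Expr { def pretty = n.toString }
case class Add(l: Expr, r: Expr) extends Expr {
  def pretty = "(" + l.pretty + " + " + r.pretty + ")"
}

// Pattern matching over the closed set works alongside the interface.
def eval(e: Expr): Int = e match {
  case Num(n)    => n
  case Add(l, r) => eval(l) + eval(r)
}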

Related

I want to know the meaning of compile-time decisions

What does it mean to say "with inheritance you're locked into compile-time decisions about code behavior"?
I suggest this post from Donal Fellows on Programmers:
Some languages are pretty strongly static, and only allow the specification of the inheritance relationship between two classes at the time of definition of those classes. For C++, definition time is practically the same as compilation time. (It's slightly different in Java and C#, but not very much.) Other languages allow much more dynamic reconfiguration of the relationship of classes (and class-like objects in Javascript) to each other; some go as far as allowing the class of an existing object to be modified, or the superclass of a class to be changed. (This can cause total logical chaos, but can also model real world nasties quite well.)
But it is important to contrast this to composition, where the relationship between one object and another is not defined by their class relationship (i.e., their type) but rather by the references that each has in relation to the other. General composition is a very powerful and ubiquitous method of arranging objects: when one object needs to know something about another, it has a reference to that other object and invokes methods upon it as necessary. As soon as you start looking for this super-fundamental pattern, you'll find it absolutely everywhere; the only way to avoid it is to put everything in one object, which would be massively dumb! (There's also stricter UML composition/aggregation, but that's not what the GoF book is talking about there.)
One of the things about the composition relationship is that particular objects do not need to be hard-bound to each other. The pattern of concrete objects is very flexible, even in very static languages like C++. (There is an upside to having things very static: it is possible to analyse the code more closely and, at least potentially, issue better code with less overhead.) To recap, Javascript, as with many other dynamic languages, can pretend it doesn't use compilation at all; just pretence, of course, but the fundamental language model doesn't require transformation to a fixed intermediate format (e.g., a "binary executable on disk"). That compilation which is done is done at runtime, and can be easily redone if things vary too much. (The fascinating thing is that such a good job of compilation can be done, even starting from a very dynamic basis…)
Some GoF patterns only really make sense in the context of a language where things are fairly static. That's OK; it just means that not all forces affecting the pattern are necessarily listed. One of the key points about studying patterns is that it helps us be aware of these important differences and caveats. (Other patterns are more universal. Keep your eyes open for those.)

Is there any performance decrease in java, for extending classes for "no" reason?

I have a Vector3i class that is useful in a lot of situations, but I've found myself extending it to use the type system to prevent bugs.
For example, I might have an "ego-centric" vector3i that is local to an object in the world, and a world co-ordinate vector3i.
The two are naturally incompatible without conversion and are meaningless to each other.
It would be a good situation to use True Hungarian Notation but instead I'm extending the class and adding no new functionality.
Am I incurring a performance loss considering the JVM/Hotspots optimizations?
Inheritance is a powerful tool, but the power comes at a price. Inheritance has a lot of pitfalls and problems of its own. In particular, it breaks encapsulation and can lead to fragile code when not implemented with care.
The inheritance mechanism in Java has been developed continuously for over 15 years. You can rely on it to be fast and efficient. There are no significant performance-related reasons to pass on inheritance when your data model calls for it.
For your case, it may make more sense to represent your functionalities by composition rather than by inheritance (in other words, instead of having ClassB extend ClassA, make ClassA an instance field within ClassB and then delegate method calls to the encapsulated object). You should at least consider it. Compared to inheritance, composition results in code that is more robust and less fragile.
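A hedged sketch of that composition approach for the question's vectors, in Scala (WorldPos and its methods are hypothetical):

// The general-purpose vector from the question.
class Vector3i(val x: Int, val y: Int, val z: Int) {
  def add(o: Vector3i) = new Vector3i(x + o.x, y + o.y, z + o.z)
}

// Composition: WorldPos holds a Vector3i and delegates to it. The type
// system still keeps world and ego-centric coordinates apart, and
// WorldPos exposes only the operations that make sense for it.
class WorldPos(private val v: Vector3i) {
  def add(o: WorldPos) = new WorldPos(v.add(o.v))
  def toVector: Vector3i = v
}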

Duck typing and generic programming

I searched through SO for a while and I could not find a definite and general answer, only some contradictory and particular opinions. [1]
So I would like to know: what is the relationship between duck typing and generic programming (DT < GP, DT == GP, DT > GP)? By generic programming I refer, in particular, to C++ templates or Java generics, but a general answer related to the concepts, if possible, would be welcome.
I know that generic programming is handled at compile time, while duck typing is handled at runtime, but apart from this I do not know how to position them.
Lastly, I do not want to start a debate, so I would prefer answers that give reasons for and against.
[1] What's the relationship between C++ template and duck typing?
I have encountered two different definitions of "Duck Typing". No doubt there are people who will tell you that one of them is "correct" and the other is "incorrect". I merely attempt to document that they're both used rather than tell you that they're both "correct", but personally I see nothing wrong with the broader meaning.
1) Runtime-only typing. Type is a property of objects, not variables, and hence necessarily, when you come to call a method on an object, or otherwise use properties that it has by virtue of its type, the presence or absence of that method is discovered at runtime[*]. So if it "looks like a duck and quacks like a duck" (i.e. if it turns out to have a quack() function) then it "is" a duck (anyway, you can treat it like one). By this definition, of course, C++ templates fall at the first hurdle: they use static typing.
2) A name used more generally for the principle that if it looks like a duck and quacks like a duck then it is a duck, to mean any setup in which interfaces are implicitly defined by the operations performed by the consumer of the interface, rather than interfaces being explicitly advertised by the producer (whatever implements the interface). By this definition, C++ templates do use a kind of duck-typing, but whether something "looks like a duck" is determined at compile time, not at runtime, based on its static type rather than its dynamic type. "Check at compile time, can this variable's static type quack?", not "check at runtime, can this object quack?".
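Scala's structural types are one concrete instance of this second sense: the check happens at compile time against the static type (though on the JVM the actual call is made reflectively at runtime). A minimal hypothetical sketch:

import scala.language.reflectiveCalls

// The consumer defines the interface implicitly: anything whose static
// type has a quack(): String method is acceptable; no common supertype
// or explicit declaration of duck-hood is required.
def makeItQuack(duck: { def quack(): String }): String = duck.quack()

class Mallard { def quack(): String = "quack" }
class Robot   { def quack(): String = "beep" }

makeItQuack(new Mallard) // compiles: it quacks like a duck
makeItQuack(new Robot)   // compiles too; passing a String would not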
The dispute seems to me in effect to be over whether Python "owns" the term, so that only Python-like type systems can be called duck typing, or whether others are free to appropriate the term to mean a similar concept in a different context. Whatever you think it should mean, it seems irresponsible to use a "jokey" term and demand that everyone understands the same formal definition from it. SO is not a suitable forum to tell you what a term "should" mean, unless there's an authoritative source that you're asking about, like a dictionary or any academic papers that define it formally. I think it can tell you what it's actually used to mean.
"Generic programming" can make use of duck typing or not, depending on the exact details. C++ generic containers do use "type 2" duck-typing, because for a general type in C++ it's not guaranteed that you can copy it, compare it, get a hash value, etc. Java generic containers don't, Object has enough methods already to make it hashable etc.
Conversely, I suspect that anything you do that uses duck typing can reasonably be referred to as "generic programming". So I suppose, in the terms you asked for, GP > DT, in that duck typing (under either definition) is a strict subset of the vast range of stuff that can be called "generic".
[*] well, in some cases your dynamic language could have some static analysis that proves the case one way or the other, but the language demands the ability to defer this check until runtime for cases where the static analysis cannot conclusively say.
This is really a question of vocabulary. In its most general sense, generic programming is unrelated to the issue of compile-time vs. run-time: it's a solution to a general problem. A good example of where generic programming is run-time is Python, but it's also possible to implement run-time generic programming in C++ (at a significant cost in execution time).
Duck typing is an orthogonal concept, and is usually used to imply runtime typing. Again, the most frequently cited modern example is Python, but many, many languages, starting with Lisp, have used it in the past. As a general rule, both C++ and Java have made an explicit choice not to support duck typing. It's a trade-off: safety vs. flexibility (or compile-time errors vs. run-time errors).
Java doesn't support duck typing in the language. It does support reflection, which can achieve the same thing. It doesn't have any relationship with Java's generics as far as I can see; in fact, getting them to work together is a real pain.
For me, "duck typing" means there is no explicit conformance relationship. If something walks like a duck and talks like a duck, it can be treated like a duck, and doesn't need to explicitly declare that it is a duck. In C++ terms, it doesn't need to inherit from a Duck base class: inheritance is a way of declaring that one class conforms to the interface of another explicitly.
This notion is orthogonal to whether type-checking happens at run-time or compile-time. Languages like Smalltalk provide duck-typing that happens at run-time (and inheritance is used to reuse implementation, not to declare conformance of interface). C++ templates are a form of duck-typing that happens at compile-time.
And that last sentence is the answer to the question.

What could an idiomatic design of Serializable/Cloneable/... look like in Scala?

I wonder how different this functionality would look (and how different the implementation would be) if Scala didn't (have to) follow Java's java.io.Serializable/java.lang.Cloneable (mostly to stay compatible with Java and the tools/ecosystem around it).
Because Scala is simpler in language design but enables more powerful implementation and abstraction possibilities, it is conceivable that Scala might take a different path from Java if it didn't have to shoulder the Java-compatibility burden.
I could imagine that an idiomatic implementation would use type classes or traits with (possibly) private fields/methods (not possible in Java interfaces?), maybe carrying some standard implementation.
Or are marker interfaces still the right choice in Scala?
Serialization and cloning are both kind of special because of mutability:
Serialization, because it has to deal with cycles in the object graph, and
Cloning, because... well, the only reason to clone an object is to prevent the accidental spread of mutable state.
So, if you're willing to commit to a completely immutable domain model, you don't have object graphs as such anymore, you have object trees instead.
For a functionally-oriented approach to serialization, SBinary is what I'd probably try first. For cloning, Just Don't Do It. :)
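To make the functionally-oriented approach concrete, here is a hedged, typeclass-flavoured sketch in Scala (a made-up API, not SBinary's actual one):

// A serializer is an implicit value that describes how to write a type,
// rather than a capability the type inherits from a marker interface.
trait Writes[A] { def writes(a: A): String }

implicit val intWrites: Writes[Int] = new Writes[Int] {
  def writes(a: Int) = a.toString
}

// Instances compose: a List[A] serializer exists whenever one for A does.
implicit def listWrites[A](implicit w: Writes[A]): Writes[List[A]] =
  new Writes[List[A]] {
    def writes(as: List[A]) = as.map(w.writes).mkString("[", ",", "]")
  }

def serialize[A](a: A)(implicit w: Writes[A]): String = w.writes(a)

serialize(List(1, 2, 3)) // "[1,2,3]"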
Or are marker interfaces still the right choice in Scala?
Nope. They aren't even the right choice in Java. They should be annotations, not interfaces.
The best way to do this in idiomatic Scala is to use implicits with the effect of a type class.
This is how the Ordered trait is used:
def max[A <% Ordered[A]](a: A, b: A): A
means the same as:
def max[A](a: A, b: A)(implicit orderer: A => Ordered[A]): A
It says you can use any type A as long as it can be treated as an Ordered[A].
This has several benefits you don't have with the interface/inheritance approach of Java:
You can add an implicit Ordered definition to an existing type. You can't do that with inheritance.
You can have several implementations of Ordered for one type! This is even more flexible than type classes in Haskell, which allow only one instance per type.
In conclusion, Scala's implicits used together with generics enable a very flexible approach to defining constraints on types.
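For example, a hedged sketch that retrofits an ordering onto java.awt.Point, a type we don't control (the distance-from-origin ordering is made up for illustration):

import java.awt.Point

// An existing type gains Ordered behaviour through an implicit view.
implicit def pointOrdered(p: Point): Ordered[Point] = new Ordered[Point] {
  def compare(that: Point): Int =
    (p.x * p.x + p.y * p.y) compare (that.x * that.x + that.y * that.y)
}

def max[A <% Ordered[A]](a: A, b: A): A = if (a < b) b else a

max(new Point(1, 2), new Point(3, 4)) // the point farther from the origin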
The same applies to Cloneable/Serializable.
You may also want to look at the Scalaz library, which adds Haskell-like type classes to Scala, such as Functor, Applicative, and Monad, and offers a rich set of implicits so that these concepts can also enrich the standard library.

Is there a name for a java method considered as separate from any particular class?

This is a terminological question, which makes it hard to ask!
Let me give an example. Suppose I am writing a symbolic-differentiation algorithm. I have an abstract class Exp that is a superclass of a bunch of expression types (sums, integrals, whatever). There is an abstract method derivative such that e.derivative() is supposed to be the derivative of the expression e. All the concrete subclasses of Exp (imagine a whole hierarchy here) implement a derivative method that captures knowledge of how to differentiate expressions of that type. A given method will typically compute the derivative of an expression by combining derivatives of subexpressions.
The details of the example are not important. The important thing is that all of these scattered methods can be considered pieces of one (recursive) algorithm. This is not true of all methods, but it's true of many of them (and the use of overloading to reuse the same method name for fundamentally different operations is considered a Bad Idea). The question is: what is the term for 'derivative', considered as a single function? It's not a method; in another language it would be a function, and the various clauses (what to do with sums, what to do with integrals) would be in the same place. I don't care which approach or language is better, or whether that style can be used in Java. I just want to know what term to use for 'derivative' considered as a single function or procedure (the idea is not limited to functional programming, nor is recursion a key feature). When I tell someone what I did today, I'd like to say "I tried to implement a symbolic-differentiation __, but every algorithm I thought of didn't work." What goes in the blank?
I assume the same issue comes up for other OO languages, but Java is the one I'm most familiar with. I'm so familiar with it that I'm pretty sure there is no standard term, but I thought I would ask our fine battery of experts here before jumping to that conclusion.
That sounds like "normal" subtype polymorphism. The subclasses/implementations do the work, but the interface is defined in a base type. This "scattered" style is in contrast to, say, the Visitor pattern ("as good as Java gets"), pattern matching (not in Java; see the sketch below), or a big manky switch/if-else controller. I'm not sure I would really call it anything else as an aggregate.
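For contrast, here is the question's algorithm as a single pattern-matching function over a sealed hierarchy, which keeps the logic in one place (a hypothetical Scala sketch, differentiating with respect to one variable):

sealed trait Exp
case class Const(v: Double) extends Exp
case class Var(name: String) extends Exp
case class Sum(l: Exp, r: Exp) extends Exp
case class Prod(l: Exp, r: Exp) extends Exp

// The whole algorithm reads top to bottom in one place, instead of
// being spread over a derivative method in every subclass of Exp.
def derivative(e: Exp): Exp = e match {
  case Const(_)   => Const(0)
  case Var(_)     => Const(1)
  case Sum(l, r)  => Sum(derivative(l), derivative(r))
  case Prod(l, r) => Sum(Prod(derivative(l), r), Prod(l, derivative(r)))
}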
Addendum: you may find Are Scala case-classes a failed experiment? a nice read. In particular, the comments which talk about "column" vs. "row" organization and the "difference of locality" each approach has:
...in OO, you divide by rows. Each row is a module, called a class. All the functions pertaining to that data variant are grouped together. This is a reasonable way of organizing things, and it's very common. The advantage is that it's easy to add a data variant ... However the disadvantage is that it's hard to add new functions that vary by data type. You have to go through every class to add a new method.
I'm not sure if this is what you're looking for but I think I can answer this in terms of design pattern terminology. Your example sounds vaguely like the GoF Strategy Pattern. Here is an example of the Strategy Pattern implemented in Java.
On the contrary, I think that "method" is the standard term for this in the Java context.
A polymorphic function can be applied to values of different types. The function may be implemented by more than one Java method.
