Dependency injection: templates (/generics) or virtual functions?

This is a question of curiosity about accepted coding practices. I'm (primarily) a Java developer, and have been increasingly making efforts to unit test my code. I've spent some time looking at how to write the most testable code, paying particular attention to Google's How to write untestable code guide (well worth a look, if you haven't seen it).
Naturally, I was arguing recently with a more C++-oriented friend about the advantages of each language's inheritance model, and I thought I'd pull out a trump card by saying how much harder C++ programmers made it to test their code by constantly forgetting the virtual keyword (for C++ers - this is the default in Java; you get rid of it using final).
I posted a code example that I thought would demonstrate the advantages of Java's model quite well (the full thing is over on GitHub). The short version:
class MyClassForTesting {
    private final Database mDatabase;
    private final Api mRemoteApi;
    void myFunctionForTesting() {
        for (User u : mDatabase.getUsers()) {
            mRemoteApi.updateUserData(u);
        }
    }
    MyClassForTesting(Database userDatabase, Api remoteApi) {
        mDatabase = userDatabase;
        mRemoteApi = remoteApi;
    }
}
Regardless of the quality of what I've written here, the idea is that the class needs to make some (potentially quite expensive) calls to a database, and some API (maybe on a remote web server). myFunctionForTesting() doesn't have a return type, so how do you unit test this? In Java, I think the answer isn't too difficult - we mock:
/*** Tests ***/
/*
 * This will record some stuff and we'll check it later to see that
 * the things we expect really happened.
 */
ActionRecorder ar = new ActionRecorder();
/** Mock up some classes **/
// Anonymous subclasses cannot declare constructors, so the mocks simply
// capture ar. This assumes Database and Api have no-argument constructors.
Database mockDatabase = new Database() {
    @Override
    public Set<User> getUsers() {
        ar.recordAction("got list of users");
        // Needs java.util.Arrays and java.util.HashSet
        return new HashSet<User>(Arrays.asList(new User("Jim"), new User("Kyle")));
    }
};
Api mockApi = new Api() {
    @Override
    public void updateUserData(User u) {
        ar.recordAction("Updated user data for " + u.name());
    }
};
/** Carry out the tests with the mocked up classes **/
MyClassForTesting testObj = new MyClassForTesting(mockDatabase, mockApi);
testObj.myFunctionForTesting();
// Check that it really fetches users from the database
assert ar.contains("got list of users");
// Check that it is checking the users we passed it
assert ar.contains("Updated user data for Jim");
assert ar.contains("Updated user data for Kyle");
By mocking up these classes, we inject the dependencies with our own light-weight versions that we can make assertions on for unit testing, and avoid making expensive, time-consuming calls to database/api-land. The designers of Database and Api don't have to be too aware that this is what we're going to do, and the designer of MyClassForTesting certainly doesn't have to know! This seems (to me) like a pretty good way to do things.
My C++ friend, however, retorted that this was a dreadful hack, and there's a good reason C++ won't let you do this! He then presented a solution based on templates (C++'s counterpart to generics), which does much the same thing. For brevity's sake, I'll just list a part of the solution he gave, but again you can find the whole thing over on GitHub.
template<typename A, typename D>
class MyClassForTesting {
private:
    A mApi;
    D mDatabase;
public:
    MyClassForTesting(D database, A api)
        : mApi(api), mDatabase(database) {}
    ...
};
Which would then be tested much like before, but with the important bits that get replaced shown below:
class MockDatabase : public Database {
    ...
};
class MockApi : public Api {
    ...
};
// The constructor takes (database, api), so pass the arguments in that order.
MyClassForTesting<MockApi, MockDatabase>
    testingObj(MockDatabase(ar), MockApi(ar));
So my question is this: What's the preferred method? I always thought the polymorphism-based approach was better - and I see no reason it wouldn't be in Java - but is it normally considered better to use Generics than Virtualise everything in C++? What do you do in your code (assuming you do unit test) ?

I'm probably biased, but I'd say the C++ version is better. Among other things, polymorphism carries some cost. In this case, you're making your users pay that cost, even though they receive no direct benefit from it.
If, for example, you had a list of polymorphic objects, and want to manipulate all of them via the base class, that would justify using polymorphism. In this case, however, the polymorphism is being used for something the user never even sees. You've built in the ability to manipulate polymorphic objects, but never really used it -- for testing you'll only have mock objects, and for real use you'll only have real objects. There will never be a time that you have (for example) an array of database objects, some of which are mock databases and others of which are real databases.
This is also much more than just an efficiency issue (or at least a run-time efficiency issue). The relationships in your code should be meaningful. When somebody sees (public) inheritance, that should tell them something about the design. As you've outlined it in Java, however, the public inheritance relationship involved is basically a lie -- i.e. what he should know from it (that you're dealing with polymorphic descendants) is an outright falsehood. The C++ code, by contrast, correctly conveys the intent to the reader.
To an extent, I'm overstating the case there, of course. People who normally read Java are almost certainly well accustomed to the way inheritance is typically abused, so they don't see this as a lie at all. This is a bit of throwing out the baby with the bathwater though -- instead of seeing the "lie" for what it is, they've learned to completely ignore what inheritance really means (or just never knew, especially if they went to college where Java was the primary vehicle for teaching OOP). As I said, I'm probably somewhat biased, but to me this makes (most) Java code much more difficult to understand. You basically have to be careful to ignore the basic principles of OOP, and get accustomed to its constant abuse.

Some key advice is "prefer composition to inheritance", which is what your MyClassForTesting has done with respect to the Database and Api. This is good C++ advice too: IIRC it is in Effective C++.
It is a bit rich for your friend to claim that using polymorphism is a "dreadful hack" but using templates is not. On what basis does (s)he claim that one is less hacky than the other? I see none, and I use both all the time in my C++ code.
I'd say the polymorphism approach (as you have done) is better. Consider that Database and Api might be interfaces. In that case you are explicitly declaring the API used by MyClassForTesting: someone can read the Api.java and Database.java files. And you are loosely coupling the modules: the Api and Database interfaces will naturally be the narrowest acceptable interfaces, much narrower than the public interface of any concrete class that implements them.
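As a minimal sketch of the interface idea above: the `Database` and `Api` interfaces here are hypothetical and hold only the two methods the question uses, users are reduced to plain name strings, and `RecordingApi` is an illustrative test double, not the asker's GitHub code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical narrow interfaces: only what MyClassForTesting needs.
interface Database {
    List<String> getUsers();
}

interface Api {
    void updateUserData(String user);
}

class MyClassForTesting {
    private final Database database;
    private final Api api;

    MyClassForTesting(Database database, Api api) {
        this.database = database;
        this.api = api;
    }

    void myFunctionForTesting() {
        for (String user : database.getUsers()) {
            api.updateUserData(user);
        }
    }
}

// Test double that records calls instead of contacting a remote server.
class RecordingApi implements Api {
    final List<String> updated = new ArrayList<>();

    @Override
    public void updateUserData(String user) {
        updated.add(user);
    }
}

class InterfaceInjectionDemo {
    public static void main(String[] args) {
        // A single-method interface can be faked with a lambda.
        Database fakeDatabase = () -> Arrays.asList("Jim", "Kyle");
        RecordingApi fakeApi = new RecordingApi();
        new MyClassForTesting(fakeDatabase, fakeApi).myFunctionForTesting();
        System.out.println(fakeApi.updated);
    }
}
```

Because the class under test names only the two interfaces, neither the real database nor the real API has to be designed with mocking in mind.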

More importantly, C++ does not allow virtual member function templates. Any class whose interface includes a member template therefore cannot be fully mocked by inheritance, so inheritance-based testing in C++ is unreliable: you cannot test every class that way, and a derived class cannot always be substituted for its base, especially where templates are instantiated on them. Of course, templates introduce their own problems, but I think that's beyond the scope of the question.
You're throwing inheritance at the problem, but really it's not the right solution: you only need to switch between the mock and the real implementation at compile time, not at run time. This fundamental fact makes templates the better option.
In C++, we don't forget the virtual keyword, we just don't need it, because run-time polymorphism should only occur when you need to vary the type at run-time. Else, you're firing a rocket launcher at a nail.

Related

Java OOP Public vs Private vs Protected

I understand what public, private, and protected do. I know that you are supposed to use them to comply with the concept of Object Oriented Programming, and I know how to implement them in a program using multiple classes.
My question is: Why do we do this? Why shouldn't I have one class modifying the global variables of another class directly? And even if you shouldn't why are the protected, private, and public modifiers even necessary? It's as if programmers don't trust themselves not to do it, even though they are the ones writing the program.
Thanks in advance.
You're right, it's because we can't trust ourselves. Mutable state is a major factor in complexity of computer programs, it's too easy to build something that seems ok at first and later grows out of control as the system gets bigger. Restricting access helps to reduce the opportunities for objects' states to change in unpredictable ways. The idea is for objects to communicate with each other through well-defined channels, as opposed to tweaking each others' data directly. That way we have some hope of testing the individual objects and having some confidence in how they'll behave as part of a larger system.
Let me give a basic example (this is only for illustration):
class Foo {
    void processBar() {
        Bar bar = new Bar();
        bar.value = 10;
        bar.process();
    }
}
class Bar {
    public int value;
    public void process() {
        // Say some code
        int compute = 10 / value;
        // Here you have to write some code to handle
        // the exception
    }
}
Everything looks good and you are happy. Later you realize that other developers, or other APIs you use to set the value, are setting it to 0, and this leads to an exception in your Bar.process() function.
With the implementation above there is no way to stop users from setting it to 0. Now look at the implementation below.
class Foo {
    void processBar() {
        Bar bar = new Bar();
        bar.setValue(0); // now fails fast with IllegalArgumentException
        bar.process();
    }
}
class Bar {
    private int value; // private, so every write goes through setValue
    public void setValue(int value) {
        if (value == 0)
            throw new IllegalArgumentException("value = 0 is not allowed");
        this.value = value;
    }
    public void process() {
        // Say some code
        int compute = 10 / value;
        // No need to write exception handling code here,
        // so in theory it can give you better performance too
    }
}
Not only can you now add a check, you can also throw an informative exception, which helps track down errors quickly and at an early stage.
This is just one example; the basics of OOP (encapsulation, abstraction, etc.) help you standardize the interface and hide the underlying implementation.
Keep in mind that the developer coding a given class may not be the only one using it. Teams of developers write software libraries, which in Java are commonly distributed as JARs, used by completely different teams of developers. If these standards weren't in place, it would be very difficult for others to know, at a minimum, what the intent was of any available variables / methods.
If I have a private/protected instance variable, for example, I may have a public "setter" method that checks for validity, preconditions, and performs other activities - all which would be bypassed if anyone was freely able to modify the instance variable directly.
Another good point is in the Java documentation / tutorial: http://docs.oracle.com/javase/tutorial/java/javaOO/accesscontrol.html :
Public fields tend to link you to a particular implementation and limit your flexibility in changing your code.
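To illustrate the quoted point, here is a small sketch with a hypothetical `Thermometer` class (not taken from the tutorial). Because callers only see accessor methods, the stored unit can change, say from Celsius to Kelvin, without any caller being edited.

```java
// Hypothetical example: the field is private, so its representation can change.
class Thermometer {
    // Internal representation is kelvin. An earlier version could have stored
    // celsius directly; callers of the accessors would not notice the change.
    private double kelvin;

    void setCelsius(double celsius) {
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + celsius);
        }
        this.kelvin = celsius + 273.15;
    }

    double getCelsius() {
        return kelvin - 273.15;
    }
}

class ThermometerDemo {
    public static void main(String[] args) {
        Thermometer t = new Thermometer();
        t.setCelsius(21.5);
        System.out.println(t.getCelsius());
    }
}
```

Had `kelvin` been a public field, every piece of code reading or writing it would have been tied to that one representation.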
Non-local behavior is difficult to reason about.
Minimizing surface area increases comprehension.
I don't trust myself to remember all behavioral side-effects.
I definitely don't trust "you" to understand all behavioral side-effects.
The less I expose the more flexibility I have to modify and extend.
My question is: Why do we do this?
Basically, because by restricting ourselves in this way, we make it easier ... for ourselves, and others who may need to read / modify the code in the future ... to understand the code, and the way that the various parts interact.
Most developers understand code by a mental process of abstraction; i.e. mentally drawing boundaries around bits of code, understanding each bit in isolation, and then understanding how each bit interacts with other bits. If any part of the code could potentially mess around with the "innards" of any other part of the code, then it makes it hard for the typical developer to understand what is going on.
This may not be a problem for you while you are writing the code, because you may be able to keep all of the complex interactions in your head while you create the code. But in a year or two's time, you will have forgotten a lot of the details. And other people never had the details in their heads to start with.
Why shouldn't I have one class modifying the global variables of another class directly?
Because it makes your code harder to understand; see above. The larger your codebase is, the more pronounced the problem will be.
Another point is that if you over-use globals (statics actually) then you create problems if your code needs to be multi-threaded / reentrant, for unit testing, and if you need to reuse your code in other contexts.
And even if you shouldn't why are the protected, private, and public modifiers even necessary? It's as if programmers don't trust themselves not to do it, even though they are the ones writing the program.
It is not about trust. It is about expressing in the source code where the boundaries are.
If I write a class and declare a method or a field private, I know that I don't have to consider the problem of what happens if some other class calls it / accesses it / modifies it. If I'm reading someone else's code, I know that I can (initially) ignore the private parts when mapping the interactions and boundaries. The private and protected modifiers and package private just provide different granularities of boundary.
(Or maybe it is about trust; i.e. not trusting ourselves to remember where the abstraction boundaries in our design are / were.)
There are basically two reasons:
1) There are interfaces in Java where security is required. For instance, when running a Java applet on your box you want to ensure that the applet can't access parts of the file system to which it's not authorized. Without enforceable security, an applet could reach into the Java security layer and modify its own authorities.
2) Even when everyone is "trusted", sometimes expediency trumps common sense, and programmers bypass APIs to access internal interfaces rather than getting the APIs enhanced (which admittedly can often take longer than practical). This creates problems both for stability and for upgrade compatibility.
(There's a legend, deep in the ancient history of computing, of an OS that was treated this way by the application programmers, to the extent that the programmers maintaining the OS were forced to make sure that certain code sections (not entry points, but actual internal code sequences) didn't change physical addresses when the OS was revised.)
Note that these problems existed before the OOP paradigm became common, and are some of the motivation for OOP. OOP isn't an arbitrary religious doctrine invented out of thin air, but is a set of principles that have been filtered through about 6 decades of programming experience.

from java to javascript: the object model

I'm trying to port an application I wrote in java to javascript (actually using coffeescript).
Now, I'm feeling lost.. what do you suggest to do to create class properties? Should I use getter/setters? I don't like to do this:
myObj.prop = "hello"
because I could use non-existing properties, and it would be easy to misspell something..
How can I get javascript to be a bit more like java, with private, public final properties etc..? Any suggestion?
If you just translate your Java code into JavaScript, you're going to be constantly fighting JavaScript's object model, which is prototype-based, not class-based. There are no private properties on objects, no final properties unless you're using an ES5-compatible engine (you haven't mentioned what your target runtime environment is; browsers aren't yet ES5-compatible, it'll be another couple of years), no classes at all in fact.
Instead, I recommend you thoroughly brief yourself on how object orientation actually works in JavaScript, and then build your application fully embracing how JavaScript does it. This is non-trivial, but rewarding.
Some articles that may be of use. I start with closures because really understanding closures is absolutely essential to writing JavaScript, and most "private member" solutions rely on closures. Then I refer to a couple of articles by Douglas Crockford. Crockford is required reading if you're going to work in JavaScript, even if you end up disagreeing with some of his conclusions. Then I point to a couple of articles specifically addressing doing class-like things.
Closures are not complicated - Me
Prototypical inheritance in JavaScript - Crockford
Private Members in JavaScript - Crockford
Simple, Efficient Supercalls in JavaScript - Me Includes syntactic sugar to make it easier to set up hierarchies of objects (it uses class-based terminology, but actually it's just prototypical inheritance), including calling "superclass" methods.
Private Members in JavaScript - Me Listing Crockford's solution and others
Mythical Methods - Me
You must remember this - Me
Addressing some of your specific questions:
what do you suggest to do to create class properties? Should I use getter/setters? I don't like to do this:
myObj.prop = "hello"
because I could use non-existing properties, and it would be easy to misspell something..
I don't, I prefer using TDD to ensure that if I do have a typo, it gets revealed in testing. (A good code-completing editor will also be helpful here, though really good JavaScript code-completing editors are thin on the ground.) But you're right that getters and setters in the Java sense (methods like getFoo and setFoo) would make it more obvious when you're creating/accessing a property that you haven't defined in advance (e.g., through a typo) by causing a runtime error, calling a function that doesn't exist. (I say "in the Java sense" because JavaScript as of ES5 has a different kind of "getters" and "setters" that are transparent and wouldn't help with that.) So that's an argument for using them. If you do, you might look at using Google's Closure compiler for release builds, as it will inline them.
How can I get javascript to be a bit more like java, with private...
I've linked Crockford's article on private members, and my own which lists other ways. The very basic explanation of the Crockford model is: You use a variable in the context created by the call to your constructor function and a function created within that context (a closure) that has access to it, rather than an object property:
function Foo() {
var bar;
function Foo_setBar(b) {
bar = b;
}
function Foo_getBar() {
return bar;
}
this.setBar = Foo_setBar;
this.getBar = Foo_getBar;
}
bar is not an object property, but the functions defined in the context with it have an enduring reference to it. This is totally fine if you're going to have a smallish number of Foo objects. If you're going to have thousands of Foo objects you might want to reconsider, because each and every Foo object has its own two functions (really genuinely different Function instances) for Foo_getBar and Foo_setBar.
You'll frequently see the above written like this:
function Foo() {
var bar;
this.setBar = function(b) {
bar = b;
};
this.getBar = function() {
return bar;
};
}
Yes, it's briefer, but now the functions don't have names, and giving your functions names helps your tools help you.
How can I get javascript to be a bit more like java, with...public final properties
You can define a Java-style getter with no setter. Or if your target environment will be ES5-compliant (again, browsers aren't yet, it'll be another couple of years), you could use the new Object.defineProperty feature that allows you to set properties that cannot be written to.
But my main point is to embrace the language and environment in which you're working. Learn it well, and you'll find that different patterns apply than in Java. Both are great languages (I use them both a lot), but they work differently and lead to different solutions.
You can use module pattern to make private properties and public accessors as one more option.
This doesn't directly answer your question, but I would abandon the idea of trying to make the JavaScript app like Java. They really are different languages (despite some similarities in syntax and in their name). As a general statement, it makes sense to adopt the idioms of the target language when porting something.
Currently there are many choices for you; for example, you can check the Dojo library. In Dojo, you can code much as you would in Java.
Class
JavaScript doesn't have a class system like Java's, so Dojo provides dojo.declare to simulate one. Check this page. It supports field variables, constructor methods, and extending other classes.
JavaScript has a feature that constructor functions may return any object (not necessarily this). So, your constructor function could just return a proxy object that allows access only to the public methods of your class. Using this method you can create real protected members, just like in Java (with inheritance, super() call, etc.)
I created a little library to streamline this method: http://idya.github.com/oolib/
Dojo is one option. I personally prefer Prototype. It also has a framework and API for creating classes and using inheritance in a more "java-ish" way. See the Class.create method in the API. I've used it on multiple webapps I've worked on.
I mainly agree with @Willie Wheeler that you shouldn't try too hard to make your app like Java - there are ways of using JavaScript to create things like private members etc - Douglas Crockford and others have written about this kind of thing.
I'm the author of the CoffeeScript book from PragProg. Right now, I use CoffeeScript as my primary language; I got fluent in JavaScript in the course of learning CoffeeScript. But before that, my best language was Java.
So I know what you're going through. Java has a very strong set of best practices that give you a clear idea of what good code is: clean encapsulation, fine-grained exceptions, thorough JavaDocs, and GOF design patterns all over the place. When you switch to JavaScript, that goes right out the window. There are few "best practices," and more of a vague sense of "this is elegant." Then when you start seeing bugs, it's incredibly frustrating—there are no compile-time errors, and far fewer, less precise runtime errors. It's like playing without a net. And while CoffeeScript adds some syntactic sugar that might look familiar to Java coders (notably classes), it's really no less of a leap.
Here's my advice: Learn to write good CoffeeScript/JavaScript code. Trying to make it look like Java is the path to madness (and believe me, many have tried; see: just about any JS code released by Google). Good JS code is more minimalistic. Don't use get/set methods; use exceptions sparingly; and don't use classes or design patterns for everything. JS is ultimately a more expressive language than Java is, and CoffeeScript even more so. Once you get used to the feeling of danger that comes with it, you'll like it.
One note: JavaScripters are, by and large, terrible when it comes to testing. There are plenty of good JS testing frameworks out there, but robust testing is much rarer than in the Java world. So in that regard, there's something JavaScripters can learn from Java coders. Using TDD would also be a great way of easing your concerns about how easy it is to make errors that, otherwise, wouldn't get caught until some particular part of your application runs.

What does it mean to program to an interface?

I keep hearing the statement on most programming related sites:
Program to an interface and not to an Implementation
However I don't understand the implications?
Examples would help.
EDIT: I have received a lot of good answers; even so, could you supplement them with some snippets of code for a better understanding of the subject? Thanks!
You are probably looking for something like this:
public static void main(String... args) {
// do this - declare the variable to be of type Set, which is an interface
Set buddies = new HashSet();
// don't do this - you declare the variable to have a fixed type
HashSet buddies2 = new HashSet();
}
Why is it considered good to do it the first way? Let's say later on you decide you need to use a different data structure, say a LinkedHashSet, in order to take advantage of the LinkedHashSet's functionality. The code has to be changed like so:
public static void main(String... args) {
// do this - declare the variable to be of type Set, which is an interface
Set buddies = new LinkedHashSet(); // <- change the constructor call
// don't do this - you declare the variable to have a fixed type
// here you have to change both the variable type and the constructor call
// HashSet buddies2 = new HashSet(); // old version
LinkedHashSet buddies2 = new LinkedHashSet();
}
This doesn't seem so bad, right? But what if you wrote getters the same way?
public HashSet getBuddies() {
return buddies;
}
This would have to be changed, too!
public LinkedHashSet getBuddies() {
return buddies;
}
Hopefully you see, even with a small program like this you have far-reaching implications on what you declare the type of the variable to be. With objects going back and forth so much it definitely helps make the program easier to code and maintain if you just rely on a variable being declared as an interface, not as a specific implementation of that interface (in this case, declare it to be a Set, not a LinkedHashSet or whatever). It can be just this:
public Set getBuddies() {
return buddies;
}
There's another benefit too, in that (well at least for me) the difference helps me design a program better. But hopefully my examples give you some idea... hope it helps.
One day, a junior programmer was instructed by his boss to write an application to analyze business data and condense it all in pretty reports with metrics, graphs and all that stuff. The boss gave him an XML file with the remark "here's some example business data".
The programmer started coding. A few weeks later he felt that the metrics and graphs and stuff were pretty enough to satisfy the boss, and he presented his work. "That's great" said the boss, "but can it also show business data from this SQL database we have?".
The programmer went back to coding. There was code for reading business data from XML sprinkled throughout his application. He rewrote all those snippets, wrapping them with an "if" condition:
if (dataType == "XML")
{
... read a piece of XML data ...
}
else
{
... query something from the SQL database ...
}
When presented with the new iteration of the software, the boss replied: "That's great, but can it also report on business data from this web service?" Remembering all those tedious if statements he would have to rewrite AGAIN, the programmer became enraged. "First xml, then SQL, now web services! What is the REAL source of business data?"
The boss replied: "Anything that can provide it"
At that moment, the programmer was enlightened.
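The programmer's enlightenment could be sketched roughly as follows. `BusinessDataSource`, its implementations, and `ReportGenerator` are hypothetical names, and the sources return canned numbers rather than really reading XML or querying SQL.

```java
import java.util.Arrays;
import java.util.List;

// "Anything that can provide it": the only thing the report code depends on.
interface BusinessDataSource {
    List<Integer> readSales();
}

// Stand-ins for real sources; each would hide its own parsing or querying.
class XmlDataSource implements BusinessDataSource {
    @Override
    public List<Integer> readSales() {
        return Arrays.asList(10, 20, 30); // canned data for the sketch
    }
}

class SqlDataSource implements BusinessDataSource {
    @Override
    public List<Integer> readSales() {
        return Arrays.asList(5, 15); // canned data for the sketch
    }
}

class ReportGenerator {
    private final BusinessDataSource source;

    ReportGenerator(BusinessDataSource source) {
        this.source = source;
    }

    // No if/else on the data type: any source works unchanged.
    int totalSales() {
        int total = 0;
        for (int amount : source.readSales()) {
            total += amount;
        }
        return total;
    }
}

class ReportDemo {
    public static void main(String[] args) {
        System.out.println(new ReportGenerator(new XmlDataSource()).totalSales());
        System.out.println(new ReportGenerator(new SqlDataSource()).totalSales());
    }
}
```

Supporting the web service then means writing one more `BusinessDataSource` implementation; `ReportGenerator` itself is not touched.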
An interface defines the methods an object is committed to respond to.
When you code to the interface, you can change the underlying object and your code will still work (because your code is agnostic of WHO performs the job or HOW the job is performed). You gain flexibility this way.
When you code to a particular implementation, if you need to change the underlying object your code will most likely break, because the new object may not respond to the same methods.
So to put a clear example:
If you need to hold a number of objects you might have decided to use a Vector.
If you need to access the first object of the Vector you could write:
Vector items = new Vector();
// fill it
Object first = items.firstElement();
So far so good.
Later you decide that, for "some" reason, you need to change the implementation (let's say the Vector creates a bottleneck due to excessive synchronization).
You realize you need to use an ArrayList instead.
Well, your code will break ...
ArrayList items = new ArrayList();
// fill it
Object first = items.firstElement(); // compile time error.
You can't. This line and all the other lines that use the firstElement() method would break.
If you need specific behavior and you definitely need this method, it might be ok ( although you won't be able to change the implementation ) But if what you need is to simply retrieve the first element ( that is , there is nothing special with the Vector other that it has the firstElement() method ) then using the interface rather than the implementation would give you the flexibility to change.
List items = new Vector();
// fill it
Object first = items.get( 0 ); //
In this form you are not coding to the get method of Vector, but to the get method of List.
It does not matter how the underlying object performs the method, as long as it honors the contract of "get the 0th element of the collection".
This way you may later change it to any other implementation:
List items = new ArrayList(); // Or LinkedList or any other who implements List
// fill it
Object first = items.get( 0 ); // Doesn't break
This sample might look naive, but it is the foundation on which OO technology is built (even in languages which are not statically typed, like Python, Ruby, Smalltalk, Objective-C etc.)
A more complex example is the way JDBC works. You can change the driver, but most of your calls will work the same way. For instance you could use the standard driver for Oracle databases, or a more sophisticated one like those WebLogic or WebSphere provide. Of course it isn't magic, you still have to test your product beforehand, but at least you don't have stuff like:
statement.executeOracle9iSomething();
vs
statement.executeOracle11gSomething();
Something similar happens with Java Swing.
Additional reading:
Design Principles from Design Patterns
Effective Java Item: Refer to objects by their interfaces
( Buying this book is one of the best things you could do in life - and reading it, of course - )
My initial read of that statement is very different than any answer I've read yet. I agree with all the people that say using interface types for your method params, etc are very important, but that's not what this statement means to me.
My take is that it's telling you to write code that only depends on what the interface (in this case, I'm using "interface" to mean exposed methods of either a class or interface type) you're using says it does in the documentation. This is the opposite of writing code that depends on the implementation details of the functions you're calling. You should treat all function calls as black boxes (you can make exceptions to this if both functions are methods of the same class, but ideally it is maintained at all times).
Example: suppose there is a Screen class that has Draw(image) and Clear() methods on it. The documentation says something like "the draw method draws the specified image on the screen" and "the clear method clears the screen". If you wanted to display images sequentially, the correct way to do so would be to repeatedly call Clear() followed by Draw(). That would be coding to the interface. If you're coding to the implementation, you might do something like only calling the Draw() method because you know from looking at the implementation of Draw() that it internally calls Clear() before doing any drawing. This is bad because you're now dependent on implementation details that you can't know from looking at the exposed interface.
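A minimal sketch of that Screen example, with a hypothetical `Screen` whose `draw` deliberately does not clear first, and images reduced to file-name strings. The caller relies only on the documented behaviour of the two methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical Screen whose documented contract is only:
// draw(image) draws the image; clear() clears the screen.
class Screen {
    final List<String> visible = new ArrayList<>();

    void clear() {
        visible.clear();
    }

    void draw(String image) {
        // Deliberately does NOT call clear(): the documentation never promised it.
        visible.add(image);
    }
}

class SlideShow {
    // Coded to the interface: relies only on the documented behaviour.
    static void show(Screen screen, List<String> images) {
        for (String image : images) {
            screen.clear();
            screen.draw(image);
        }
    }
}

class SlideShowDemo {
    public static void main(String[] args) {
        Screen screen = new Screen();
        SlideShow.show(screen, Arrays.asList("a.png", "b.png"));
        System.out.println(screen.visible); // only the last image remains
    }
}
```

A caller that skipped `clear()` because some other implementation of `draw` happened to clear internally would leave both images on this screen.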
I look forward to seeing if anyone else shares this interpretation of the phrase in the OP's question, or if I'm entirely off base...
It's a way to separate responsibilities / dependencies between modules.
By defining a particular Interface (an API), you ensure that the modules on either side of the interface won't "bother" one another.
For example, say Module1 will take care of displaying bank account info for a particular user, and Module2 will fetch bank account info from "whatever" back-end is used.
By defining a few types and functions, along with the associated parameters, for example a structure defining a bank transaction, and a few methods (functions) like GetLastTransactions(AccountNumber, NbTransactionsWanted, ArrayToReturnTheseRec) and GetBalance(AccountNumber), Module1 will be able to get the needed info, and not worry about how this info is stored or calculated or whatever. Conversely, Module2 will just respond to the method calls by providing the info as per the defined interface, but won't worry about where this info is to be displayed, printed or whatever...
When a module is changed, the implementation of the interface may vary, but as long as the interface remains the same, the modules using the API may at worst need to be recompiled/rebuilt, but they do not need to have their logic modified in any way.
That's the idea of an API.
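As a rough Java sketch of the two modules: AccountBackend, InMemoryBackend and AccountView are hypothetical names, the method signatures are loosely adapted from the ones mentioned above, and the hard-coded data is made up purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical API agreed on between the two modules.
interface AccountBackend {
    long getBalance(String accountNumber); // balance in cents
    List<String> getLastTransactions(String accountNumber, int count);
}

// "Module2": one possible back end, here just in-memory sample data.
class InMemoryBackend implements AccountBackend {
    private final List<String> transactions =
            Arrays.asList("-500 groceries", "+200000 salary", "-1200 fuel");

    @Override
    public long getBalance(String accountNumber) { return 198300; }

    @Override
    public List<String> getLastTransactions(String accountNumber, int count) {
        int from = Math.max(0, transactions.size() - count);
        return new ArrayList<>(transactions.subList(from, transactions.size()));
    }
}

// "Module1": renders account info and knows only the interface.
class AccountView {
    static String render(AccountBackend backend, String accountNumber) {
        return accountNumber + ": " + backend.getBalance(accountNumber)
                + " " + backend.getLastTransactions(accountNumber, 2);
    }
}
```

Swapping InMemoryBackend for a database-backed or remote implementation would not require touching AccountView, as long as the interface stays the same.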
At its core, this statement is really about dependencies. If I code my class Foo to an implementation (Bar instead of IBar) then Foo is now dependent on Bar. But if I code my class Foo to an interface (IBar instead of Bar) then the implementation can vary and Foo is no longer dependent on a specific implementation. This approach gives a flexible, loosely-coupled code base that is more easily reused, refactored and unit tested.
Take a red 2x4 Lego block and attach it to a blue 2x4 Lego block so one sits atop the other. Now remove the blue block and replace it with a yellow 2x4 Lego block. Notice that the red block did not have to change even though the "implementation" of the attached block varied.
Now go get some other kind of block that does not share the Lego "interface". Try to attach it to the red 2x4 Lego. To make this happen, you will need to change either the Lego or the other block, perhaps by cutting away some plastic or adding new plastic or glue. Notice that by varying the "implementation" you are forced to change it or the client.
Being able to let implementations vary without changing the client or the server - that is what it means to program to interfaces.
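In Java terms, the Foo/IBar dependency might look like the sketch below. RealBar and the price logic are invented for illustration; the only thing that matters is that Foo's constructor takes the interface.

```java
// Hypothetical interface and classes, named after the Foo/IBar example.
interface IBar {
    int price();
}

class RealBar implements IBar {
    @Override
    public int price() { return 100; } // imagine an expensive lookup here
}

class Foo {
    private final IBar bar; // Foo depends on the interface only

    Foo(IBar bar) { this.bar = bar; }

    // Adds a flat 10% on top of whatever the IBar reports (integer arithmetic).
    int priceWithTax() { return bar.price() + bar.price() / 10; }
}
```

A unit test can then hand Foo a stub, for example new Foo(() -> 50), without ever constructing a RealBar.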
An interface is like a contract between you and the person who made the interface that your code will carry out what they request. Furthermore, you want to code things in such a way that your solution can solve the problem many times over. Think code re-use. When you are coding to an implementation, you are thinking purely of the instance of a problem that you are trying to solve. So when under this influence, your solutions will be less generic and more focused. That will make writing a general solution that abides by an interface much more challenging.
Look, I didn't realize this was for Java, and my code is based on C#, but I believe it makes the point.
Every car has doors.
But not every door acts the same; in the UK, for example, the taxi doors open backwards. One universal fact is that they "Open" and "Close".
interface IDoor
{
void Open();
void Close();
}
class BackwardDoor : IDoor
{
public void Open()
{
// code to make the door open the "wrong way".
}
public void Close()
{
// code to make the door close properly.
}
}
class RegularDoor : IDoor
{
public void Open()
{
// code to make the door open the "proper way"
}
public void Close()
{
// code to make the door close properly.
}
}
class RedUkTaxiDoor : BackwardDoor
{
public Color Color
{
get
{
return Color.Red;
}
}
}
If you are a car door repairer, you don't care how the door looks, or if it opens one way or the other way. Your only requirement is that the door acts like a door, such as IDoor.
class DoorRepairer
{
public void Repair(IDoor door)
{
door.Open();
// Do stuff inside the car.
door.Close();
}
}
The Repairer can handle RedUkTaxiDoor, RegularDoor and BackwardDoor. And any other type of doors, such as truck doors, limousine doors.
DoorRepairer repairer = new DoorRepairer();
repairer.Repair( new RegularDoor() );
repairer.Repair( new BackwardDoor() );
repairer.Repair( new RedUkTaxiDoor() );
Apply this to lists: you have ArrayList, List<T>, and, if you want your own, MyList. They all implement the IList interface, which requires them to implement Add and Remove. So if your class adds or removes items in any given list...
class ListAdder
{
public void PopulateWithSomething(IList list)
{
list.Add("one");
list.Add("two");
}
}
ArrayList arrayList = new ArrayList();
List<object> genericList = new List<object>();
ListAdder la = new ListAdder();
la.PopulateWithSomething(arrayList);
la.PopulateWithSomething(genericList);
Allen Holub wrote a great article for JavaWorld in 2003 on this topic called Why extends is evil. His take on the "program to the interface" statement, as you can gather from his title, is that you should happily implement interfaces, but very rarely use the extends keyword to subclass. He points to, among other things, what is known as the fragile base-class problem. From Wikipedia:
a fundamental architectural problem of object-oriented programming systems where base classes (superclasses) are considered "fragile" because seemingly safe modifications to a base class, when inherited by the derived classes, may cause the derived classes to malfunction. The programmer cannot determine whether a base class change is safe simply by examining in isolation the methods of the base class.
In addition to the other answers, I add more:
You program to an interface because it's easier to handle. The interface encapsulates the behavior of the underlying class. This way, the class is a black box. Your whole real life is programming to an interface. When you use a TV, a car, a stereo, you are acting on its interface, not on its implementation details, and you assume that if the implementation changes (e.g. diesel engine or gas) the interface remains the same. Programming to an interface allows you to preserve your behavior when non-disruptive details are changed, optimized, or fixed. This also simplifies the task of documenting, learning, and using.
Also, programming to an interface allows you to delineate what is the behavior of your code before even writing it. You expect a class to do something. You can test this something even before you write the actual code that does it. When your interface is clean and done, and you like interacting with it, you can write the actual code that does things.
"Program to an interface" can be more flexible.
For example, suppose we are writing a Printer class that provides a print service. Currently there are two classes (Cat and Dog) that need to be printed, so we write code like this:
class Printer
{
public void PrintCat(Cat cat)
{
...
}
public void PrintDog(Dog dog)
{
...
}
...
}
What if a new class, Bird, also needs this print service? We have to change the Printer class to add a new method, PrintBird(). In real life, when we develop the Printer class, we may have no idea who will use it. So how should we write Printer? Programming to an interface can help; see the code below:
class Printer
{
public void Print(Printable p)
{
Bitmap bitmap = p.GetBitmap();
// print bitmap ...
}
}
With this new Printer, anything can be printed as long as it implements the Printable interface. The GetBitmap() method here is just an example. The key thing is to expose an interface, not an implementation.
Hope it's helpful.
Essentially, interfaces are the slightly more concrete representation of general concepts of interoperation - they provide the specification for what all the various options you might care to "plug in" for a particular function should do similarly so that code which uses them won't be dependent on one particular option.
For instance, many DB libraries act as interfaces in that they can operate with many different actual DBs (MSSQL, MySQL, PostgreSQL, SQLite, etc.) without the code that uses the DB library having to change at all.
Overall, it allows you to create code that's more flexible - giving your clients more options on how they use it, and also potentially allowing you to more easily reuse code in multiple places instead of having to write new specialized code.
By programming to an interface, you are more likely to apply the low coupling / high cohesion principle.
By programming to an interface, you can easily switch the implementation of that interface (the specific class).
It means that your variables, properties, parameters and return types should have an interface type instead of a concrete implementation.
Which means you use IEnumerable<T> Foo(IList mylist) instead of ArrayList Foo(ArrayList myList) for example.
Use the implementation only when constructing the object:
IList list = new ArrayList();
If you do this, you can change the object type later. Maybe you want to use LinkedList instead of ArrayList later on; this is no problem, since everywhere else you refer to it as just "IList".
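The Java counterpart of the same idea uses java.util.List. This is only a small sketch; the Names class and its upperCased method are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Names {
    // Parameter and return type are the interface, so any List implementation works.
    static List<String> upperCased(List<String> input) {
        List<String> result = new ArrayList<>(); // the only place the concrete type appears
        for (String s : input) {
            result.add(s.toUpperCase());
        }
        return result;
    }
}
```

Callers can pass an ArrayList, a LinkedList or any other List, and the concrete type created inside the method can be changed later without affecting them.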
It's basically where you make a method/interface like this: create( 'apple' ) where the method create(param) comes from an abstract class/interface fruit that is later implemented by concrete classes. This is different from subclassing. You are creating a contract that classes must fulfill. This also reduces coupling and makes things more flexible, since each concrete class implements it differently.
The client code remains unaware of the specific types of objects used and remains unaware of the classes that implement these objects. Client code only knows about the interface create(param) and it uses it to make fruit objects. It's like saying, "I don't care how you get it or make it, I just want you to give it to me."
An analogy to this is a set of on and off buttons. That is an interface on() and off(). You can use these buttons on several devices, a TV, radio, light. They all handle them differently but we don't care about that, all we care about is to turn it on or turn it off.
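One way to read the create( 'apple' ) idea in Java is sketched below. Fruit, Apple, Pear and Fruits are hypothetical names, and a static factory method is only one of several ways to arrange this.

```java
// Hypothetical contract that every fruit must fulfill.
interface Fruit {
    String describe();
}

class Apple implements Fruit {
    @Override
    public String describe() { return "apple"; }
}

class Pear implements Fruit {
    @Override
    public String describe() { return "pear"; }
}

class Fruits {
    // Client code sees only create(...) and the Fruit interface, never Apple or Pear.
    static Fruit create(String kind) {
        switch (kind) {
            case "apple": return new Apple();
            case "pear":  return new Pear();
            default: throw new IllegalArgumentException("unknown fruit: " + kind);
        }
    }
}
```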
Coding to an interface is a philosophy, rather than specific language constructs or design patterns - it instructs you what is the correct order of steps to follow in order to create better software systems (e.g. more resilient, more testable, more scalable, more extendible, and other nice traits).
What it actually means is:
===
Before jumping to implementations and coding (the HOW) - think of the WHAT:
What black boxes should make up your system,
What is each box's responsibility,
What are the ways each "client" (that is, one of those other boxes, 3rd party "boxes", or even humans) should communicate with it (the API of each box).
After you figure out the above, go ahead and implement those boxes (the HOW).
Thinking first about what a box is and what its API is leads the developer to distil the box's responsibility, and to mark for himself and future developers the difference between its exposed details ("API") and its hidden details ("implementation details"), which is a very important distinction to have.
One immediate and easily noticeable gain is the team can then change and improve implementations without affecting the general architecture. It also makes the system MUCH more testable (it goes well with the TDD approach).
===
Beyond the traits I've mentioned above, you also save A LOT OF TIME going this direction.
Micro Services and DDD, when done right, are great examples of "Coding to an interface", however the concept wins in every pattern from monoliths to "serverless", from BE to FE, from OOP to functional, etc....
I strongly recommend this approach for Software Engineering (and I basically believe it makes total sense in other fields as well).

Secret Handshake Anti-Pattern

I've just come across a pattern I've seen before, and wanted to get opinions on it. The code in question involves an interface like this:
public interface MyCrazyAnalyzer {
public void setOptions(AnalyzerOptions options);
public void setText(String text);
public void initialize();
public int getOccurances(String query);
}
And the expected usage is like this:
MyCrazyAnalyzer crazy = AnalyzerFactory.getAnalyzer();
crazy.setOptions(options); // options: an AnalyzerOptions instance built elsewhere
crazy.initialize();
Map<String, Integer> results = new HashMap<String, Integer>();
for(String item : items) {
crazy.setText(item);
results.put(item, crazy.getOccurances(query)); // query: the String being counted
}
There are reasons for some of this. The setText(...) and getOccurances(...) are there because there are multiple queries you might want to do after doing the same expensive analysis on the data, but this can be refactored to a result class.
Why I think this is so bad: the implementation is storing state in a way that isn't clearly indicated by the interface. I've also seen something similar involving an interface that required calling "prepareResult", then "getResult". Now, I can think of well-designed code that employs some of these features. The Hadoop Mapper interface extends JobConfigurable and Closeable, but I see a big difference because it's a framework that uses user code implementing those interfaces, versus a service that could have multiple implementations. I suppose anything related to including a "close" method that must be called is justified, since there isn't any other reasonable way to do it. In some cases, like JDBC, this is a consequence of a leaky abstraction, but in the two pieces of code I'm thinking of, it's pretty clearly a consequence of programmers hastily adding an interface to a spaghetti-code class to clean it up.
My questions are:
Does everyone agree this is a poorly designed interface?
Is this a described anti-pattern?
Does this kind of initialization ever belong in an interface?
Does this only seem wrong to me because I have a preference for functional style and immutability?
If this is common enough to deserve a name, I suggest the "Secret Handshake" anti-pattern for an interface that forces you to call multiple methods in a particular order when the interface isn't inherently stateful (like a Collection).
Yes, it's an anti-pattern: Sequential coupling.
I'd refactor into Options - passed to the factory, and Results, returned from an analyseText() method.
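A minimal sketch of that refactoring, under some assumptions: the caseSensitive option, the AnalysisResult name and the word-counting logic are all invented here, since the question doesn't show what the real analysis does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical options object, passed to the factory instead of a setter.
final class AnalyzerOptions {
    final boolean caseSensitive;
    AnalyzerOptions(boolean caseSensitive) { this.caseSensitive = caseSensitive; }
}

// Result of one analysis; it can be queried as often as needed.
interface AnalysisResult {
    int getOccurrences(String query);
}

interface Analyzer {
    AnalysisResult analyseText(String text); // the expensive step happens once, here
}

final class AnalyzerFactory {
    // Illustrative word-counting analyzer standing in for the real one.
    static Analyzer getAnalyzer(AnalyzerOptions options) {
        return text -> {
            Map<String, Integer> counts = new HashMap<>();
            for (String word : text.split("\\s+")) {
                if (word.isEmpty()) continue;
                String key = options.caseSensitive ? word : word.toLowerCase();
                counts.merge(key, 1, Integer::sum);
            }
            return query -> counts.getOrDefault(
                    options.caseSensitive ? query : query.toLowerCase(), 0);
        };
    }
}
```

With this shape there is no call order to remember: an Analyzer cannot exist without its options, and a result cannot exist before the text has been analysed.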
I'd expect to see the AnalyzerFactory get passed the necessary params and do the construction itself; otherwise, what exactly is it doing?
Not sure if it does have a name, but it seems like it should :)
Yes, occasionally it's convenient (and the right level of abstraction) to have setters in your interface and expect classes to call them. I'd suggest that doing so requires extensive documentation of that fact.
Not really, no. A preference for immutability is certainly a good thing, and setter/bean based design can be the "right" choice sometimes too, but your given example is taking it too far.
I'm not sure whether it's a described anti-pattern but I totally agree this is a poorly designed interface. It leaves too much opportunity for error and violates at least one key principle: make your API hard to misuse.
Besides misuse, this API can also lead to hard-to-debug errors if multiple threads make use of the same instance.
Joshua Bloch actually has an excellent presentation (36m16s and 40m30s) on API design and he addresses this as one of the characteristics of a poorly designed API.
I can't see anything bad in here. setText() prepares the stage; after that, you have one or more calls to getOccurances(). Since setText() is so expensive, I can't think of any other way to do this.
getOccurances(text, query) would fix the "secret handshake" at a tremendous performance cost. You could try to cache text in getOccurances() and only update your internal caches when the text changes, but that starts to look more and more like a sacrifice to some OO principle. If a rule doesn't make sense, then don't apply it. Software developers have a brain for a reason.
One possible solution: use fluent chaining. That avoids a class containing methods that need to be called in a certain order. It's a lot like the builder pattern, which ensures you don't read objects that are still in the middle of being populated.
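One possible reading of this suggestion is a staged fluent interface, where each call returns the type that offers only the next step. The sketch below uses made-up names (NeedsOptions, NeedsText, Ready, FluentAnalyzer) and a simple substring count in place of the real analysis.

```java
// Each step returns the type for the next step, so out-of-order calls do not compile.
interface NeedsOptions { NeedsText withOptions(boolean caseSensitive); }
interface NeedsText { Ready withText(String text); }
interface Ready { int occurrencesOf(String query); }

final class FluentAnalyzer {
    static NeedsOptions start() {
        // Nested lambdas: one per step of the chain.
        return caseSensitive -> text -> query -> {
            String haystack = caseSensitive ? text : text.toLowerCase();
            String needle = caseSensitive ? query : query.toLowerCase();
            if (needle.isEmpty()) return 0;
            int count = 0;
            // Count non-overlapping occurrences of needle in haystack.
            for (int i = haystack.indexOf(needle); i >= 0;
                 i = haystack.indexOf(needle, i + needle.length())) {
                count++;
            }
            return count;
        };
    }
}
```

A call then reads FluentAnalyzer.start().withOptions(false).withText(item).occurrencesOf(query), and skipping a step is a compile error rather than a runtime surprise.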

Absence of property syntax in Java

C# has syntax for declaring and using properties. For example, one can declare a simple property, like this:
public int Size { get; set; }
One can also put a bit of logic into the property, like this:
public string SizeHex
{
get
{
return String.Format("{0:X}", Size);
}
set
{
Size = int.Parse(value, NumberStyles.HexNumber);
}
}
Regardless of whether it has logic or not, a property is used in the same way as a field:
int fileSize = myFile.Size;
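For comparison, a rough sketch of what the same thing looks like in Java today, with explicit methods. MyFile is a hypothetical class name, and parseUnsignedInt is my choice for mirroring the hex parsing, not an exact equivalent of the C# call.

```java
// Java counterpart of the Size and SizeHex properties, written as plain methods.
class MyFile {
    private int size;

    public int getSize() { return size; }

    public void setSize(int size) { this.size = size; }

    // Formats the size as upper-case hexadecimal.
    public String getSizeHex() { return Integer.toHexString(size).toUpperCase(); }

    // Parses a hexadecimal string back into the size.
    public void setSizeHex(String hex) { size = Integer.parseUnsignedInt(hex, 16); }
}
```

The usage becomes int fileSize = myFile.getSize(); instead of a field-like access.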
I'm no stranger to either Java or C# -- I've used both quite a lot and I've always missed having property syntax in Java. I've read in this question that "it's highly unlikely that property support will be added in Java 7 or perhaps ever", but frankly I find it too much work to dig around in discussions, forums, blogs, comments and JSRs to find out why.
So my question is: can anyone sum up why Java isn't likely to get property syntax?
Is it because it's not deemed important enough when compared to other possible improvements?
Are there technical (e.g. JVM-related) limitations?
Is it a matter of politics? (e.g. "I've been coding in Java for 50 years now and I say we don't need no steenkin' properties!")
Is it a case of bikeshedding?
I think it's just Java's general philosophy towards things. Properties are somewhat "magical", and Java's philosophy is to keep the core language as simple as possible and avoid magic like the plague. This enables Java to be a lingua franca that can be understood by just about any programmer. It also makes it very easy to reason about what an arbitrary isolated piece of code is doing, and enables better tool support. The downside is that it makes the language more verbose and less expressive. This is not necessarily the right way or the wrong way to design a language, it's just a tradeoff.
For 10 years or so, Sun has resisted any significant changes to the language as hard as they could. In the same period C# has been through a riveting development, adding a host of cool new features with every release.
I think the train left on properties in Java a long time ago. They would have been nice, but we have the JavaBeans specification. Adding properties now would just make the language even more confusing. While the JavaBeans specification is, IMO, nowhere near as good, it'll have to do. And in the grander scheme of things I think properties are not really that relevant. The bloat in Java code is caused by other things than getters and setters.
There are far more important things to focus on, such as getting a decent closure standard.
Property syntax in C# is nothing more than syntactic sugar. You don't need it, it's only there as a convenience. The Java people don't like syntactic sugar. That seems to be reason enough for its absence.
Possible arguments based on nothing more than my uninformed opinion
The property syntax in C# is an ugly hack in that it mixes an implementation pattern with the language syntax.
It's not really necessary, as it's fairly trivial.
It would adversely affect anyone paid based on lines of code.
I'd actually like there to be some sort of syntactical sugar for properties, as the whole syntax tends to clutter up code that's conceptually extremely simple. Ruby for one seems to do this without much fuss.
On a side note, I've actually tried to write some medium-sized systems (a few dozen classes) without property access, just because of the reduction in clutter and the size of the codebase. Aside from the unsafe design issues (which I was willing to fudge in that case) this is nearly impossible, as every framework, every library, every everything in Java auto-discovers properties by get and set methods. They are with us until the very end of time, sort of like little syntactical training wheels.
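To illustrate what "auto-discovers properties by get and set methods" means, here is a simplified sketch using plain reflection. Customer and PropertyScanner are hypothetical names, and real frameworks use more elaborate rules (is-prefixed getters, setters, annotations and so on), so treat this only as the basic idea.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical bean with two readable properties: name and age.
class Customer {
    private String name;

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }

    public int getAge() { return 0; }
}

class PropertyScanner {
    // Derives property names from public no-argument getXxx() methods.
    static Set<String> readableProperties(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getMethods()) {
            String n = m.getName();
            boolean getter = n.startsWith("get") && n.length() > 3
                    && m.getParameterCount() == 0
                    && m.getReturnType() != void.class
                    && m.getDeclaringClass() != Object.class; // skip getClass()
            if (getter) {
                names.add(Character.toLowerCase(n.charAt(3)) + n.substring(4));
            }
        }
        return names;
    }
}
```

A class that exposes its data through public fields or differently named methods would simply be invisible to this kind of scan, which is why dropping the get/set convention is so hard in practice.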
I would say that it reflects the slowness of change in the language. As a previous commenter mentioned, with most IDEs now, it really is not that big of a deal. But there are no JVM specific reasons for it not to be there.
Might be useful to add to Java, but it's probably not as high on the list as closures.
Personally, I find that a decent IDE makes this a moot point. IntelliJ can generate all the getters/setters for me; all I have to do is embed the behavior that you did into the methods. I don't find it to be a deal breaker.
I'll admit that I'm not knowledgeable about C#, so perhaps those who are will overrule me. This is just my opinion.
If I had to guess, I'd say it has less to do with a philosophical objection to syntactic sugar (they added autoboxing, enhanced for loops, static import, etc - all sugar) than with an issue with backwards compatibility. So far at least, the Java folks have tried very hard to design the new language features in such a way that source-level backwards compatibility is preserved (i.e. code written for 1.4 will still compile, and function, without modification in 5 or 6 or beyond).
Suppose they introduce the properties syntax. What, then does the following mean:
myObj.attr = 5;
It would depend on whether you're talking about code written before or after the addition of the properties feature, and possibly on the definition of the class itself.
I'm not saying these issues couldn't be resolved, but I'm skeptical they could be resolved in a way that led to a clean, unambiguous syntax, while preserving source compatibility with previous versions.
The python folks may be able to get away with breaking old code, but that's not Java's way...
According to Volume 2 of Core Java (Forgotten the authors, but it's a very popular book), the language designers thought it was a poor idea to hide a method call behind field access syntax, and so left it out.
It's the same reason that they don't change anything else in Java - backwards-compatibility.
- Is it because it's not deemed important enough when compared to other possible improvements?
That's my guess.
- Are there technical (e.g. JVM-related) limitations?
No
- Is it a matter of politics? (e.g. "I've been coding in Java for 50 years now and I say: we don't need no steenkin' properties!")
Most likely.
- Is it a case of bikeshedding?
Uh?
One of the main goals of Java was to keep the language simple.
From Wikipedia:
Java suppresses several features [...] for classes in order to simplify the language and to prevent possible errors and anti-pattern design.
Here are a few little bits of logic that, for me, lead up to not liking properties in a language:
Some programming structures get used because they are there, even if they support bad programming practices.
Setters imply mutable objects. Something to use sparingly.
In good OO design, you ask an object to do some business logic. Properties imply that you are asking it for data and manipulating the data yourself.
Although you CAN override the methods in setters and getters, few ever do; also a final public variable is EXACTLY the same as a getter. So if you don't have mutable objects, it's kind of a moot point.
If your variable has business logic associated with it, the logic should GENERALLY be in the class with the variable. IF it does not, why in the world is it a variable??? it should be "Data" and be in a data structure so it can be manipulated by generic code.
I believe Jon Skeet pointed out that C# has a new method for handling this kind of data, Data that should be compile-time typed but should not really be variables, but being that my world has very little interaction with the C# world, I'll just take his word that it's pretty cool.
Also, I fully accept that depending on your style and the code you interact with, you just HAVE to have a set/get situation every now and then. I still average one setter/getter every class or two, but not enough to make me feel that a new programming structure is justified.
And note that I have very different requirements for work and for home programming. For work where my code must interact with the code of 20 other people I believe the more structured and explicit, the better. At home Groovy/Ruby is fine, and properties would be great, etc.
You may not need the "get" and "set" prefixes. To make it look more like properties, you can do it like this:
public class Person {
private String firstName = "";
private Integer age = 0;
public String firstName() { return firstName; } // getter
public void firstName(String val) { firstName = val; } // setter
public Integer age() { return age; } // getter
public void age(Integer val) { age = val; } //setter
public static void main(String[] args) {
Person p = new Person();
//set
p.firstName("Lemuel");
p.age(40);
//get
System.out.println(String.format("I'm %s, %d years old",
p.firstName(),
p.age()));
}
}
