I understand what public, private, and protected do. I know that you are supposed to use them to comply with the concept of Object Oriented Programming, and I know how to implement them in a program using multiple classes.
My question is: Why do we do this? Why shouldn't I have one class modifying the global variables of another class directly? And even if you shouldn't why are the protected, private, and public modifiers even necessary? It's as if programmers don't trust themselves not to do it, even though they are the ones writing the program.
Thanks in advance.
You're right, it's because we can't trust ourselves. Mutable state is a major factor in the complexity of computer programs: it's too easy to build something that seems OK at first and later grows out of control as the system gets bigger. Restricting access helps to reduce the opportunities for objects' states to change in unpredictable ways. The idea is for objects to communicate with each other through well-defined channels, as opposed to tweaking each other's data directly. That way we have some hope of testing the individual objects and having some confidence in how they'll behave as part of a larger system.
Let me give a basic example (this is only for illustration):
class Foo {
    void processBar() {
        Bar bar = new Bar();
        bar.value = 10;
        bar.process();
    }
}

class Bar {
    public int value;

    public void process() {
        // Say some code
        int compute = 10 / value;
        // Here you have to write some code to handle
        // the exception
    }
}
Everything looks good and you are happy. Later, though, you realize that other developers, or other APIs that use your class, are setting value to 0, and this leads to an exception in your Bar.process() function.
With the implementation above, there is no way you can stop callers from setting it to 0. Now look at the implementation below.
class Foo {
    void processBar() {
        Bar bar = new Bar();
        bar.setValue(0);    // fails fast, right at the point of the mistake
        bar.process();
    }
}

class Bar {
    private int value;      // callers must now go through the setter

    public void setValue(int value) {
        if (value == 0)
            throw new IllegalArgumentException("value = 0 is not allowed");
        this.value = value;
    }

    public void process() {
        // Say some code
        int compute = 10 / value;
        // No need to write exception handling code here,
        // so in theory it can give you better performance too
    }
}
Not only can you now validate the input, you can also throw an informative exception, which helps in figuring out errors quickly and at an early stage.
This is just one example; the basics of OOP (encapsulation, abstraction, etc.) help you standardize interfaces and hide the underlying implementation.
Keep in mind that the developer coding a given class may not be the only one using it. Teams of developers write software libraries, which in Java are commonly distributed as JARs, used by completely different teams of developers. If these standards weren't in place, it would be very difficult for others to know, at a minimum, what the intent was of any available variables / methods.
If I have a private/protected instance variable, for example, I may have a public "setter" method that checks for validity, preconditions, and performs other activities - all which would be bypassed if anyone was freely able to modify the instance variable directly.
Another good point is in the Java documentation / tutorial: http://docs.oracle.com/javase/tutorial/java/javaOO/accesscontrol.html :
Public fields tend to link you to a particular implementation and limit your flexibility in changing your code.
Non-local behavior is difficult to reason about.
Minimizing surface area increases comprehension.
I don't trust myself to remember all behavioral side-effects.
I definitely don't trust "you" to understand all behavioral side-effects.
The less I expose the more flexibility I have to modify and extend.
My question is: Why do we do this?
Basically, because by restricting ourselves in this way, we make it easier ... for ourselves, and others who may need to read / modify the code in the future ... to understand the code, and the way that the various parts interact.
Most developers understand code by a mental process of abstraction; i.e. mentally drawing boundaries around bits of code, understanding each bit in isolation, and then understanding how each bit interacts with other bits. If any part of the code could potentially mess around with the "innards" of any other part of the code, then it makes it hard for the typical developer to understand what is going on.
This may not be a problem for you while you are writing the code, because you may be able to keep all of the complex interactions in your head while you create the code. But in a year or two's time, you will have forgotten a lot of the details. And other people never had the details in their heads to start with.
Why shouldn't I have one class modifying the global variables of another class directly?
Because it makes your code harder to understand; see above. The larger your codebase is, the more pronounced the problem will be.
Another point is that if you over-use globals (statics actually), then you create problems if your code needs to be multi-threaded / reentrant, for unit testing, and if you need to reuse your code in other contexts.
And even if you shouldn't why are the protected, private, and public modifiers even necessary? It's as if programmers don't trust themselves not to do it, even though they are the ones writing the program.
It is not about trust. It is about expressing in the source code where the boundaries are.
If I write a class and declare a method or a field private, I know that I don't have to consider the problem of what happens if some other class calls it / accesses it / modifies it. If I'm reading someone else's code, I know that I can (initially) ignore the private parts when mapping the interactions and boundaries. The private, protected, and package-private modifiers just provide different granularities of boundary.
(Or maybe it is about trust; i.e. not trusting ourselves to remember where the abstraction boundaries in our design are / were.)
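To make those granularities concrete, here is a minimal sketch (the class and its members are invented purely for illustration):

package myapp;

public class Account {
    private long balanceCents;          // Account's business alone
    long auditTag;                      // package private: only the myapp package sees it
    protected void recalculate() {}     // subclasses (and the package) can call it
    public long getBalanceCents() {     // everyone can call it: this is the public API
        return balanceCents;
    }
}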
There are basically two reasons:
1) There are interfaces in Java where security is required. For instance, when running a Java applet on your box you want to assure that the applet can't access parts of the file system to which it's not authorized. Without enforceable security, an applet could reach into the Java security layer and modify its own authorities.
2) Even when everyone is "trusted", sometimes expediency trumps common sense, and programmers bypass APIs to access internal interfaces rather than getting the APIs enhanced (which admittedly can often take longer than practical). This creates problems both for stability and for upgrade compatibility.
(There's a legend, deep in the ancient history of computing, of an OS that was treated this way by the application programmers, to the extent that the programmers maintaining the OS were forced to make sure that certain code sections (not entry points, but actual internal code sequences) didn't change physical addresses when the OS was revised.)
Note that these problems existed before the OOP paradigm became common, and are some of the motivation for OOP. OOP isn't an arbitrary religious doctrine invented out of thin air, but is a set of principles that have been filtered through about 6 decades of programming experience.
Related
I recently refactored some code, converting a public method that was only ever used in conjunction with another public method into one call.
public class Service {
    public String getUsername() {
        return SecurityContext.getName();
    }

    public long getIdentityUserIdByUsername(String username) {
        return db.getUser(username).getId();
    }
}
which was being utilised in a few other classes as service.getIdentityUserIdByUsername(service.getUsername()), which seemed redundant. A new method was created combining the two calls.
public long getIdentityUserId() {
    return getIdentityUserIdByUsername(getUsername());
}
The getIdentityUserIdByUsername() method is still being utilised in other classes without the need for getUsername(). However, the getUsername() method is no longer used in any other class.
My example is much simpler than the real implementation; the method has test coverage that is a bit awkward to write (mocking static classes without PowerMock, a bit of googling, etc.). In the future it's likely we will need the getUsername() method again, and the method will not change.
It was suggested in code review that the getUsername() method should now be private due to it not being called anywhere else. This would require the explicit tests for the method be removed/commented out which seems like it would be repeated effort to rewrite or ugly to leave commented out code.
Is it best practice to change the method to private or leave it public because it has explicit coverage and you might need it in the future?
Is it best practice to change the method to private or leave it public because it has explicit coverage and you might need it in the future?
IMO, you are asking the wrong question. So called "best practice" doesn't come into it. (Read the references below!)
The real question is which of the alternatives is / are most likely to be best for you. That is really for you to decide. Not us.
The alternatives are:
You could remove the test case for the private method.
You could comment out the test case.
You could fix the test case so that it runs with the private version of the method.
You could leave the method as public.
To make a rational decision, you need to consider the technical and non-technical pros and cons of each alternative ... in the context of your project. But don't be too concerned about making the wrong decision. In the big picture, it is highly unlikely that making the wrong choice will have serious consequences.
Finally, I would advise to avoid dismissing options just because they are "code smell". That phrase has the same issue as "best practice". It causes you to dismiss valid options based on generalizations ... and current opinions (even fashions) on what is good or bad "practice".
Since you want someone else's opinion ("best practice" is just opinion!), mine is that all of the alternatives are valid. But my vote would be to leave the method as public. It is the least amount of work, and an unused method in an API does little harm. And as you say, there is a reasonable expectation that the method will be used in the future.
You don't need to agree with your code reviewer. (But this is not worth making enemies over ...)
References:
No Best Practices by James Bach
There is no such thing as "Best Practices": Context Matters. by Ted Neward.
It can make sense to want to test private methods. The industry standard way to do this, which has quite some advantages, is this:
Ensure that the test code lives in the same package as the code it tries to test. That doesn't mean same directory; for example, have src/main/java/pkg/MyClass.java and src/test/java/pkg/MyClassTest.java.
Make your private methods package private instead. Annotate them with @VisibleForTesting (from Guava) if you want some record of this, as sketched below.
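A minimal sketch of that layout, assuming Guava's @VisibleForTesting annotation is on the classpath (the class and method are invented for illustration):

// src/main/java/pkg/MyClass.java
package pkg;

import com.google.common.annotations.VisibleForTesting;

public class MyClass {
    @VisibleForTesting                       // records why this isn't simply private
    static int clampToPositive(int value) {  // package private, not private
        return Math.max(0, value);
    }
}

// src/test/java/pkg/MyClassTest.java is compiled into the same package,
// so the test can call MyClass.clampToPositive(-5) directly.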
Separately from this, the entry space for public methods (public in the sense of: this is part of my API and defines the access points where external code calls my code) is normally some list of entry points... if you have such a list at all. More often there is no such definition at all. One could say that all public methods in all public types implicitly form the list (i.e. that the keyword public implies that it is for consumption by external code), which then by tautology decrees that any public method has the proper signature. Not a very useful definition. In practice, the keyword public does not have to mean 'this is API accessible'. Various module systems (such as Jigsaw or OSGi) have solutions for this, generally by letting you declare certain packages as actually public.
With such tooling, 'treeshaking' your public methods to point out that they need no longer be public makes sense. Without them... you can't really do this. There is such a notion as 'this method is never called in my codebase, but it is made available to external callers; callers that I don't have available here, and the point is that this is released, and there are perhaps projects that haven't even started being written yet which are intended to call this'.
Assuming you do have the tree-shaking concept going, you can still leave them in for that 'okay maybe not today but tomorrow perhaps' angle. If that applies, leave it in. If you can't imagine any use case where external code needs access to it, just delete it. If it really needs to be recovered, hey, there's always the history in version control.
If the method is a public static, then you can leave it as is, because there is no impact from it being public. It is a side-effect-free method, so exposing it will never cause any harm.
If it is an object-level public method, then:
1) Keep it if it is like an API: it has well-defined input and output, delivers a well-defined piece of functionality, and has tests associated with it. It being public doesn't harm anything.
2) Make it private immediately if it has side effects. If it causes other methods to behave differently because it changes the state of the object, then it is harmful for it to be public.
I am in an Introduction to Java class, and I was doing a bit of research on variables. It seems that knowledgeable programmers state that it is bad practice to define variables with public visibility. I see them stating it is bad practice, but I cannot find a rhyme or reason for their claims. This is how I defined my variables in an application for my course.
public class DykhoffWk3Calculator
{
    /*
     * This class is used to define the variables in a static form so all
     * classes can access them.
     */
    public static double commissionRate = .03, startSalary = 45000,
            accelerationFactor = 1.25;
    public static double annualSales, commissionTotal, totalCompensation,
            total, count, count2;
    private static Object input;
    Object keyboard;

    public static class UserInput
    { //Then continue with my other classes
I thought this was a logical method of defining them so all classes, not just main, could access them. Can someone explain to me why this is bad practice, and where variables should be defined? Any assistance would be greatly appreciated.
In short: because all of your public "surface area" for a class effectively defines its API.
If you expose things through methods, then you can change the details of how they work later. But if you expose a field, and some other class outside of your control (and quite possibly outside of your knowledge) starts referencing that field, then you're stuck with exposing that field for the rest of time. Or until you decide to break backwards-compatibility.
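As a sketch (the class is hypothetical): a method keeps the API stable even when the stored representation changes, which a public field cannot do.

public class Temperature {
    // Originally stored as: private double celsius;
    private double kelvin;          // representation changed later; no caller notices

    public double getCelsius() {    // the exposed API stays exactly the same
        return kelvin - 273.15;
    }
}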
I thought this was a logical method of defining them so all classes, not just main, could access them.
As a general rule, you don't want "all classes" to access them. The vast majority of work with software is spent maintaining code, not writing it for the first time. And so experienced developers realise that the best practices for code are generally the ones that make it most maintainable, not necessarily the ones that make it most convenient to write in the first place.
And if you have a variable that could be accessed from anywhere, at any time, and you want to make some tweaks to how it is modified - how can you be sure that this is safe? How long will it take you to track down all the ways that this is referenced, and determine what the effects of your change will be? (And specific to public fields, you can kiss goodbye to any sort of reusability regarding running at the same time from multiple threads, or running reentrantly.)
Broadly speaking, reducing the "surface area" of classes is a really good thing to do. Restricting the ways that other classes can interact with this one, makes it much easier to control and understand the relationships, as well as making it easier to make internal changes "invisible" to those other classes. Think about what this class does, what it will provide to other classes, as defining an interface (whether an actual interface or not). And only expose to other classes, the bare minimum that is required to fulfil those requirements.
And that never involves letting them have arbitrary access to variables.
So the general point is that you in fact DON'T want anyone to be able to access those values. Not only can I see those variables, but I can also change them to anything I like. This can lead to problems in larger, more complex programs.
Furthermore, if you wanted to later change how the class uses/stores these values, you couldn't without having to go out and change all the other classes that access those public variables directly. Instead, you should offer methods that provide just the amount of access that you want to give.
The standard analogy is that of driving a car. You know how to turn the wheel, hit the brake, etc, but not how the car actually does these things. So if the engine needed to be dramatically changed, or you got in a new car, then you'd still know how to drive. You don't need to worry about what's happening behind the scenes.
Firstly, you state it wrong.
It's bad to make your variable public, i.e.:
public String name = null; // this is bad
You should always do it as:
private String name = null;
To understand why, you need to dig a bit into the ideology of OOP.
The OOP ideology states that each object of your class has two things:
Properties: something which we also call variables or state.
Behavior: something which we call methods or functions.
Properties identify the object over a period of time. Behaviors allow you to manage the properties of the object, so that the same object can appear to be in different states over time. E.g. a Product object can, over a period of time, be an 'Available line item', 'Added to cart', 'Sold', or 'Out of stock', depending on its state. Since state is critically important to the object, the object should not allow direct, nonsensical mutation of its state. Objects should keep their variables private and expose behaviors that the outside world can use to interact with the object; the state then changes only as part of the operation executed in the behavior. E.g. calling the addToCart() behavior on a Product object that was in the 'Available line item' state would probably mean not just changing its state to 'Added to cart', but probably also making other users aware that the number of this Product now available is one less.
So, long story short: don't expose properties directly to the outside world for mutation unless needed. This means don't make them public, and also don't provide setter methods if they are not needed.
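A rough sketch of that Product idea (the states and names are invented for illustration):

public class Product {
    public enum State { AVAILABLE, IN_CART, SOLD, OUT_OF_STOCK }

    private State state = State.AVAILABLE;   // private: no direct mutation from outside
    private int unitsAvailable = 10;

    // The behavior guards the state change instead of exposing a setter.
    public void addToCart() {
        if (state != State.AVAILABLE || unitsAvailable == 0) {
            throw new IllegalStateException("product is not available");
        }
        state = State.IN_CART;
        unitsAvailable--;   // the related bookkeeping happens in one place
    }
}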
By convention, fields, methods, and constructors declared public (least restrictive) within a public class are visible to any class in the Java program, whether those classes are in the same package or in another package. This means that a change in the value of a public field can come from anywhere and will affect every other class accessing that field, thus breaking the whole sense of encapsulation.
Public variables in a class are generally a bad idea, since they mean other classes/programs can modify the state of instances.
Since it is the responsibility of a class to protect its state and ensure the state is "consistent", one can enforce this by defining public setters (since this allows running code to check/repair the state).
By making the variables public, the state is not protected. If, later on, not all representable states are valid states, one has a problem.
Example:
Say you want to implement an ArrayList<T>, then it will look like (not fully implemented):
public class ArrayList<T> {
    public int size = 0;
    public Object[] data = new Object[5];
}
Now one can modify the size of the ArrayList without adding an element. If you then ask the ArrayList<T> instance to remove/add/copy/...whatever, the data on which it works can be wrong.
Perhaps you can claim that a programmer is nice: he will not modify the object unless he needs to, and only according to the "rules". But such things eventually always go wrong. And what if you decide to modify your definition of the ArrayList (for instance, using two ints for the size)? In that case you would need to rewrite all the code that sets such fields.
To conclude: private/protected was invented to protect a class instance from other instances that would turn the instance corrupt/invalid/inconsistent/...
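For contrast, a sketch of the encapsulated version, where size can only ever change in step with the data it describes:

public class MyArrayList<T> {
    private int size = 0;
    private Object[] data = new Object[5];

    public void add(T element) {
        if (size == data.length) {
            data = java.util.Arrays.copyOf(data, size * 2);   // grow when full
        }
        data[size++] = element;   // size and data always change together
    }

    public int size() {
        return size;
    }
}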
This is a question of curiosity about accepted coding practices. I'm (primarily) a Java developer, and have been increasingly making efforts to unit test my code. I've spent some time looking at how to write the most testable code, paying particular attention to Google's How to write untestable code guide (well worth a look, if you haven't seen it).
Naturally, I was arguing recently with a more C++-oriented friend about the advantages of each language's inheritance model, and I thought I'd pull out a trump card by saying how much harder C++ programmers made it to test their code by constantly forgetting the virtual keyword (for C++ers - this is the default in Java; you get rid of it using final).
I posted a code example that I thought would demonstrate the advantages of Java's model quite well (the full thing is over on GitHub). The short version:
class MyClassForTesting {
    private final Database mDatabase;
    private final Api mApi;

    MyClassForTesting(Database usersDatabase, Api remoteApi) {
        mDatabase = usersDatabase;
        mApi = remoteApi;
    }

    void myFunctionForTesting() {
        for (User u : mDatabase.getUsers()) {
            mApi.updateUserData(u);
        }
    }
}
Regardless of the quality of what I've written here, the idea is that the class needs to make some (potentially quite expensive) calls to a database, and some API (maybe on a remote web server). myFunctionForTesting() doesn't have a return type, so how do you unit test this? In Java, I think the answer isn't too difficult - we mock:
/*** Tests ***/
/*
* This will record some stuff and we'll check it later to see that
* the things we expect really happened.
*/
ActionRecorder ar = new ActionRecorder();

/** Mock up some classes **/
// Note: an anonymous class cannot declare a constructor, so instead of
// passing ar in, these mocks capture the (effectively final) ar above.
Database mockedDatabase = new Database() {
    @Override
    public Set<User> getUsers() {
        ar.recordAction("got list of users");
        return new HashSet<User>(Arrays.asList(new User("Jim"), new User("Kyle")));
    }
};

Api mockApi = new Api() {
    @Override
    public void updateUserData(User u) {
        ar.recordAction("Updated user data for " + u.name());
    }
};

/** Carry out the tests with the mocked up classes **/
MyClassForTesting testObj = new MyClassForTesting(mockedDatabase, mockApi);
testObj.myFunctionForTesting();

// Check that it really fetches users from the database
assert ar.contains("got list of users");
// Check that it is checking the users we passed it
assert ar.contains("Updated user data for Jim");
assert ar.contains("Updated user data for Kyle");
By mocking up these classes, we inject the dependencies with our own light-weight versions that we can make assertions on for unit testing, and avoid making expensive, time-consuming calls to database/api-land. The designers of Database and Api don't have to be too aware that this is what we're going to do, and the designer of MyClassForTesting certainly doesn't have to know! This seems (to me) like a pretty good way to do things.
My C++ friend, however, retorted that this was a dreadful hack, and there's a good reason C++ won't let you do this! He then presented a solution based on Generics, which does much the same thing. For brevity's sake, I'll just list a part of the solution he gave, but again you can find the whole thing over on Github.
template<typename A, typename D>
class MyClassForTesting {
private:
    A mApi;
    D mDatabase;

public:
    MyClassForTesting(D database, A api) {
        mApi = api;
        mDatabase = database;
    }
    ...
};
Which would then be tested much like before, but with the important bits that get replaced shown below:
class MockDatabase : public Database {
    ...
};

class MockApi : public Api {
    ...
};

MyClassForTesting<MockApi, MockDatabase>
    testingObj(MockApi(ar), MockDatabase(ar));
So my question is this: What's the preferred method? I always thought the polymorphism-based approach was better - and I see no reason it wouldn't be in Java - but is it normally considered better to use Generics than Virtualise everything in C++? What do you do in your code (assuming you do unit test) ?
I'm probably biased, but I'd say the C++ version is better. Among other things, polymorphism carries some cost. In this case, you're making your users pay that cost, even though they receive no direct benefit from it.
If, for example, you had a list of polymorphic objects, and want to manipulate all of them via the base class, that would justify using polymorphism. In this case, however, the polymorphism is being used for something the user never even sees. You've built in the ability to manipulate polymorphic objects, but never really used it -- for testing you'll only have mock objects, and for real use you'll only have real objects. There will never be a time that you have (for example) an array of database objects, some of which are mock databases and others of which are real databases.
This is also much more than just an efficiency issue (or at least a run-time efficiency issue). The relationships in your code should be meaningful. When somebody sees (public) inheritance, that should tell them something about the design. As you've outlined it in Java, however, the public inheritance relationship involved is basically a lie -- i.e. what he should know from it (that you're dealing with polymorphic descendants) is an outright falsehood. The C++ code, by contrast, correctly conveys the intent to the reader.
To an extent, I'm overstating the case there, of course. People who normally read Java are almost certainly well accustomed to the way inheritance is typically abused, so they don't see this as a lie at all. This is a bit of throwing out the baby with the bathwater, though - instead of seeing the "lie" for what it is, they've learned to completely ignore what inheritance really means (or just never knew, especially if they went to college where Java was the primary vehicle for teaching OOP). As I said, I'm probably somewhat biased, but to me this makes (most) Java code much more difficult to understand. You basically have to be careful to ignore the basic principles of OOP, and get accustomed to its constant abuse.
Some key advice is "prefer composition to inheritance", which is what your MyClassForTesting has done with respect to the Database and Api. This is good C++ advice too: IIRC it is in Effective C++.
It is a bit rich for your friend to claim that using polymorphism is a "dreadful hack" but using templates is not. On what basis does (s)he claim that one is less hacky than the other? I see none, and I use both all the time in my C++ code.
I'd say the polymorphism approach (as you have done) is better. Consider that Database and Api might be interfaces. In that case you are explicitly declaring the API used by MyClassForTesting: someone can read the Api.java and Database.java files. And you are loosely coupling the modules: the Api and Database interfaces will naturally be the narrowest acceptable interfaces, much narrower than the public interface of any concrete class that implements them.
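For instance, the two dependencies could be declared as narrow interfaces. This is only a sketch, reusing the names from the question:

import java.util.Set;

interface Database {
    Set<User> getUsers();            // all that MyClassForTesting actually needs
}

interface Api {
    void updateUserData(User u);     // likewise: one narrow, documented channel
}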
More importantly, you cannot create templated virtual functions. This makes it impossible to test C++ functions that use templates by using inheritance, so testing by inheritance in C++ is unreliable: you cannot test all classes that way, and definitely not every use of a base class can be substituted with a derived class, especially with respect to instantiating templates of them. Of course, templates introduce their own problems, but I think that's beyond the scope of the question.
You're throwing inheritance at the problem but really it's not the right solution- you only need to change between the mock and the real at compile time, not at run time. This fundamental fact makes templates the better option.
In C++, we don't forget the virtual keyword, we just don't need it, because run-time polymorphism should only occur when you need to vary the type at run-time. Else, you're firing a rocket launcher at a nail.
I keep hearing the statement on most programming related sites:
Program to an interface and not to an Implementation
However, I don't understand the implications.
Examples would help.
EDIT: I have received a lot of good answers; even so, could you supplement them with some snippets of code for a better understanding of the subject? Thanks!
You are probably looking for something like this:
public static void main(String... args) {
    // do this - declare the variable to be of type Set, which is an interface
    Set<String> buddies = new HashSet<String>();

    // don't do this - you declare the variable to have a fixed type
    HashSet<String> buddies2 = new HashSet<String>();
}
Why is it considered good to do it the first way? Let's say later on you decide you need to use a different data structure, say a LinkedHashSet, in order to take advantage of the LinkedHashSet's functionality. The code has to be changed like so:
public static void main(String... args) {
    // do this - declare the variable to be of type Set, which is an interface
    Set<String> buddies = new LinkedHashSet<String>(); // <- change the constructor call

    // don't do this - you declare the variable to have a fixed type
    // this way you have to change both the variable type and the constructor call
    // HashSet<String> buddies2 = new HashSet<String>(); // old version
    LinkedHashSet<String> buddies2 = new LinkedHashSet<String>();
}
This doesn't seem so bad, right? But what if you wrote getters the same way?
public HashSet<String> getBuddies() {
    return buddies;
}
This would have to be changed, too!
public LinkedHashSet<String> getBuddies() {
    return buddies;
}
Hopefully you see, even with a small program like this you have far-reaching implications on what you declare the type of the variable to be. With objects going back and forth so much it definitely helps make the program easier to code and maintain if you just rely on a variable being declared as an interface, not as a specific implementation of that interface (in this case, declare it to be a Set, not a LinkedHashSet or whatever). It can be just this:
public Set<String> getBuddies() {
    return buddies;
}
There's another benefit too, in that (well at least for me) the difference helps me design a program better. But hopefully my examples give you some idea... hope it helps.
One day, a junior programmer was instructed by his boss to write an application to analyze business data and condense it all in pretty reports with metrics, graphs and all that stuff. The boss gave him an XML file with the remark "here's some example business data".
The programmer started coding. A few weeks later he felt that the metrics and graphs and stuff were pretty enough to satisfy the boss, and he presented his work. "That's great" said the boss, "but can it also show business data from this SQL database we have?".
The programmer went back to coding. There was code for reading business data from XML sprinkled throughout his application. He rewrote all those snippets, wrapping them with an "if" condition:
if (dataType.equals("XML"))
{
    ... read a piece of XML data ...
}
else
{
    ... query something from the SQL database ...
}
When presented with the new iteration of the software, the boss replied: "That's great, but can it also report on business data from this web service?" Remembering all those tedious if statements he would have to rewrite AGAIN, the programmer became enraged. "First xml, then SQL, now web services! What is the REAL source of business data?"
The boss replied: "Anything that can provide it"
At that moment, the programmer was enlightened.
An interface defines the methods an object commits to responding to.
When you code to the interface, you can change the underlying object and your code will still work (because your code is agnostic of WHO performs the job or HOW the job is performed). You gain flexibility this way.
When you code to a particular implementation, if you need to change the underlying object your code will most likely break, because the new object may not respond to the same methods.
So to put a clear example:
If you need to hold a number of objects you might have decided to use a Vector.
If you need to access the first object of the Vector you could write:
Vector items = new Vector();
// fill it
Object first = items.firstElement();
So far so good.
Later you decide that, for "some" reason, you need to change the implementation (let's say the Vector creates a bottleneck due to excessive synchronization).
You realize you need to use an ArrayList instead.
Well, your code will break ...
ArrayList items = new ArrayList();
// fill it
Object first = items.firstElement(); // compile time error.
You can't. This line, and all those lines that use the firstElement() method, would break.
If you need specific behavior and you definitely need this method, it might be OK (although you won't be able to change the implementation). But if what you need is simply to retrieve the first element (that is, there is nothing special about the Vector other than that it has the firstElement() method), then using the interface rather than the implementation gives you the flexibility to change.
List items = new Vector();
// fill it
Object first = items.get( 0 );
In this form you are not coding to the get method of Vector, but to the get method of List.
It does not matter how the underlying object performs the method, as long as it responds to the contract of "get the 0th element of the collection".
This way you may later change it to any other implementation:
List items = new ArrayList(); // Or LinkedList or any other that implements List
// fill it
Object first = items.get( 0 ); // Doesn't break
This sample might look naive, but it is the base on which OO technology rests (even in those languages which are not statically typed, like Python, Ruby, Smalltalk, Objective-C, etc.)
A more complex example is the way JDBC works. You can change the driver, but most of your calls will work the same way. For instance, you could use the standard driver for Oracle databases, or you could use a more sophisticated one like those WebLogic or WebSphere provide. Of course it isn't magical - you still have to test your product first - but at least you don't have stuff like:
statement.executeOracle9iSomething();
vs
statement.executeOracle11gSomething();
Something similar happens with Java Swing.
Additional reading:
Design Principles from Design Patterns
Effective Java Item: Refer to objects by their interfaces
(Buying this book is one of the best things you could do in life - and reading it, of course.)
My initial read of that statement is very different than any answer I've read yet. I agree with all the people that say using interface types for your method params, etc are very important, but that's not what this statement means to me.
My take is that it's telling you to write code that only depends on what the interface (in this case, I'm using "interface" to mean exposed methods of either a class or interface type) you're using says it does in the documentation. This is the opposite of writing code that depends on the implementation details of the functions you're calling. You should treat all function calls as black boxes (you can make exceptions to this if both functions are methods of the same class, but ideally it is maintained at all times).
Example: suppose there is a Screen class that has Draw(image) and Clear() methods on it. The documentation says something like "the draw method draws the specified image on the screen" and "the clear method clears the screen". If you wanted to display images sequentially, the correct way to do so would be to repeatedly call Clear() followed by Draw(). That would be coding to the interface. If you're coding to the implementation, you might do something like only calling the Draw() method because you know from looking at the implementation of Draw() that it internally calls Clear() before doing any drawing. This is bad because you're now dependent on implementation details that you can't know from looking at the exposed interface.
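A sketch of the difference in Java (Screen, Image, and the method names are hypothetical, following the example above):

import java.util.List;

interface Image {}                   // placeholder for some image type

interface Screen {
    void draw(Image image);          // documented: "draws the specified image"
    void clear();                    // documented: "clears the screen"
}

class SlideShow {
    // Coding to the interface: rely only on the documented contract,
    // never on draw() happening to call clear() internally.
    void showSequentially(Screen screen, List<Image> images) {
        for (Image image : images) {
            screen.clear();
            screen.draw(image);
        }
    }
}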
I look forward to seeing if anyone else shares this interpretation of the phrase in the OP's question, or if I'm entirely off base...
It's a way to separate responsibilities / dependencies between modules.
By defining a particular Interface (an API), you ensure that the modules on either side of the interface won't "bother" one another.
For example, say Module1 will take care of displaying bank account info for a particular user, and Module2 will fetch bank account info from "whatever" back-end is used.
By defining a few types and functions, along with the associated parameters - for example a structure defining a bank transaction, and a few methods (functions) like GetLastTransactions(AccountNumber, NbTransactionsWanted, ArrayToReturnTheseRec) and GetBalance(AccountNumber) - Module1 will be able to get the needed info without worrying about how this info is stored or calculated. Conversely, Module2 will just respond to the method calls by providing the info as per the defined interface, without worrying about where this info is to be displayed, printed or whatever...
When a module is changed, the implementation of the interface may vary, but as long as the interface remains the same, the modules using the API may at worst need to be recompiled/rebuilt, but they do not need to have their logic modified in any way.
That's the idea of an API.
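Sketched in Java (the names follow the example above; the types are invented, and the array-out parameter is replaced by a return value, as is usual in Java):

import java.math.BigDecimal;
import java.util.List;

// The transaction record the two modules agree on (fields invented).
class Transaction {
    String description;
    BigDecimal amount;
}

// Module2 implements this; Module1 only ever calls it and never
// learns which back-end sits behind it.
interface BankAccountService {
    List<Transaction> getLastTransactions(String accountNumber, int nbTransactionsWanted);
    BigDecimal getBalance(String accountNumber);
}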
At its core, this statement is really about dependencies. If I code my class Foo to an implementation (Bar instead of IBar) then Foo is now dependent on Bar. But if I code my class Foo to an interface (IBar instead of Bar) then the implementation can vary and Foo is no longer dependent on a specific implementation. This approach gives a flexible, loosely-coupled code base that is more easily reused, refactored and unit tested.
Take a red 2x4 Lego block and attach it to a blue 2x4 Lego block so one sits atop the other. Now remove the blue block and replace it with a yellow 2x4 Lego block. Notice that the red block did not have to change even though the "implementation" of the attached block varied.
Now go get some other kind of block that does not share the Lego "interface". Try to attach it to the red 2x4 Lego. To make this happen, you will need to change either the Lego or the other block, perhaps by cutting away some plastic or adding new plastic or glue. Notice that by varying the "implementation" you are forced to change it or the client.
Being able to let implementations vary without changing the client or the server - that is what it means to program to interfaces.
An interface is like a contract between you and the person who made the interface that your code will carry out what they request. Furthermore, you want to code things in such a way that your solution can solve the problem many times over. Think code re-use. When you are coding to an implementation, you are thinking purely of the instance of a problem that you are trying to solve. So when under this influence, your solutions will be less generic and more focused. That will make writing a general solution that abides by an interface much more challenging.
Look, I didn't realize this was for Java, and my code is based on C#, but I believe it makes the point.
Every car has doors.
But not every door acts the same; in the UK, for example, taxi doors open backwards. One universal fact is that they "Open" and "Close".
interface IDoor
{
    void Open();
    void Close();
}

class BackwardDoor : IDoor
{
    public void Open()
    {
        // code to make the door open the "wrong way".
    }
    public void Close()
    {
        // code to make the door close properly.
    }
}

class RegularDoor : IDoor
{
    public void Open()
    {
        // code to make the door open the "proper way"
    }
    public void Close()
    {
        // code to make the door close properly.
    }
}

class RedUkTaxiDoor : BackwardDoor
{
    public Color Color
    {
        get
        {
            return Color.Red;
        }
    }
}
If you are a car door repairer, you don't care how the door looks, or whether it opens one way or the other. Your only requirement is that the door acts like a door, i.e. IDoor.
class DoorRepairer
{
    public void Repair(IDoor door)
    {
        door.Open();
        // Do stuff inside the car.
        door.Close();
    }
}
The Repairer can handle RedUkTaxiDoor, RegularDoor and BackwardDoor. And any other type of doors, such as truck doors, limousine doors.
DoorRepairer repairer = new DoorRepairer();
repairer.Repair( new RegularDoor() );
repairer.Repair( new BackwardDoor() );
repairer.Repair( new RedUkTaxiDoor() );
Apply this to lists: you have ArrayList, the generic List<T>, and, if you want your own, MyList. They all implement the IList interface, which requires them to implement Add and Remove. So if your class adds or removes items in any given list...
class ListAdder
{
    public void PopulateWithSomething(IList list)
    {
        list.Add("one");
        list.Add("two");
    }
}

ArrayList array = new ArrayList();
List<string> list = new List<string>();
ListAdder la = new ListAdder();
la.PopulateWithSomething(array);
la.PopulateWithSomething(list);
Allen Holub wrote a great article for JavaWorld in 2003 on this topic called Why extends is evil. His take on the "program to the interface" statement, as you can gather from his title, is that you should happily implement interfaces, but very rarely use the extends keyword to subclass. He points to, among other things, what is known as the fragile base-class problem. From Wikipedia:
a fundamental architectural problem of object-oriented programming systems where base classes (superclasses) are considered "fragile" because seemingly safe modifications to a base class, when inherited by the derived classes, may cause the derived classes to malfunction. The programmer cannot determine whether a base class change is safe simply by examining in isolation the methods of the base class.
In addition to the other answers, I add more:
You program to an interface because it's easier to handle. The interface encapsulates the behavior of the underlying class. This way, the class is a black box. Your whole real life is programming to an interface. When you use a TV, a car, or a stereo, you are acting on its interface, not on its implementation details, and you assume that if the implementation changes (e.g. a diesel engine instead of gas), the interface remains the same. Programming to an interface allows you to preserve your behavior when non-disruptive details are changed, optimized, or fixed. This also simplifies the task of documenting, learning, and using.
Also, programming to an interface allows you to delineate what is the behavior of your code before even writing it. You expect a class to do something. You can test this something even before you write the actual code that does it. When your interface is clean and done, and you like interacting with it, you can write the actual code that does things.
"Program to an interface" can be more flexible.
For example, we are writing a class Printer which provides a print service. Currently there are two classes (Cat and Dog) that need to be printed. So we write code like the below
class Printer
{
    public void PrintCat(Cat cat)
    {
        ...
    }

    public void PrintDog(Dog dog)
    {
        ...
    }

    ...
}
What if a new class Bird also needs this print service? We would have to change the Printer class to add a new method, PrintBird(). In a real case, when we develop the Printer class, we may have no idea who will use it. So how do we write Printer? Programming to an interface can help; see the code below
class Printer
{
    public void Print(Printable p)
    {
        Bitmap bitmap = p.GetBitmap();
        // print bitmap ...
    }
}
With this new Printer, everything can be printed as long as it implements the interface Printable. Here the method GetBitmap() is just an example. The key thing is to expose an interface, not an implementation.
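For completeness, the assumed contract might look like this, sketched in Java to match the rest of the thread (Bitmap is a stand-in for whatever image type the printer consumes):

class Bitmap {}   // placeholder for a real image type

// The one thing the Printer requires of anything it prints.
interface Printable {
    Bitmap getBitmap();
}

// A new class plugs in without any change to Printer.
class Bird implements Printable {
    public Bitmap getBitmap() {
        return new Bitmap();   // render this bird; details omitted
    }
}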
Hope it's helpful.
Essentially, interfaces are the slightly more concrete representation of general concepts of interoperation - they provide the specification for what all the various options you might care to "plug in" for a particular function should do similarly so that code which uses them won't be dependent on one particular option.
For instance, many DB libraries act as interfaces in that they can operate with many different actual DBs (MSSQL, MySQL, PostgreSQL, SQLite, etc.) without the code that uses the DB library having to change at all.
Overall, it allows you to create code that's more flexible - giving your clients more options on how they use it, and also potentially allowing you to more easily reuse code in multiple places instead of having to write new specialized code.
By programming to an interface, you are more likely to apply the low coupling / high cohesion principle.
By programming to an interface, you can easily switch the implementation of that interface (the specific class).
It means that your variables, properties, parameters and return types should have an interface type instead of a concrete implementation.
Which means you use IEnumerable<T> Foo(IList mylist) instead of ArrayList Foo(ArrayList myList) for example.
Use the implementation only when constructing the object:
IList list = new ArrayList();
If you have done this, you can later change the object type: maybe you want to use LinkedList instead of ArrayList later on. This is no problem, since everywhere else you refer to it as just "IList".
It's basically where you make a method/interface like this: create('apple'), where the method create(param) comes from an abstract class/interface Fruit that is later implemented by concrete classes. This is different from subclassing. You are creating a contract that classes must fulfill. This also reduces coupling and makes things more flexible, with each concrete class implementing it differently.
The client code remains unaware of the specific types of objects used, and unaware of the classes that implement these objects. Client code only knows about the interface create(param), and it uses it to make fruit objects. It's like saying: "I don't care how you get it or make it, I just want you to give it to me."
An analogy to this is a set of on and off buttons. That is an interface: on() and off(). You can use these buttons on several devices - a TV, a radio, a light. They all handle them differently, but we don't care about that; all we care about is turning the thing on or off.
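That analogy translates almost literally into code. A sketch, with the device classes invented:

import java.util.List;

interface Switchable {
    void on();
    void off();
}

class Tv implements Switchable {
    public void on()  { /* power up the screen */ }
    public void off() { /* power it down */ }
}

class Light implements Switchable {
    public void on()  { /* close the circuit */ }
    public void off() { /* open the circuit */ }
}

class PowerButton {
    // The caller neither knows nor cares which device it is holding.
    static void shutDownAll(List<Switchable> devices) {
        for (Switchable d : devices) {
            d.off();
        }
    }
}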
Coding to an interface is a philosophy, rather than specific language constructs or design patterns - it instructs you what is the correct order of steps to follow in order to create better software systems (e.g. more resilient, more testable, more scalable, more extendible, and other nice traits).
What it actually means is:
===
Before jumping to implementations and coding (the HOW) - think of the WHAT:
What black boxes should make up your system,
What is each box's responsibility,
What are the ways each "client" (that is, one of those other boxes, 3rd party "boxes", or even humans) should communicate with it (the API of each box).
After you figure the above, go ahead and implement those boxes (the HOW).
Thinking first about what a box is and what its API is leads the developer to distil the box's responsibility, and to mark, for himself and future developers, the difference between its exposed details ("API") and its hidden details ("implementation details"), which is a very important distinction to have.
One immediate and easily noticeable gain is the team can then change and improve implementations without affecting the general architecture. It also makes the system MUCH more testable (it goes well with the TDD approach).
===
Beyond the traits I've mentioned above, you also save A LOT OF TIME going this direction.
Micro Services and DDD, when done right, are great examples of "Coding to an interface", however the concept wins in every pattern from monoliths to "serverless", from BE to FE, from OOP to functional, etc....
I strongly recommend this approach for Software Engineering (and I basically believe it makes total sense in other fields as well).
C# has syntax for declaring and using properties. For example, one can declare a simple property, like this:
public int Size { get; set; }
One can also put a bit of logic into the property, like this:
public string SizeHex
{
    get
    {
        return String.Format("{0:X}", Size);
    }
    set
    {
        Size = int.Parse(value, NumberStyles.HexNumber);
    }
}
Regardless of whether it has logic or not, a property is used in the same way as a field:
int fileSize = myFile.Size;
I'm no stranger to either Java or C# -- I've used both quite a lot and I've always missed having property syntax in Java. I've read in this question that "it's highly unlikely that property support will be added in Java 7 or perhaps ever", but frankly I find it too much work to dig around in discussions, forums, blogs, comments and JSRs to find out why.
So my question is: can anyone sum up why Java isn't likely to get property syntax?
Is it because it's not deemed important enough when compared to other possible improvements?
Are there technical (e.g. JVM-related) limitations?
Is it a matter of politics? (e.g. "I've been coding in Java for 50 years now and I say we don't need no steenkin' properties!")
Is it a case of bikeshedding?
I think it's just Java's general philosophy towards things. Properties are somewhat "magical", and Java's philosophy is to keep the core language as simple as possible and avoid magic like the plague. This enables Java to be a lingua franca that can be understood by just about any programmer. It also makes it very easy to reason about what an arbitrary isolated piece of code is doing, and enables better tool support. The downside is that it makes the language more verbose and less expressive. This is not necessarily the right way or the wrong way to design a language, it's just a tradeoff.
For 10 years or so, Sun resisted any significant changes to the language as hard as they could. In the same period C# has been through riveting development, adding a host of cool new features with every release.
I think the train left on properties in Java a long time ago. They would have been nice, but we have the JavaBeans specification. Adding properties now would just make the language even more confusing. While the JavaBeans specification IMO is nowhere near as good, it'll have to do. And in the grander scheme of things, I think properties are not really that relevant. The bloat in Java code is caused by other things than getters and setters.
There are far more important things to focus on, such as getting a decent closure standard.
Property syntax in C# is nothing more than syntactic sugar. You don't need it, it's only there as a convenience. The Java people don't like syntactic sugar. That seems to be reason enough for its absence.
Possible arguments, based on nothing more than my uninformed opinion:
The property syntax in C# is an ugly hack, in that it mixes an implementation pattern with the language syntax.
It's not really necessary, as it's fairly trivial.
It would adversely affect anyone paid based on lines of code.
I'd actually like there to be some sort of syntactical sugar for properties, as the whole syntax tends to clutter up code that's conceptually extremely simple. Ruby for one seems to do this without much fuss.
On a side note, I've actually tried to write some medium-sized systems (a few dozen classes) without property access, just because of the reduction in clutter and the size of the codebase. Aside from the unsafe design issues (which I was willing to fudge in that case), this is nearly impossible, as every framework, every library, every everything in Java auto-discovers properties by get and set methods. They are with us until the very end of time, sort of like little syntactical training wheels.
I would say that it reflects the slowness of change in the language. As a previous commenter mentioned, with most IDEs now, it really is not that big of a deal. But there are no JVM specific reasons for it not to be there.
Might be useful to add to Java, but it's probably not as high on the list as closures.
Personally, I find that a decent IDE makes this a moot point. IntelliJ can generate all the getters/setters for me; all I have to do is embed the behavior that you did into the methods. I don't find it to be a deal breaker.
I'll admit that I'm not knowledgeable about C#, so perhaps those who are will overrule me. This is just my opinion.
If I had to guess, I'd say it has less to do with a philosophical objection to syntactic sugar (they added autoboxing, enhanced for loops, static import, etc - all sugar) than with an issue with backwards compatibility. So far at least, the Java folks have tried very hard to design the new language features in such a way that source-level backwards compatibility is preserved (i.e. code written for 1.4 will still compile, and function, without modification in 5 or 6 or beyond).
Suppose they introduce the properties syntax. What, then does the following mean:
myObj.attr = 5;
It would depend on whether you're talking about code written before or after the addition of the properties feature, and possibly on the definition of the class itself.
I'm not saying these issues couldn't be resolved, but I'm skeptical they could be resolved in a way that led to a clean, unambiguous syntax, while preserving source compatibility with previous versions.
The python folks may be able to get away with breaking old code, but that's not Java's way...
According to Volume 2 of Core Java (I've forgotten the authors, but it's a very popular book), the language designers thought it was a poor idea to hide a method call behind field access syntax, and so they left it out.
It's the same reason that they don't change anything else in Java - backwards-compatibility.
- Is it because it's not deemed important enough when compared to other possible improvements?
That's my guess.
- Are there technical (e.g. JVM-related) limitations?
No
- Is it a matter of politics? (e.g. "I've been coding in Java for 50 years now and I say: we don't need no steenkin' properties!")
Most likely.
- Is it a case of bikeshedding?
Uh?
One of the main goals of Java was to keep the language simple.
From the: Wikipedia
Java suppresses several features [...] for classes in order to simplify the language and to prevent possible errors and anti-pattern design.
Here are a few little bits of logic that, for me, lead up to not liking properties in a language:
Some programming structures get used because they are there, even if they support bad programming practices.
Setters imply mutable objects - something to use sparingly.
In good OO design you ask an object to do some business logic for you. Properties imply that you are asking it for data and manipulating the data yourself.
Although you CAN override the methods in setters and getters, few ever do; also a final public variable is EXACTLY the same as a getter. So if you don't have mutable objects, it's kind of a moot point.
If your variable has business logic associated with it, the logic should GENERALLY be in the class with the variable. If it does not, why in the world is it a variable??? It should be "data", and live in a data structure so it can be manipulated by generic code.
I believe Jon Skeet pointed out that C# has a new method for handling this kind of data, Data that should be compile-time typed but should not really be variables, but being that my world has very little interaction with the C# world, I'll just take his word that it's pretty cool.
Also, I fully accept that depending on your style and the code you interact with, you just HAVE to have a set/get situation every now and then. I still average one setter/getter every class or two, but not enough to make me feel that a new programming structure is justified.
And note that I have very different requirements for work and for home programming. For work where my code must interact with the code of 20 other people I believe the more structured and explicit, the better. At home Groovy/Ruby is fine, and properties would be great, etc.
You don't need the "get" and "set" prefixes; to make it look more like properties, you can do it like this:
public class Person {
    private String firstName = "";
    private Integer age = 0;

    public String firstName() { return firstName; }         // getter
    public void firstName(String val) { firstName = val; }  // setter

    public Integer age() { return age; }                    // getter
    public void age(Integer val) { age = val; }             // setter

    public static void main(String[] args) {
        Person p = new Person();
        // set
        p.firstName("Lemuel");
        p.age(40);
        // get
        System.out.println(String.format("I'm %s, %d years old",
                p.firstName(),
                p.age()));
    }
}