I'm just beginning to learn object-oriented programming in Java. I have already programmed a little in C++, and one of the things I miss most in Java is the ability to return multiple values. It's true that C++ functions strictly return only one value, but we can use by-reference parameters to return many more. In Java we can't do such a thing, at least not for primitive types.
The solution I thought of was to create a class grouping the variables I wanted to return and to return an instance of that class. For example, I needed to look for an object in an array and I wanted to return a boolean (found or not) and an index. I know I could do this by just setting the index to -1 if nothing was found, but I think the other way is clearer.
The thing is that I was told by someone who knows much more about Java than I do that I shouldn't create classes for the purpose of returning multiple values (even if they are related). He told me that classes should never be used like C++ structs, just to group elements. He also said methods shouldn't return non-primitive objects; they should receive the object from the outside and only modify it. Which of these things are true?
I shouldn't create classes for the purpose of returning multiple values
classes should never be used as C++ structs, just to group elements.
methods shouldn't return non-primitive objects, they should receive the object from the outside and only modify it
None of the above statements is true. Data objects are useful, and in fact it is good practice to separate pure data from classes containing heavy logic.
In Java the closest thing we have to a struct is a POJO (Plain Old Java Object), commonly known as a data class in other languages. These classes are simply a grouping of data. A rule of thumb for a POJO is that it should only contain primitives, simple types (String, boxed primitives, etc.), simple containers (map, array, list, etc.), or other POJO classes. Basically, classes which can easily be serialized.
It's common to want to pair two, three, or n objects together. Sometimes the data is significant enough to warrant an entirely new class, and sometimes not. In the latter case programmers often use Pair or Tuple classes. Here is a quick example of a two-element generic tuple.
public class Tuple2<T,U>{
private final T first;
private final U second;
public Tuple2(T first, U second) {
this.first = first;
this.second = second;
}
public T getFirst() { return first; }
public U getSecond() { return second; }
}
A class which uses a tuple as part of a method signature may look like:
public interface Container<T> {
...
public Tuple2<Boolean, Integer> search(T key);
}
A downside to creating data classes like this is that, for quality of life, we have to implement things like toString, hashCode, equals, getters, setters, constructors, etc. For each different sized tuple you have to make a new class (Tuple2, Tuple3, Tuple4, etc.). Writing all of these methods by hand can introduce subtle bugs into our applications. For these reasons developers will often avoid creating data classes.
Libraries like Lombok can be very helpful for overcoming these challenges. Our definition of Tuple2, with all of the methods listed above, can be written as:
@Data
public class Tuple2<T,U>{
private final T first;
private final U second;
}
This also makes it extremely easy to create custom response classes. Using custom classes can avoid autoboxing with generics and greatly increase readability, e.g.:
@Data
public class SearchResult {
private final boolean found;
private final int index;
}
...
public interface Container<T> {
...
public SearchResult search(T key);
}
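For readers without Lombok, here is a rough hand-written sketch of what such a result class could look like in plain Java, together with a search that returns it. The names PlainSearchResult and ListSearches are made up for illustration, and the accessor names only mirror what I would expect Lombok to generate for these fields.

```java
import java.util.List;
import java.util.Objects;

// Hand-written, immutable counterpart of the SearchResult idea (illustrative names).
final class PlainSearchResult {
    private final boolean found;
    private final int index;

    PlainSearchResult(boolean found, int index) {
        this.found = found;
        this.index = index;
    }

    boolean isFound() { return found; }
    int getIndex() { return index; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlainSearchResult)) return false;
        PlainSearchResult that = (PlainSearchResult) o;
        return found == that.found && index == that.index;
    }

    @Override
    public int hashCode() { return Objects.hash(found, index); }

    @Override
    public String toString() {
        return "PlainSearchResult(found=" + found + ", index=" + index + ")";
    }
}

// Hypothetical helper: a linear search that reports both values at once.
final class ListSearches {
    private ListSearches() { }

    static <T> PlainSearchResult search(List<T> items, T key) {
        int i = items.indexOf(key); // -1 when the key is absent
        return new PlainSearchResult(i >= 0, i);
    }
}
```

The caller then checks isFound() before using getIndex(), which is the readability gain the question was after.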
methods should receive the object from the outside and only modify it
This is bad advice. It's much nicer to design data around immutability. From Effective Java, 2nd Edition, p. 75:
Immutable objects are simple. An immutable object can be in exactly one state, the state in which it was created. If you make sure that all constructors establish class invariants, then it is guaranteed that these invariants will remain true for all time, with no further effort on your part or on the part of the programmer who uses the class. Mutable objects, on the other hand, can have arbitrarily complex state spaces. If the documentation does not provide a precise description of the state transitions performed by mutator methods, it can be difficult or impossible to use a mutable class reliably.
Immutable objects are inherently thread-safe; they require no synchronization. They cannot be corrupted by multiple threads accessing them concurrently. This is far and away the easiest approach to achieving thread safety. In fact, no thread can ever observe any effect of another thread on an immutable object. Therefore, immutable objects can be shared freely.
As to your specific example ("how to return both error status and result?")
I needed to look for an object in an array and I wanted to return a boolean (found or not) and an index. I know I could do this by just setting the index to -1 if nothing was found, but I think the other way is clearer.
Returning special invalid result values such as -1 for "not found" is indeed very common, and I agree with you that it is not too pretty.
However, returning a tuple of (statusCode, resultValue) is not the only alternative.
The most idiomatic way to report exceptional conditions in Java is to, you guessed it, use exceptions. So return a result or, if no result can be produced, throw an exception (NoSuchElementException in this case). Whether this is appropriate depends on the application: you don't want to throw exceptions for "correct" input; they should be reserved for irregular cases.
Functional languages often have built-in data structures for this (such as Try, Option or Either), which essentially also do statusCode + resultValue internally, but make sure that you actually check the status code before trying to access the result value. Java now has Optional as well. If I wanted to go this route, I'd pull in these wrapper types from a library and not make up my own ad-hoc "structs" (because that would only confuse people).
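As a small sketch of the Optional route for the array search in the question: the class and method names below are invented for illustration, and only OptionalInt and NoSuchElementException come from the JDK.

```java
import java.util.NoSuchElementException;
import java.util.OptionalInt;

final class ArraySearch {
    private ArraySearch() { }

    // Returns the index of the first match, or an empty OptionalInt if the key is absent.
    static OptionalInt indexOf(int[] values, int key) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] == key) {
                return OptionalInt.of(i);
            }
        }
        return OptionalInt.empty();
    }

    // Exception-based variant for callers that treat "not found" as an irregular case.
    static int requireIndexOf(int[] values, int key) {
        return indexOf(values, key)
                .orElseThrow(() -> new NoSuchElementException("No element equal to " + key));
    }
}
```

The caller has to go through isPresent(), getAsInt() or orElse(...) to reach the index, so the "status" cannot be silently ignored the way a -1 can.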
"methods shouldn't return non-primitive objects , they should receive the object from the outside and only modify it"
That may be very traditional OOP thinking, but even within OOP the use of immutable data absolutely has its value (the only sane way to do thread-safe programming in my book), so the guideline to modify stuff in-place is pretty terrible. If something is considered a "data object" (as opposed to "an entity") you should prefer to return modified copies instead of mutating the input.
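To make "return modified copies" concrete, here is a minimal sketch with a made-up ImmutablePoint class. The withX and translate method names are just one possible convention, not something prescribed by the language.

```java
// Immutable data object: "modifying" it produces a new instance instead.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int getX() { return x; }
    int getY() { return y; }

    // Returns a copy with a different x; the receiver is left untouched.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    // Returns a copy shifted by the given offsets.
    ImmutablePoint translate(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);
    }
}
```

Because the original instance never changes, it can be shared between threads or cached without defensive copying.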
For static information you can use the static final option. Variables declared static final can be accessed from everywhere.
Otherwise it is usual and good practice to use the getter/setter concept to read and set fields in your classes.
Strictly speaking, it is a language limitation that Java does not natively support tuples as return values (see related discussion here). This was done to keep the language cleaner, and the same decision was made in most other languages. Of course, this was done keeping in mind that, when necessary, such behaviour can be implemented by available means. So here are the options (all of them except the second one allow combining return components of arbitrary types, not necessarily primitive):
1. Use classes (usually static, self-made or predefined) specifically designed to contain a group of related values being returned. This option is well covered in other answers.
2. Combine, if possible, two or more primitive values into one return value. Two ints can be combined into a single long, four bytes can be combined into a single int, a boolean and an unsigned int less than Integer.MAX_VALUE can be combined into a signed int (look, for example, at how the Arrays.binarySearch(...) methods return their results), a positive double and a boolean can be combined into a single signed double, etc. On return, extract the components via comparisons (if a boolean is among them) and bit operations (for shifted integer components).
2a. One particular case is worth noting separately. It is a common (and widely used) convention to return null to indicate that the returned value is, in fact, invalid. Strictly speaking, this convention substitutes for a two-field result: one implicit boolean field that you use when checking
if (returnValue != null)
and the other non-primitive field (which can be just a wrapper of a primitive field) containing the result itself. You use it after the above checking:
ResultClass result = returnValue;
3. If you don't want to mess with data classes, you can always return an array of Objects:
public Object[] returnTuple() {
return new Object[]{1234, "Text", true};
}
and then typecast its components to desired types:
public void useTuple() {
Object[] t = returnTuple();
int x = (int)t[0];
String s = (String)t[1];
boolean b = (boolean)t[2];
System.out.println(x + ", " + s + ", " + b);
}
4. You can introduce field(s) into your class to hold auxiliary return component(s) and return only the main component explicitly (you decide which one is the main component):
public class LastResultAware {
public static boolean found;
public static int errorCode;
public static int findLetter(String src, char letter) {
int i = src.toLowerCase().indexOf(Character.toLowerCase(letter));
found = i >= 0;
return i;
}
public static int findUniqueLetter(String src, char letter) {
src = src.toLowerCase();
letter = Character.toLowerCase(letter);
int i = src.indexOf(letter);
if (i < 0)
errorCode = -1; // not found
else {
int j = src.indexOf(letter, i + 1);
if (j >= 0)
errorCode = -2; // ambiguous result
else
errorCode = 0; // success
}
return i;
}
public static void main(String[] args) {
int charIndex = findLetter("ABC", 'b');
if (found)
System.out.println("Letter is at position " + charIndex);
charIndex = findUniqueLetter("aBCbD", 'b');
if (errorCode == 0)
System.out.println("Letter is only at position " + charIndex);
}
}
Note that in some cases it is better to throw an exception indicating an error than to return an error code which the caller may just forget to check.
Depending on usage, these return-extending fields may be either static or instance fields. When static, they can even be used by multiple classes to serve a common purpose and avoid unnecessary field creation. For example, one public static int errorCode may be enough. Be warned, however, that this approach is not thread-safe.
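The option of combining primitives into one return value can be sketched as follows. Packing is a made-up helper class; the second half only decodes what Arrays.binarySearch already returns, namely the index when found and -(insertion point) - 1 otherwise.

```java
import java.util.Arrays;

final class Packing {
    private Packing() { }

    // Packs two ints into one long: 'high' in the upper 32 bits, 'low' in the lower 32.
    static long pack(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    static int high(long packed) { return (int) (packed >> 32); }

    static int low(long packed) { return (int) packed; }

    // Decoding the combined boolean + index result of Arrays.binarySearch.
    static boolean wasFound(int binarySearchResult) {
        return binarySearchResult >= 0;
    }

    static int insertionPoint(int binarySearchResult) {
        return -binarySearchResult - 1; // only meaningful when wasFound(...) is false
    }

    public static void main(String[] args) {
        long packed = pack(-3, 7);
        System.out.println(high(packed) + ", " + low(packed));

        int r = Arrays.binarySearch(new int[] {10, 20, 30}, 25);
        System.out.println(wasFound(r) + ", would be inserted at " + insertionPoint(r));
    }
}
```

This is compact and allocation-free, but the encoding lives only in the caller's head, which is why the other options are usually easier to read.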
Say I have a List of objects which were defined using lambda expressions (closures). Is there a way to inspect them so they can be compared?
The code I am most interested in is
List<Strategy> strategies = getStrategies();
Strategy a = (Strategy) this::a;
if (strategies.contains(a)) { // ...
The full code is
import java.util.Arrays;
import java.util.List;
public class ClosureEqualsMain {
interface Strategy {
void invoke(/*args*/);
default boolean equals(Object o) { // doesn't compile
return Closures.equals(this, o);
}
}
public void a() { }
public void b() { }
public void c() { }
public List<Strategy> getStrategies() {
return Arrays.asList(this::a, this::b, this::c);
}
private void testStrategies() {
List<Strategy> strategies = getStrategies();
System.out.println(strategies);
Strategy a = (Strategy) this::a;
// prints false
System.out.println("strategies.contains(this::a) is " + strategies.contains(a));
}
public static void main(String... ignored) {
new ClosureEqualsMain().testStrategies();
}
enum Closures {;
public static <Closure> boolean equals(Closure c1, Closure c2) {
// This doesn't compare the contents
// like others immutables e.g. String
return c1.equals(c2);
}
public static <Closure> int hashCode(Closure c) {
return // a hashCode which can detect duplicates for a Set<Strategy>
}
public static <Closure> String asString(Closure c) {
return // something better than Object.toString();
}
}
public String toString() {
return "my-ClosureEqualsMain";
}
}
It would appear the only solution is to define each lambda as a field and only use those fields. If you want to print out the method called, you are better off using Method. Is there a better way with lambda expressions?
Also, is it possible to print a lambda and get something human readable? If you print this::a instead of
ClosureEqualsMain$$Lambda$1/821270929@3f99bd52
get something like
ClosureEqualsMain.a()
or even use this.toString and the method.
my-ClosureEqualsMain.a();
This question could be interpreted relative to the specification or the implementation. Obviously, implementations could change, but you might be willing to rewrite your code when that happens, so I'll answer at both.
It also depends on what you want to do. Are you looking to optimize, or are you looking for ironclad guarantees that two instances are (or are not) the same function? (If the latter, you're going to find yourself at odds with computational physics, in that even problems as simple as asking whether two functions compute the same thing are undecidable.)
From a specification perspective, the language spec promises only that the result of evaluating (not invoking) a lambda expression is an instance of a class implementing the target functional interface. It makes no promises about the identity, or degree of aliasing, of the result. This is by design, to give implementations maximal flexibility to offer better performance (this is how lambdas can be faster than inner classes; we're not tied to the "must create unique instance" constraint that inner classes are.)
So basically, the spec doesn't give you much, except obviously that two lambdas that are reference-equal (==) are going to compute the same function.
From an implementation perspective, you can conclude a little more. There is (currently, may change) a 1:1 relationship between the synthetic classes that implement lambdas, and the capture sites in the program. So two separate bits of code that capture "x -> x + 1" may well be mapped to different classes. But if you evaluate the same lambda at the same capture site, and that lambda is non-capturing, you get the same instance, which can be compared with reference equality.
If your lambdas are serializable, they'll give up their state more easily, in exchange for sacrificing some performance and security (no free lunch.)
One area where it might be practical to tweak the definition of equality is with method references because this would enable them to be used as listeners and be properly unregistered. This is under consideration.
I think what you're trying to get to is: if two lambdas are converted to the same functional interface, are represented by the same behavior function, and have identical captured args, they're the same
Unfortunately, this is both hard to do (for non-serializable lambdas, you can't get at all the components of that) and not enough (because two separately compiled files could convert the same lambda to the same functional interface type, and you wouldn't be able to tell.)
The EG discussed whether to expose enough information to be able to make these judgments, as well as discussing whether lambdas should implement more selective equals/hashCode or more descriptive toString. The conclusion was that we were not willing to pay anything in performance cost to make this information available to the caller (bad tradeoff, punishing 99.99% of users for something that benefits .01%).
A definitive conclusion on toString was not reached but left open to be revisited in the future. However, there were some good arguments made on both sides on this issue; this is not a slam-dunk.
To compare lambdas I usually let the interface extend Serializable and then compare the serialized bytes. Not very nice, but it works in most cases.
I don't see any way to get that information from the closure itself.
Closures don't expose their state.
But you can use Java reflection if you want to inspect and compare the methods.
Of course that is not a very beautiful solution, because of the performance cost and the exceptions that have to be caught. But this way you get that meta-information.
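A rough sketch of the serialized-bytes comparison might look like this. SerializableStrategy and LambdaBytes are hypothetical names, and the approach assumes method references to accessible methods; two lambda bodies that merely look alike would generally serialize differently, because each body compiles to its own synthetic method.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class LambdaBytes {
    private LambdaBytes() { }

    // Hypothetical functional interface; extending Serializable makes its lambdas serializable.
    interface SerializableStrategy extends Serializable {
        void invoke();
    }

    static void a() { }
    static void b() { }

    // Separate capture sites that refer to the same or to different methods.
    static SerializableStrategy refToA() { return LambdaBytes::a; }
    static SerializableStrategy anotherRefToA() { return LambdaBytes::a; }
    static SerializableStrategy refToB() { return LambdaBytes::b; }

    // Serializes the object with standard Java serialization.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Treats two serializable lambdas as "the same" when their serialized forms match.
    static boolean sameSerializedForm(Serializable x, Serializable y) {
        return Arrays.equals(serialize(x), serialize(y));
    }
}
```

Every comparison pays for a full serialization, so this is something for registration or deduplication code, not for hot paths.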
I used to define a set of related constants like Bundle keys together in an interface like below:
public interface From{
String LOGIN_SCREEN = "LoginSCreen";
String NOTIFICATION = "Notification";
String WIDGET = "widget";
}
This gives me a nicer way to group related constants together and use them via a static import (not implements). I know the Android framework also uses constants in the same way, like Toast.LENGTH_LONG and View.GONE.
However, I often feel that Java enums provide a much better and more powerful way to represent constants.
But is there a performance issue in using enums on Android?
With a bit of research I ended up confused. From the question "Avoid Enums Where You Only Need Ints" removed from Android's performance tips? it's clear that Google has removed "Avoid enums" from its performance tips, but the "Be aware of memory overhead" section of its official training docs clearly says: "Enums often require more than twice as much memory as static constants. You should strictly avoid using enums on Android." Does this still hold (say, in Java versions after 1.6)?
One more issue I observed is that to send enums across intents using a Bundle I have to serialize them (i.e. putSerializable()), which I think is an expensive operation compared to the primitive putString() method, even though enums provide serialization for free.
Can someone please clarify which one is the best way to represent the same in Android? Should I strictly avoid using enums on Android?
Use enum when you need its features. Don't avoid it strictly.
Java enums are more powerful, but if you don't need their features, use constants: they occupy less space and they can be primitives themselves.
When to use enum:
type checking - you can accept only listed values, and they are not continuous (see below what I call continuous here)
method overriding - every enum constant can have its own implementation of a method
public enum UnitConverter{
METERS{
@Override
public double toMiles(final double meters){
return meters * 0.00062137D;
}
@Override
public double toMeters(final double meters){
return meters;
}
},
MILES{
@Override
public double toMiles(final double miles){
return miles;
}
@Override
public double toMeters(final double miles){
return miles / 0.00062137D;
}
};
public abstract double toMiles(double unit);
public abstract double toMeters(double unit);
}
more data - a single constant contains more than one piece of information, which cannot be put in one variable
complicated data - your constant needs methods to operate on the data
When not to use enum:
you can accept all values of one type, and your constants contain only the most commonly used ones
you can accept continuous data
public class Month{
public static final int JANUARY = 1;
public static final int FEBRUARY = 2;
public static final int MARCH = 3;
...
public static String getName(final int month){
if(month <= 0 || month > 12){
throw new IllegalArgumentException("Invalid month number: " + month);
}
...
}
}
for names (like in your example)
for everything else that really doesn't need an enum
Enums occupy more space
a single reference to an enum constant occupies 4 bytes
every enum constant occupies space that is a sum of its fields' sizes aligned to 8 bytes + overhead of the object
the enum class itself occupies some space
Constants occupy less space
a constant doesn't have a reference, so it's pure data (even if it's a reference, an enum instance would then be a reference to another reference)
constants may be added to an existing class - it's not necessary to add another class
constants may be inlined; it brings extended compile-time features (such as null checking, finding dead code etc.)
If the enums simply have values, you should try to use IntDef/StringDef, as shown here:
https://developer.android.com/studio/write/annotations.html#enum-annotations
Example: instead of:
enum NavigationMode {NAVIGATION_MODE_STANDARD, NAVIGATION_MODE_LIST, NAVIGATION_MODE_TABS}
you use:
@IntDef({NAVIGATION_MODE_STANDARD, NAVIGATION_MODE_LIST, NAVIGATION_MODE_TABS})
@Retention(RetentionPolicy.SOURCE)
public @interface NavigationMode {}
public static final int NAVIGATION_MODE_STANDARD = 0;
public static final int NAVIGATION_MODE_LIST = 1;
public static final int NAVIGATION_MODE_TABS = 2;
and in the function that has it as a parameter/returned value, use:
@NavigationMode
public abstract int getNavigationMode();
public abstract void setNavigationMode(@NavigationMode int mode);
In case the enum is complex, use an enum. It's not that bad.
To compare enums vs constant values, you should read here:
http://hsc.com/Blog/Best-Practices-For-Memory-Optimization-on-Android-1
Their example is of an enum with 2 values. It takes 1112 bytes in the dex file, compared to 128 bytes when constant integers are used. That makes sense, as enums are real classes, unlike in C/C++.
With Android P, Google has no restriction on or objection to using enums.
The documentation has changed: it used to recommend caution, but it no longer mentions it.
https://developer.android.com/reference/java/lang/Enum
In addition to the previous answers, I would add that if you are using Proguard (and you definitely should, to reduce size and obfuscate your code), then your enums will be automatically converted to integer constants, much like the @IntDef approach, wherever possible:
https://www.guardsquare.com/en/proguard/manual/optimizations
class/unboxing/enum
Simplifies enum types to integer constants, whenever possible.
Therefore, if you have some discrete values and a method should accept only these values and not others of the same type, then I would use an enum, because Proguard will do the manual work of optimizing the code for me.
And here is a good post about using enums from Jake Wharton, take a look at it.
As a library developer, I recognize these small optimizations that should be done as we want to have as little impact on the consuming app's size, memory, and performance as possible. But it's important to realize that [...] putting an enum in your public API vs. integer values where appropriate is perfectly fine. Knowing the difference to make informed decisions is what's important
Should I strictly avoid using enums on Android?
No. "Strictly" means they are so bad that they should not be used at all. A performance issue might possibly arise in an extreme situation, such as many, many (thousands or millions of) consecutive operations with enums on the UI thread. Far more common are network I/O operations, which should strictly happen in a background thread.
The most common usage of enums is probably some kind of type check (whether an object is this or that), which is so fast that you won't be able to notice a difference between a single comparison of enums and a comparison of integers.
Can someone please clarify which one is the best way to represent the same in Android?
There is no general rule of thumb for this. Use whatever works for you and helps you get your app ready. Optimize later - after you notice there's a bottleneck that slows some aspect of your app.
I'd like to add that you cannot use @IntDef-style annotations when you declare a List<> or Map<> where either the key or the value is of one of your annotation interfaces.
You get the error "Annotations are not allowed here".
enum Values { One, Two, Three }
Map<String, Values> myMap; // This works
// ... but ...
public static final int ONE = 1;
public static final int TWO = 2;
public static final int THREE = 3;
@Retention(RetentionPolicy.SOURCE)
@IntDef({ONE, TWO, THREE})
public @interface Values {}
Map<String, @Values Integer> myMap; // *** ERROR ***
So when you need to pack values into a list/map, use an enum, since enums can be added, but @-annotated int/string groups cannot.
Two facts.
1. Enum is one of the most powerful features in Java.
2. Android phones usually have a LOT of memory.
So my answer is NO. I will use enums on Android.
Suppose you have written a class and have used lazy initialization to assign one of its fields. Suppose that the computation for that field only involves the other fields and is guaranteed to produce the same result every time. When two equal instances of the class encounter one another, it makes sense for them to share the value of the lazily initialized field (if either knows it). You could do this in the equals() method. Here is a class showing what I mean.
final class MyClass {
private final int number;
private String string;
MyClass(int number) {
this.number = number;
}
String getString() {
if (string == null) {
string = OtherClass.expensiveCalculation(number);
}
return string;
}
@Override
public boolean equals(Object object) {
if (object == this) { return true; }
if (!(object instanceof MyClass)) { return false; }
MyClass that = (MyClass) object;
if (that.number != number) { return false; }
String thatString = that.string;
if (string == null && thatString != null) {
string = thatString;
} else if (thatString == null && string != null) {
that.string = string;
}
return true;
}
@Override
public int hashCode() { return number; }
}
To me, this information-sharing seems the logical thing to do if you are going to go to the effort of lazily initializing a field, yet I have never seen an example of anyone using the equals() method in this way.
Is it a common or standard technique? If so, what is it called? If it is not a common technique, can I ask (at the risk of having the question put on hold as primarily opinion-based) what people think about it? Is it a good idea to use the equals() method to do anything other than check for equality?
This looks dangerous to me: using a side effect of a public method of Object to set an object's state. This will break if you subclass this class and then override equals in the subclass, a common thing to do. Just don't do this.
"Suppose that the computation for that field only involves the other fields and is guaranteed to produce the same result every time."
Given this supposition, you can assert that the value of the lazily initialized field does not matter because if the values of the other fields are the same, the calculated value will also be the same.
Edit
I guess I sidestepped the original question, so I'll answer that too. In the scenario you've created, there is nothing inherently wrong with what you're proposing.
The argument I would make is simply from a pragmatic standpoint: what happens when someone else is changing the definition of getString() (or more likely - changing the definition of the long running calculation that results in that value) and it starts relying on something that's not part of the object's equality considerations?
The reason conventional wisdom says that equals() should be side effect free is that most developers expect it to be side effect free.
I would not do this, for three reasons:
General software-engineering principles, such as cohesion, loose coupling, and "don't repeat yourself", militate against it: your equals(...) method will be doing something not very "equals"-y that overlaps with the logic of your getString() method. Someone updating the logic of getString() might well fail to notice that they also need to update the logic of equals(...). (You might think that the logic of equals(...) will continue to be correct no matter how getString() is changed; after all, you're just having equals(...) copy the reference from one object to an equivalent one, so presumably that should always stay the same? The problem is that complex systems evolve in ways that you can't always predict in advance. When a requirement changes, you don't want to have to make random changes in parts of the code that aren't obviously related to the requirement.)
Thread-safety. Your string field currently isn't volatile, and your getString() method currently isn't synchronized, so there's no attempt at thread-safety here anyway; but if you were to make the rest of the class thread-safe, it would not be perfectly straightforward to change equals(...) to be thread-safe without risking deadlocks. (This overlaps a bit with point #1, but I'm listing it separately because #1 is solely about the difficulty of knowing that you have to change equals(...), whereas this issue is a bit tricky to address even given that knowledge.)
Unlikelihood of usefulness. There's not much reason to expect it to happen very often that two instances get equals(...)-compared when one has already been lazy-initialized and the other has not; so the extra code complexity, and downsides mentioned above, are not likely to be worth it. (Remember: code is not free. In order to pass cost–benefit analysis, the benefits of a piece of code must exceed the costs of testing, understanding, maintaining, and supporting it in the future.) If it's worthwhile to share these lazy-initialized values between equivalent instances, then that should be done in a clearer and more-organized fashion that does not rely on happenstance. (For example, you might make the class's constructor private, and have a static factory-method that checks a static WeakHashMap for an existing instance before creating and returning a new one.)
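The static-factory idea from the last point could be sketched roughly like this. CanonicalValue is a made-up name, and expensiveCalculation is only a stand-in for the real computation from the question.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

final class CanonicalValue {
    // Each instance is its own key; the value is a weak reference so the map never pins it.
    private static final Map<CanonicalValue, WeakReference<CanonicalValue>> INSTANCES =
            new WeakHashMap<>();

    private final int number;
    private String string; // lazily initialized; shared because equal instances are shared

    private CanonicalValue(int number) {
        this.number = number;
    }

    // Returns the existing instance for this number if one is still reachable.
    static synchronized CanonicalValue of(int number) {
        CanonicalValue candidate = new CanonicalValue(number);
        WeakReference<CanonicalValue> ref = INSTANCES.get(candidate);
        CanonicalValue existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        INSTANCES.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }

    synchronized String getString() {
        if (string == null) {
            string = expensiveCalculation(number);
        }
        return string;
    }

    // Stand-in for the real computation (an assumption for this sketch).
    private static String expensiveCalculation(int number) {
        return "value-" + number;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CanonicalValue && ((CanonicalValue) o).number == number;
    }

    @Override
    public int hashCode() {
        return number;
    }
}
```

With this in place equals(...) stays side-effect free, and the sharing happens in one obvious spot: whoever asks for an equal value gets the same instance, lazily computed field included.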
The approach you describe is sometimes a good one, especially in situations where it is likely that many large immutable objects, despite being independently constructed, will end up being identical. Because it is much faster to compare equal references than to compare large objects which happen to be equal, it may be advantageous to have code which compares two large-objects and finds them to be identical replace one of the references with a reference to the other. For this to be workable, one should attempt to establish some sort of ordering among the objects in question to ensure that repeated comparisons will eventually yield the same canonical value. This could be accomplished by having objects include a long sequence number and consistently replacing references to newer values with references to older-but-equal values, or by comparing the identityHashCode value of the equal references and discarding whichever one, if any, has the lower value (if two references which identify distinct but identical instances, happen to report the same identityHashCode, both should be kept).
A nasty and unfortunate wrinkle in this is that Java has very poor multi-threading support for effectively-immutable objects. For an effectively-immutable object to be thread-safe, any access to an array or non-final field must go through a final field. The cheapest way of accomplishing that is probably to have the object contain a final field into which it stores a reference to itself, and have all methods which access non-final fields do so through that final field, but that's a bit ugly. Still, replacing distinct-but-identical references with references to the same object could offer some significant performance advantages despite the silly redundant final field accesses (since the target of the final field would be guaranteed to be in-cache, dereferencing it would be much cheaper than a normal dereference).
BTW, it would in many cases be possible to include an "equivalence-relation" mechanism such that once some objects were compared and found to be equal, discovering that any of them is equal to another object would cause all of them to be quickly recognizable as such. I haven't figured out how to avoid the possibility of a deliberately-nasty-but-legitimate usage pattern causing a memory leak, however.
Java programmers and APIs seem to favor explicit set/get methods.
However, I got the impression that the C++ community frowns upon such practice.
If so, is there a particular reason (besides more lines of code) why?
On the other hand, why does the Java community choose to use methods rather than direct access?
Thank you
A well-designed class should ideally not have too many gets and sets. In my opinion, too many gets and sets basically indicate that someone else (and potentially many others) needs my data to achieve their purpose. In that case, why does that data belong to me in the first place? This violates the basic principle of encapsulation (data + operations in one logical unit).
So, while there is no technical restriction on 'set' and 'get' methods (and, in fact, there is an abundance of them), I would say that you should pause and re-inspect your design if your class interface exposes too many of them to too many other entities in your system.
There are occasions when getters/setters are appropriate, but an abundance of getters/setters typically indicates that your design fails to achieve any higher level of abstraction.
Typically it's better (with regard to encapsulation) to expose higher-level operations on your objects that do not make the implementation obvious to the user.
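As a rough illustration of "higher-level operations", here is a sketch in Java. The `Account` class and its methods are hypothetical, chosen only to show callers asking the object to do something instead of reading a field, computing, and writing it back.

```java
// Hypothetical example: no getBalance()/setBalance() pair is exposed.
final class Account {
    private long balanceCents; // representation stays hidden from callers

    // Adds money; rejects nonsensical amounts.
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    // Removes money if possible; returns false when funds are insufficient.
    boolean withdraw(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("withdrawal must be positive");
        }
        if (cents > balanceCents) {
            return false;
        }
        balanceCents -= cents;
        return true;
    }

    // Answers the question callers actually have, without exposing the field.
    boolean canAfford(long cents) {
        return cents <= balanceCents;
    }
}
```

With a plain `setBalance`, every caller would have to repeat the overdraft check; here the rule lives in one place.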
Some other possible reasons why it's not as common in C++ as in Java:
The Standard Library does not use it.
Bjarne Stroustrup expresses his dislike towards it (last paragraph):
I particularly dislike classes with a lot of get and set functions. That is often an indication that it shouldn't have been a class in the first place. It's just a data structure. And if it really is a data structure, make it a data structure.
The usual argument against get/set methods is that if you have both and they're just trivial return x; and x = y; then you haven't actually encapsulated anything at all; you may as well just make the member public which saves a whole lot of boilerplate code.
Obviously there are cases where they still make sense; if you need to do something special in them, or you need to use inheritance or, particularly, interfaces.
There is the advantage that if you implement getters/setters you can change their implementation later without having to alter code that uses them. I suppose the frowning on it you refer to is kind of a YAGNI thing that if there's no expectation of ever altering the functions that way, then there's little benefit to having them. In many cases you can just deal with the case of altering the implementation later anyway.
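A small sketch of that advantage, using a hypothetical `Thermometer` class (not from the original discussion): the stored unit changes from Celsius to kelvin, yet code calling the accessors does not need to change.

```java
// Hypothetical example of swapping the representation behind accessors.
final class Thermometer {
    // An earlier version might have stored "private double celsius;".
    // The field now holds kelvin, but callers never see the difference.
    private double kelvin = 273.15;

    double getCelsius() {
        return kelvin - 273.15;
    }

    void setCelsius(double celsius) {
        // A check like this could also be added later without touching callers.
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero");
        }
        kelvin = celsius + 273.15;
    }
}
```

Had the field been public, every use of `thermometer.celsius` would have had to be rewritten.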
I wasn't aware that the C++ community frowned on them any more or less than the Java community; my impression is that they're rather less common in languages like Python, for example.
I think the reason the C++ community frowns on getters and setters is that C++ offers far better alternatives. For example:
template <class T>
class DefaultPredicate
{
public:
    static bool CheckSetter (T value)
    {
        return true;
    }
    static void CheckGetter (T value)
    {
    }
};
template <class T, class Predicate = DefaultPredicate <T>>
class Property
{
public:
    operator T ()
    {
        Predicate::CheckGetter (m_storage);
        return m_storage;
    }
    Property <T, Predicate> &operator = (T rhs)
    {
        if (Predicate::CheckSetter (rhs))
        {
            m_storage = rhs;
        }
        return *this;
    }
private:
    T m_storage;
};
which can then be used like this:
class Test
{
public:
    Property <int> TestData;
    Property <int> MoreTestData;
};
int main ()
{
    Test test;
    test.TestData = 42;
    test.MoreTestData = 24;
    int value = test.TestData;
    bool check = test.TestData == test.MoreTestData;
}
Notice that I added a predicate parameter to the property class. With this, we can get creative. For example, here is a property to hold an integer colour channel value:
#include <iostream>
class NoErrorHandler
{
public:
    static void SignalError (const char *const error)
    {
    }
};
class LogError
{
public:
    static void SignalError (const char *const error)
    {
        std::cout << error << std::endl;
    }
};
class Exception
{
public:
    Exception (const char *const message) :
        m_message (message)
    {
    }
    operator const char * () const
    {
        return m_message;
    }
private:
    const char *const m_message;
};
class ThrowError
{
public:
    static void SignalError (const char *const error)
    {
        // throw by value so that nothing is leaked
        throw Exception (error);
    }
};
template <class ErrorHandler = NoErrorHandler>
class RGBValuePredicate : public DefaultPredicate <int>
{
public:
    static bool CheckSetter (int rhs)
    {
        bool setter_ok = true;
        if (rhs < 0 || rhs > 255)
        {
            ErrorHandler::SignalError ("RGB value out of range.");
            setter_ok = false;
        }
        return setter_ok;
    }
};
and it can be used like this:
class Test
{
public:
    Property <int, RGBValuePredicate <> > RGBValue1;
    Property <int, RGBValuePredicate <LogError> > RGBValue2;
    Property <int, RGBValuePredicate <ThrowError> > RGBValue3;
};
int main ()
{
    Test test;
    try
    {
        test.RGBValue1 = 4;
        test.RGBValue2 = 5;
        test.RGBValue3 = 6;
        test.RGBValue1 = 400;   // silently ignored
        test.RGBValue2 = 500;   // logged
        test.RGBValue3 = -6;    // throws
    }
    catch (const Exception &error)
    {
        std::cout << "Exception: " << error << std::endl;
    }
}
Notice that I made the handling of bad values a template parameter as well.
Using this as a starting point, it can be extended in many different ways.
For example, allow the storage of the property to be different to the public type of the value - so the RGBValue above could use an unsigned char for storage but an int interface.
Another example is to change the predicate so that it can alter the setter value. In the RGBValue above this could be used to clamp values to the range 0 to 255 rather than generate an error.
Properties as a general language concept technically predate C++, e.g. in Smalltalk, but they were never part of the standard. Getters and setters were a concept used in C++ when it was used for development of UIs, but truth be told, it's an expensive proposition to develop UIs in what is effectively a systems language. The general problem with getters and setters in C++ was that, since they weren't a standard, everybody had a different standard.
And in systems languages, where efficiency concerns are high, it's just easier to make the variable itself public, although there's a lot of literature that frowns mightily on that practice. Often, you simply see richer exchanges of information between C++ object instances than simple items.
You'll probably get a lot of viewpoints in response to this question, but in general, C++ was meant to be C that did objects, making OOP accessible to developers who didn't know objects. It was hard enough to get virtuals and templates into the language, and I think that it's been kind of stagnant for a while.
Java differs because in the beginning, with what Java brought in areas like garbage collection, it was easier to promote the philosophy of robust encapsulation, i.e. external entities should keep their grubby little paws off of internal elements of a class.
I admit this is pretty much opinion. At this time I use C++ for highly optimized stuff like 3D graphics pipelines. I already have to manage all my object memory, so I'd take a dim view of fundamentally useless code that just serves to wrap storage access up in additional functions. That said, the basic performance capabilities of runtimes like the MSFT .net ILM make that a position that can be difficult to defend at times.
Purely my 2c
There's nothing unusual about having explicit set/get methods in C++. I've seen it in plenty of C++ code; it can be very useful to not allow direct access to data members.
Check out this question for an explanation of why Java tends to prefer them; the reasons for C++ are the same. In short: it allows you to change the way data members are accessed without forcing client code (code that uses your code) to recompile. It also allows you to enforce a specific policy for how to access data and what to do when that data is accessed.
By mandating the use of set/get methods, one can implement useful side-effects in the getter/setter (for example, when the argument to get/set is an object).
I am surprised nobody has mentioned Java introspection and beans yet.
Using get.../set... naming convention combined with introspection allows all sorts of clever trickery with utility classes.
I personally feel that the "public" keyword should have been enough to trigger the bean magic, but I am not James Gosling.
My take on this is that in C++ it is a rather pointless exercise. You are adding at least six lines of code to test and maintain which serve no purpose and will for the most part be ignored by the compiler. It doesn't really protect your class from misuse and abuse unless you add a lot more coding.
I don't think the C++ community frowned on using getters and setters. They are almost always a good idea.
It has to do with the basics of object oriented programming - hiding the internals of an object from its users. The users of an object should not need to know (nor should they care) about the internals of an object.
It also gives you control over what is done whenever a user of your object tries to read/write to it. In effect, you expose an interface to the object's users. They have to use that interface and you control what happens when methods in that interface are called - the getters and setters would be part of the interface.
It just makes things easier when debugging. A typical scenario is when your object ends up in a weird state and you're debugging to find out how it got there. All you do is set breakpoints in your getters and setters and, assuming all else is fine, you're able to see how your object gets to the weird state. If your object's users are all directly accessing its members, figuring out when your object's state changes becomes a lot harder (though not impossible).
I would argue that C++ needs getters/setters more than Java.
In Java, if you start with naked field access and later change your mind and want getters/setters instead, it is extremely easy to find all the usages of the field and refactor them into getter/setter calls.
In C++, this is not that easy. The language is too complex; IDEs simply can't reliably do that.
So in C++, you had better get it right the first time. In Java, you can be more adventurous.
There were gets/sets long before Java. There are many reasons to use them, especially if you have to recalculate something when a value changes. So the first big advantage is that you can watch for value changes. But IMHO it's bad to ALWAYS implement both get and set; often a get is enough. Another point is that class changes will directly affect your customers. With public members, you can't change member names without forcing the client's code to be refactored. Let's say you have an object with a length and you change this member's name... uh. With a getter, you just change your side of the code and the client can sleep well. Adding gets/sets for members that should be hidden is of course nonsense.
Since arguments sent to a method in Java point to the original data structures in the caller method, did its designers intend for them to be used for returning multiple values, as is the norm in other languages like C?
Or is this a hazardous misuse of Java's general property that variables are pointers?
A long time ago I had a conversation with Ken Arnold (a one-time member of the Java team); this would probably have been at the first JavaOne conference, so 1996. He said that they were thinking of adding multiple return values so you could write something like:
x, y = foo();
The recommended way of doing it back then, and now, is to make a class that has multiple data members and return that instead.
Based on that, and other comments made by people who worked on Java, I would say the intent is/was that you return an instance of a class rather than modify the arguments that were passed in.
This is common practice (as is the desire of C programmers to modify the arguments... eventually they usually see the Java way of doing it). Just think of it as returning a struct. :-)
(Edit based on the following comment)
I am reading a file and generating two arrays, of type String and int from it, picking one element for both from each line. I want to return both of them to any function which calls it which a file to split this way.
I think, if I am understanding you correctly, that I would probably do something like this:
// could go with the Pair idea from another post, but I personally don't like that way
class Line
{
    // would use appropriate names
    private final int intVal;
    private final String stringVal;
    public Line(final int iVal, final String sVal)
    {
        intVal = iVal;
        stringVal = sVal;
    }
    public int getIntVal()
    {
        return (intVal);
    }
    public String getStringVal()
    {
        return (stringVal);
    }
    // equals/hashCode/etc... as appropriate
}
and then have your method like this:
public void foo(final File file, final List<Line> lines)
{
    // add to the List.
}
and then call it like this:
{
    final List<Line> lines;
    lines = new ArrayList<Line>();
    foo(file, lines);
}
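For comparison, here is a minimal sketch of the return-a-value style recommended earlier in this answer, in which the method builds and returns the list rather than filling one passed in. The names `ParsedLine` and `LineParser` and the "text,number" line format are assumptions for illustration; the question does not say what the file looks like. The method takes already-read lines (for example from `java.nio.file.Files.readAllLines`) so that the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the two values taken from each line.
final class ParsedLine {
    final String text;
    final int number;

    ParsedLine(String text, int number) {
        this.text = text;
        this.number = number;
    }
}

final class LineParser {
    // Assumed format: "<text>,<integer>" on every line.
    static List<ParsedLine> parse(List<String> rawLines) {
        List<ParsedLine> result = new ArrayList<ParsedLine>();
        for (String raw : rawLines) {
            int comma = raw.lastIndexOf(',');
            if (comma < 0) {
                throw new IllegalArgumentException("missing comma: " + raw);
            }
            String text = raw.substring(0, comma);
            int number = Integer.parseInt(raw.substring(comma + 1).trim());
            result.add(new ParsedLine(text, number));
        }
        return result; // the caller receives one list instead of two parallel arrays
    }
}
```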
In my opinion, if we're talking about a public method, you should create a separate class representing a return value. When you have a separate class:
it serves as an abstraction (e.g. a Point class instead of an array of two longs)
each field has a name
it can be made immutable
it makes evolution of the API much easier (e.g. returning 3 values instead of 2, changing the type of some field, etc.)
I would always opt for returning a new instance, instead of actually modifying a value passed in. It seems much clearer to me and favors immutability.
On the other hand, if it is an internal method, I guess any of the following might be used:
an array (new Object[] { "str", longValue })
a list (Arrays.asList(...) returns a fixed-size list backed by the array; elements can still be replaced with set(), so it is not truly immutable)
pair/tuple class, such as this
static inner class, with public fields
Still, I would prefer the last option, equipped with a suitable constructor. That is especially true if you find yourself returning the same tuple from more than one place.
I do wish there was a Pair<E,F> class in the JDK, mostly for this reason. There is Map.Entry<K,V>, but creating an instance was always a big pain.
Now I use com.google.common.collect.Maps.immutableEntry when I need a Pair
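Since Java 6 the JDK itself ships a concrete `Map.Entry` implementation, `java.util.AbstractMap.SimpleImmutableEntry`, which can stand in for a pair without a third-party library. A minimal sketch (the `MinMax` class is hypothetical):

```java
import java.util.AbstractMap;
import java.util.Map;

final class MinMax {
    // Returns (smallest, largest) as an immutable key/value pair.
    // Assumes 'values' is non-empty.
    static Map.Entry<Integer, Integer> of(int[] values) {
        int min = values[0];
        int max = values[0];
        for (int v : values) {
            if (v < min) {
                min = v;
            }
            if (v > max) {
                max = v;
            }
        }
        return new AbstractMap.SimpleImmutableEntry<Integer, Integer>(min, max);
    }
}
```

The drawback, as with any generic pair, is that `getKey()` and `getValue()` say nothing about what the two values mean.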
See this RFE launched back in 1999:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4222792
I don't think the intention was to ever allow it in the Java language, if you need to return multiple values you need to encapsulate them in an object.
Using languages like Scala however you can return tuples, see:
http://www.artima.com/scalazine/articles/steps.html
You can also use Generics in Java to return a pair of objects, but that's about it AFAIK.
EDIT: Tuples
Just to add some more on this. I've previously implemented a Pair in projects because of the lack within the JDK. Link to my implementation is here:
http://pbin.oogly.co.uk/listings/viewlistingdetail/5003504425055b47d857490ff73ab9
Note, there isn't a hashcode or equals on this, which should probably be added.
I also came across this whilst doing some research into this question; it provides tuple functionality:
http://javatuple.com/
It allows you to create a Pair as well as other types of tuples.
You cannot truly return multiple values, but you can pass objects into a method and have the method mutate those values. That is perfectly legal. Note that you cannot pass an object in and have the object itself become a different object. That is:
private void myFunc(Object a) {
    a = new Object();
}
will temporarily and locally change the value of a, but this will not change the caller's variable. For example, given:
Object test = new Object();
myFunc(test);
After myFunc returns, you will have the old Object and not the new one.
Legal (and often discouraged) is something like this:
private void changeDate(final Date date) {
    date.setTime(1234567890L);
}
I picked Date for a reason. This is a class that people widely agree should never have been mutable. The method above will change the internal value of any Date object that you pass to it. This kind of code is legal when it is very clear that the method will mutate or configure or modify what is being passed in.
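The two cases can be put side by side in a short sketch. The class and method names (`PassingDemo`, `reassign`, `mutate`) are made up for illustration.

```java
import java.util.Date;

final class PassingDemo {
    // Reassigning the parameter only changes this method's local copy
    // of the reference; the caller's variable is untouched.
    static void reassign(Date d) {
        d = new Date(0L);
    }

    // Mutating through the reference changes the one object that both
    // the caller and this method can see.
    static void mutate(Date d) {
        d.setTime(0L);
    }
}
```

After `reassign(date)` the caller's `date` still reports its original time; after `mutate(date)` it reports 0.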
NOTE: Generally, it's said that a method should do one of these things:
Return void and mutate its incoming objects (like Collections.sort()), or
Return some computation and don't mutate incoming objects at all (like Collections.min()), or
Return a "view" of the incoming object but do not modify the incoming object (like Collections.checkedList() or Collections.singleton())
Mutate one incoming object and return it (Collections doesn't have an example, but StringBuilder.append() is a good example).
Methods that mutate incoming objects and return a separate return value are often doing too many things.
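The four styles can be seen together in a short sketch. The `MethodStylesDemo` class is hypothetical, and for the "view" case it uses `Collections.unmodifiableList` rather than the methods named above, simply because the view behaviour is easy to observe with it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class MethodStylesDemo {
    static String run() {
        List<Integer> numbers = new ArrayList<Integer>(Arrays.asList(3, 1, 2));

        // 1. Returns void and mutates its argument.
        Collections.sort(numbers);
        String sorted = numbers.toString();

        // 2. Returns a computed value and leaves its argument alone.
        int smallest = Collections.min(numbers);

        // 3. Returns a view: later changes to 'numbers' show through it.
        List<Integer> readOnly = Collections.unmodifiableList(numbers);
        numbers.add(4);

        // 4. Mutates an object and returns that same object.
        StringBuilder sb = new StringBuilder("a");
        boolean sameObject = (sb.append("b") == sb);

        return sorted + " " + smallest + " " + readOnly.size() + " " + sameObject;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "[1, 2, 3] 1 4 true"
    }
}
```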
There are certainly methods that modify an object passed in as a parameter (see java.io.Reader.read(char[] cbuf) as an example), but I have not seen parameters used as an alternative to a return value, especially with multiple parameters. It may technically work, but it is nonstandard.
It's not generally considered terribly good practice, but there are very occasional cases in the JDK where this is done. Look at the 'biasRet' parameter of View.getNextVisualPositionFrom() and related methods, for example: it's actually a one-dimensional array that gets filled with an "extra return value".
So why do this? Well, just to save you having to create an extra class definition for the "occasional extra return value". It's messy, inelegant, bad design, non-object-oriented, blah blah. And we've all done it from time to time...
Generally what Eddie said, but I'd add one more:
Mutate one of the incoming objects, and return a status code. This should generally only be used for arguments that are explicitly buffers, like Reader.read(char[] cbuf).
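A minimal sketch of that buffer idiom, using `Reader.read(char[])`, which fills the array and returns the number of characters read (or -1 at end of stream). The `BufferReadDemo` class and its `drain` method are hypothetical names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

final class BufferReadDemo {
    // Reads everything from 'in' with the "fill my buffer, return a count" idiom.
    static String drain(Reader in) {
        char[] buffer = new char[4]; // deliberately tiny to force several reads
        StringBuilder out = new StringBuilder();
        try {
            int count;
            // The return value is a status: characters read, or -1 at the end.
            while ((count = in.read(buffer)) != -1) {
                out.append(buffer, 0, count); // only the first 'count' chars are valid
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Here the mutated argument is unmistakably a buffer, and the caller must consult the returned count to know how much of it is meaningful.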
I had a Result object that cascades through a series of validating void methods as a method parameter. Each of these validating void methods would mutate the result parameter object to add the result of the validation.
But this is impossible to test because now I cannot stub the void method to return a stub value for the validation in the Result object.
So, from a testing perspective, it appears that one should favor returning an object instead of mutating a method parameter.
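A sketch of what that could look like: each validator returns its own findings and one method combines them, so each piece can be tested or stubbed on its own. All names here (`Validators`, `checkName`, `checkAge`, `validate`) and the rules themselves are hypothetical, not taken from the code described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Validators {
    // Returns the problems found, instead of writing into a shared Result.
    static List<String> checkName(String name) {
        if (name == null || name.isEmpty()) {
            return Collections.singletonList("name is required");
        }
        return Collections.emptyList();
    }

    static List<String> checkAge(int age) {
        if (age < 0 || age > 150) {
            return Collections.singletonList("age out of range");
        }
        return Collections.emptyList();
    }

    // Combines the individual results; no validator mutates its inputs.
    static List<String> validate(String name, int age) {
        List<String> errors = new ArrayList<String>();
        errors.addAll(checkName(name));
        errors.addAll(checkAge(age));
        return errors;
    }
}
```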