Unit testing code that relies on constant values - java

Consider the following (totally contrived) example:
public class Length {
private static final int MAX_LENGTH = 10;
private final int length;
public Length(int length) {
if (length > MAX_LENGTH)
throw new IllegalArgumentException("Length too long");
this.length = length;
}
}
I would like to test that this throws an exception when called with a length greater than MAX_LENGTH. There are a number of ways this can be tested, all with disadvantages:
@Test(expected = IllegalArgumentException.class)
public void testMaxLength() {
new Length(11);
}
This replicates the constant in the testing case. If MAX_LENGTH becomes smaller this will silently no longer be an edge case (though clearly it should be paired with a separate case to test the other side of the edge). If it becomes larger this will fail and need to be changed manually (which might not be a bad thing).
These disadvantages can be avoided by adding a getter for MAX_LENGTH and then changing the test to:
new Length(Length.getMaxLength());
This seems much better as the test does not need to be changed if the constant changes. On the other hand it is exposing a constant that would otherwise be private and it has the significant flaw of testing two methods at once - the test might give a false positive if both methods are broken.
An alternative approach is to not use a constant at all but, rather, inject the dependency:
interface MaxLength {
int getMaxLength();
}
public class Length {
public static void setMaxLength(MaxLength maxLength);
}
Then the 'constant' can be mocked as part of the test (example here using Mockito):
MaxLength mockedLength = mock(MaxLength.class);
when(mockedLength.getMaxLength()).thenReturn(17);
Length.setMaxLength(mockedLength);
new Length(18);
This seems to be adding a lot of complexity for not a lot of value (assuming there's no other reason to inject the dependency).
At this stage my preference is to use the second approach of exposing the constants rather than hardcoding the values in the test. But this does not seem ideal to me. Is there a better alternative? Or is the lack of testability of these cases demonstrating a design flaw?

As Tim alluded to in the comments, your goal is to make sure that your software behaves according to the specifications. One such specification might be that the maximum length is always 10, at which point it'd be unnecessary to test a world where length is 5 or 15.
Here's the question to ask yourself: How likely is it that you'll want to use your class with a different value of the "constant"? I've quoted "constant" here because if you vary the value programmatically, it's not really a constant at all, is it? :)
If your value will never ever change, you could not use a symbolic constant at all, just comparing to 10 directly and testing based on (say) 0, 3, 10, and 11. This might make your code and tests a little hard to understand ("Where did the 10 come from? Where did the 11 come from?"), and will certainly make it hard to change if you ever do have reason to vary the number. Not recommended.
If your value will probably never change, you could use a private named constant (i.e. a static final field), as you have. Then your code will be easy enough to change, though your tests won't be able to automatically adjust the way your code would.
You could also relax to package-private visibility, which would be available to tests in the same package. Javadoc (e.g. /** Package-private for testing. */) or documentation annotations (e.g. @VisibleForTesting) may help make your intentions clear. This is a nice option if your constant value is intended to be opaque and unavailable outside of your class, like a URL template or authentication token.
You could even make it a public constant, which would be available to consumers of your class as well. For your example of a constant Length, a public static final field is probably best, on the assumption that other pieces of your system may want to know about that (e.g. for UI validation hints or error messages).
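A minimal sketch of the relaxed-visibility option, assuming the test code lives in the same package as the class. BoundedLength and BoundedLengthCheck are hypothetical names standing in for the question's Length and its test; a real project would express the checks as JUnit tests rather than a main method.

```java
// Hypothetical stand-in for the question's Length class.
final class BoundedLength {
    /** Package-private for testing. */
    static final int MAX_LENGTH = 10;

    private final int length;

    BoundedLength(int length) {
        if (length > MAX_LENGTH) {
            throw new IllegalArgumentException("Length too long: " + length);
        }
        this.length = length;
    }

    int value() {
        return length;
    }
}

// Hypothetical test helper; it derives both edge cases from the constant.
final class BoundedLengthCheck {
    /** Returns true if constructing a BoundedLength with this value is rejected. */
    static boolean rejects(int length) {
        try {
            new BoundedLength(length);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Both sides of the edge follow the constant if it ever changes.
        System.out.println(rejects(BoundedLength.MAX_LENGTH));     // prints false
        System.out.println(rejects(BoundedLength.MAX_LENGTH + 1)); // prints true
    }
}
```

Because the checks read the limit from the class, they keep probing the real edge when the value changes, at the cost of not noticing an accidental change to the value itself.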
If your value is likely to change you could accept it per-instance, as in new Length(10) or new Length().setMaxLength(10). (I consider the former to be a form of dependency injection, counting the constant integer as a dependency.) This is also a good idea if you wanted to use a different value in tests, such as using a maximum length of 2048 in production but testing against 10 for practicality's sake. To make a flexible length validator, this option is probably a good upgrade from a static final field.
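The per-instance option might look roughly like the sketch below. LengthValidator and its validate method are hypothetical names, not part of the question's code, and the 2048 and 10 limits are the illustrative figures from the paragraph above.

```java
// Hypothetical validator that receives its limit per instance.
final class LengthValidator {
    private final int maxLength;

    // The limit is supplied by the caller instead of being a class-level constant.
    LengthValidator(int maxLength) {
        this.maxLength = maxLength;
    }

    /** Returns the length unchanged, or throws if it exceeds this validator's limit. */
    int validate(int length) {
        if (length > maxLength) {
            throw new IllegalArgumentException(
                    "Length " + length + " exceeds maximum " + maxLength);
        }
        return length;
    }
}

final class LengthValidatorDemo {
    public static void main(String[] args) {
        LengthValidator production = new LengthValidator(2048);
        LengthValidator underTest = new LengthValidator(10);

        System.out.println(production.validate(11)); // prints 11
        try {
            underTest.validate(11);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The test picks whatever limit is convenient, so the edge cases are stated in the test itself rather than copied from production code.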
Only if your value is likely to change during your instance's lifetime would I bother with a DI-style value provider. At that point, you can query the value interactively, so it doesn't behave at all like a constant. For "length", that'd be obvious overkill, but maybe not for "maximum allowed memory", "maximum simultaneous connections", or some other pseudo-constants like that.
In short, you'll have to decide how much control you need, and then you can pick the most straightforward choice from there; as a "default", you may want to make it a visible field or constructor parameter, as those tend to have a good balance of simplicity and flexibility.

Related

Check list size with a magic number (1) or global constant?

I'm in an argument with a co-worker about the following code:
private static final byte ONE_ELEMENT = 1;
private boolean isListSizeEqualsOne(List<MyClass> myList) {
return myList.size() == ONE_ELEMENT;
}
I'm arguing that this kind of code admittedly reduces a warning about a magic number but unnecessarily increases clutter at the same time. I'm suggesting to inline the global variable instead:
private boolean isListSizeEqualsOne(List<MyClass> myList) {
return myList.size() == 1;
}
Is there any literature for / against this example?
I think the problem with the code is already in the method itself. Just like comments, a method name should not indicate what the code does, but why. In other words, it should indicate the functionality it provides, not its implementation.
That is, it should express the role that this method plays in the system. So instead of the name isListSizeEqualsOne, use a name that indicates the "why". For example resultIsUnique, or errorReturned (if you use an API where a list with a single element indicates an error).
Then the naming of the constant follows:
resultIsUnique: constant UNIQUE_RESULT_COUNT=1
errorReturned: constant ERROR_RESULT_COUNT=1
Finally, I don't think it is a good idea to enable warnings for inline constants. Using named constants for numbers only makes sense if either
the value must be the same everywhere (e.g. magic number for a file format), or
the value needs a name to be obvious, such as mathematical constants
If you need constants whose meaning is obvious (such as checking for an empty list by comparing the size to zero), then I think a plain inline value is perfectly ok.
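As a rough illustration of the naming advice, here is a sketch in which the method name carries the "why" and the obvious literal stays inline. SearchResults and both method names are hypothetical, and the "exactly one match" rule is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration only.
final class SearchResults {
    // Assumed rule: a lookup is unambiguous when exactly one match came back.
    static boolean resultIsUnique(List<?> results) {
        return results.size() == 1;
    }

    // Comparing against "empty" is obvious enough that no named constant is needed.
    static boolean hasNoResults(List<?> results) {
        return results.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(resultIsUnique(Arrays.asList("only match"))); // prints true
        System.out.println(resultIsUnique(Arrays.asList("a", "b")));     // prints false
        System.out.println(hasNoResults(Collections.emptyList()));       // prints true
    }
}
```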

Which approach shows better performance: encapsulating into a method or not?

While writing code, I sometimes run into a situation where I need to choose between creating a separate method (the advantage being that I can use my own syntax later) and calling the existing, more complex method directly each time (which also means fewer lines of code).
Here are the examples using different programming languages (Objective-C and Java) to explain the question.
Objective-C example:
-(double) maxValueFinder: (NSMutableArray *)data {
double maxValue = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
return maxValue;
}
then later:
...
double max = [self maxValueFinder:data];
...
or just every time try to call:
...
double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
...
Java example:
public static double maxFinder (ArrayList<Double> data) {
double maxValue = Collections.max(data);
return maxValue;
}
then later:
...
double max = maxFinder(data);
...
or just every time try to call:
...
double max = Collections.max(data);
...
or more complex case to make the point of my question sharper:
//using jsoup
public static Element getElement(Document content){
Element link = content.getElementsByTag("a").first();
return link;
}
or every time:
...
Element link = content.getElementsByTag("a").first();
...
Which approach costs fewer resources (performance, memory), or are they the same?
It absolutely doesn't matter. At least in your Java case you're uselessly recreating existing functionality, which is ridiculous.
You should first see if the functionality is contained in the standard library, then see if existing well known libraries have it, and only after that should you consider writing implementations yourself (especially for more complex functionality).
Performance has nothing to do with your question, except in the sense that the more time you spend on recreating existing functionality, the less time you have left for actual new code (therefore lowering your programming performance).
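To make the comparison concrete, the sketch below puts the question's wrapper next to the direct library call. MaxDemo is a hypothetical class name; the sample values are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class MaxDemo {
    // Thin wrapper from the question: it adds a name but no behaviour
    // beyond the library call it delegates to.
    static double maxFinder(List<Double> data) {
        return Collections.max(data);
    }

    public static void main(String[] args) {
        List<Double> data = Arrays.asList(3.5, 9.25, 1.0);

        double direct = Collections.max(data);
        double wrapped = maxFinder(data);

        // Both paths end up running the same library code on the same list.
        System.out.println(direct == wrapped); // prints true
    }
}
```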
As for creating wrapper methods, that can be useful in some cases, especially if the actual method calls are often chained and you find yourself having more and more of those in the code. But there's a delicate difference between code clarity and writing excessive code.
public void parseHtml() {
parseFirstPart();
parseSecondPart();
parseThirdPart();
}
If we assume that each parse method only contains 1 or maybe 2 method calls then adding these additional methods is most likely useless, since the same thing can be achieved by proper commenting. If the parse methods contain a lot of calls, it makes sense to extract methods out of them. There's no rule about it; it's a skill you learn while you program (and of course it depends a lot on what you view as beautiful code).
Recreating existing functionality is absolutely useless, because this function is already implemented in the library.
As for performance, in both cases you end up executing the same line:
double maxValue = Collections.max(data);
So performance does not matter here: both approaches run the same code.

Should I strictly avoid using enums on Android?

I used to define a set of related constants like Bundle keys together in an interface like below:
public interface From{
String LOGIN_SCREEN = "LoginSCreen";
String NOTIFICATION = "Notification";
String WIDGET = "widget";
}
This gives me a nicer way to group related constants together, and I use them via a static import (not implements). I know the Android framework also uses constants in the same way, like Toast.LENGTH_LONG and View.GONE.
However, I often feel that Java enums provide a much better and more powerful way to represent constants.
But is there a performance issue in using enums on Android?
With a bit of research I ended up confused. From this question
"Avoid Enums Where You Only Need Ints" removed from Android's performance tips? it's clear that Google has removed "Avoid enums" from its performance tips, but the "Be aware of memory overhead" section of its official training docs clearly says: "Enums often require more than twice as much memory as static constants. You should strictly avoid using enums on Android." Does this still hold? (say in Java versions after 1.6)
One more issue I observed is that to send enums across intents using a Bundle I have to serialize them (i.e. putSerializable()), which I think is an expensive operation compared to the primitive putString() method, even though enums provide it for free.
Can someone please clarify which one is the best way to represent the same in Android? Should I strictly avoid using enums on Android?
Use enum when you need its features. Don't avoid it strictly.
Java enums are more powerful, but if you don't need their features, use constants: they occupy less space and they can be primitives themselves.
When to use enum:
type checking - you can accept only listed values, and they are not continuous (see below what I call continuous here)
method overriding - every enum constant has its own implementation of a method
public enum UnitConverter{
METERS{
@Override
public double toMiles(final double meters){
return meters * 0.00062137D;
}
@Override
public double toMeters(final double meters){
return meters;
}
},
MILES{
@Override
public double toMiles(final double miles){
return miles;
}
@Override
public double toMeters(final double miles){
return miles / 0.00062137D;
}
};
public abstract double toMiles(double unit);
public abstract double toMeters(double unit);
}
more data - your one constant contains more than one piece of information that cannot be put in one variable
complicated data - your constant needs methods to operate on the data
When not to use enum:
you can accept all values of one type, and your constants contain only these most used
you can accept continuous data
public class Month{
public static final int JANUARY = 1;
public static final int FEBRUARY = 2;
public static final int MARCH = 3;
...
public static String getName(final int month){
if(month <= 0 || month > 12){
throw new IllegalArgumentException("Invalid month number: " + month);
}
...
}
}
for names (like in your example)
for everything else that really doesn't need an enum
Enums occupy more space
a single reference to an enum constant occupies 4 bytes
every enum constant occupies space that is a sum of its fields' sizes aligned to 8 bytes + overhead of the object
the enum class itself occupies some space
Constants occupy less space
a constant doesn't have a reference so it's a pure data (even if it's a reference, then enum instance would be a reference to another reference)
constants may be added to an existing class - it's not necessary to add another class
constants may be inlined; it brings extended compile-time features (such as null checking, finding dead code etc.)
If the enums simply have values, you should try to use IntDef/StringDef , as shown here:
https://developer.android.com/studio/write/annotations.html#enum-annotations
Example: instead of :
enum NavigationMode {NAVIGATION_MODE_STANDARD, NAVIGATION_MODE_LIST, NAVIGATION_MODE_TABS}
you use:
@IntDef({NAVIGATION_MODE_STANDARD, NAVIGATION_MODE_LIST, NAVIGATION_MODE_TABS})
@Retention(RetentionPolicy.SOURCE)
public @interface NavigationMode {}
public static final int NAVIGATION_MODE_STANDARD = 0;
public static final int NAVIGATION_MODE_LIST = 1;
public static final int NAVIGATION_MODE_TABS = 2;
and in the function that has it as a parameter/returned value , use:
@NavigationMode
public abstract int getNavigationMode();
public abstract void setNavigationMode(@NavigationMode int mode);
In case the enum is complex, use an enum. It's not that bad.
To compare enums vs constant values, you should read here:
http://hsc.com/Blog/Best-Practices-For-Memory-Optimization-on-Android-1
Their example is of an enum with 2 values. It takes 1112 bytes in the dex file compared to 128 bytes when constant integers are used. That makes sense, as enums are real classes, as opposed to how they work in C/C++.
With Android P, google has no restriction/objection in using enums
The documentation has changed where before it was recommended to be cautious but it doesn't mention it now.
https://developer.android.com/reference/java/lang/Enum
In addition to the previous answers, I would add that if you are using Proguard (and you definitely should, to reduce size and obfuscate your code), then your enums will be automatically converted to @IntDef-style integer constants wherever possible:
https://www.guardsquare.com/en/proguard/manual/optimizations
class/unboxing/enum
Simplifies enum types to integer constants, whenever possible.
Therefore, if you have some discrete values and a method should accept only these values and not others of the same type, then I would use an enum, because Proguard will do the manual work of optimizing the code for me.
And here is a good post about using enums from Jake Wharton, take a look at it.
As a library developer, I recognize these small optimizations that should be done as we want to have as little impact on the consuming app's size, memory, and performance as possible. But it's important to realize that [...] putting an enum in your public API vs. integer values where appropriate is perfectly fine. Knowing the difference to make informed decisions is what's important
Should I strictly avoid using enums on Android?
No. "Strictly" means they are so bad that they should not be used at all. A performance issue might possibly arise in an extreme situation, like many, many (thousands or millions of) operations with enums performed consecutively on the UI thread. Far more common are network I/O operations, which should strictly happen on a background thread.
The most common usage of enums is probably some kind of type check - whether an object is this or that which is so fast you won't be able to notice a difference between a single comparison of enums and a comparison of integers.
Can someone please clarify which one is the best way to represent the same in Android?
There is no general rule of thumb for this. Use whatever works for you and helps you get your app ready. Optimize later - after you notice there's a bottleneck that slows some aspect of your app.
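On the Bundle concern raised in the question: one option, sketched here under assumptions, is to pass the constant's name() as a plain string and rebuild it with valueOf() on the receiving side. A Map<String, String> stands in for android.os.Bundle so the snippet runs on a plain JVM; with a real Bundle the calls would be putString and getString. The From enum mirrors the question's interface, and EnumExtras and KEY_FROM are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Enum version of the question's From interface.
enum From { LOGIN_SCREEN, NOTIFICATION, WIDGET }

// Hypothetical helper; the Map is a stand-in for a Bundle.
final class EnumExtras {
    static final String KEY_FROM = "from"; // hypothetical key name

    // Store the enum by name, i.e. as an ordinary string extra.
    static void putFrom(Map<String, String> extras, From from) {
        extras.put(KEY_FROM, from.name());
    }

    // Rebuild the enum constant; valueOf throws if the name is missing or unknown.
    static From getFrom(Map<String, String> extras) {
        return From.valueOf(extras.get(KEY_FROM));
    }

    public static void main(String[] args) {
        Map<String, String> extras = new HashMap<>();
        putFrom(extras, From.NOTIFICATION);
        System.out.println(getFrom(extras) == From.NOTIFICATION); // prints true
    }
}
```

Whether this is measurably cheaper than putSerializable in a given app is something to profile rather than assume.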
I'd like to add that you cannot use these @annotations when you declare a List<> or Map<> where either the key or the value is of one of your annotation interfaces.
You get the error "Annotations are not allowed here".
enum Values { One, Two, Three }
Map<String, Values> myMap; // This works
// ... but ...
public static final int ONE = 1;
public static final int TWO = 2;
public static final int THREE = 3;
@Retention(RetentionPolicy.SOURCE)
@IntDef({ONE, TWO, THREE})
public @interface Values {}
Map<String, @Values Integer> myMap; // *** ERROR ***
So when you need to pack it into a list/map, use an enum, as enums can be added, but @annotated int/string groups cannot.
Two facts.
1. Enums are one of the most powerful features in Java.
2. Android phones usually have a LOT of memory.
So my answer is NO. I will use enums on Android.

Do You Cache Properties in Local Variables?

Consider the class Foo.
public class Foo {
private double size;
public double getSize() {
return this.size; // Always O(1)
}
}
Foo has a property called size, which is frequently accessed, but never modified, by a given method. I've always cached a property in a variable whenever it is accessed more than once in any method, because "someone told me so" without giving it much thought. i.e.
public void test(Foo foo) {
double size = foo.getSize(); // Cache it or not?
// size will be referenced in several places later on.
}
Is this worth it, or an overkill?
If I don't cache it, are modern compilers smart enough to cache it themselves?
A couple of factors (in no particular order) that I consider when deciding whether or not to store the value returned by a call to a "get() method":
Performance of the get() method - Unless the API specifies it, or unless the calling code is tightly coupled with the called method, there are no guarantees about the performance of the get() method. The code may be fine in testing now, but may get worse if the get() method's performance changes in the future or if testing does not reflect real-world conditions (e.g. testing with only a thousand objects in a container when a real-world container might have ten million). When used in a for-loop condition, the get() method will be called before every iteration.
Readability - A variable can be given a specific and descriptive name, providing clarification of its use and/or meaning in a way that may not be clear from inline calls to the get() method. Don't underestimate the value of this to those reviewing and maintaining the code.
Thread safety - Can the value returned by the get() method potentially change if another thread modifies the object while the calling method is doing its thing? Should such a change be reflected in the calling method's behavior?
Regarding the question of whether or not compilers will cache it themselves, I'm going to speculate and say that in most cases the answer has to be 'no'. The only way the compiler could safely do so would be if it could determine that the get() method would return the same value at every invocation. And this could only be guaranteed if the get() method itself was marked final and all it did was return a constant (i.e an object or primitive also marked 'final'). I'm not sure but I think this is probably not a scenario the compiler bothers with. The JIT compiler has more information and thus could have more flexibility but you have no guarantees that some method will get JIT'ed.
In conclusion, don't worry about what the compiler might do. Caching the return value of a get() method is probably the right thing to do most of the time, and will rarely (i.e almost never) be the wrong thing to do. Favor writing code that is readable and correct over code that is fast(est) and flashy.
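The for-loop point can be made visible with a small sketch that counts getter invocations. CountingFoo and CacheDemo are hypothetical names, and the counter exists only for illustration. It counts source-level calls; it says nothing about what a JIT compiler may later inline.

```java
// Hypothetical variant of Foo that records how often its getter runs.
final class CountingFoo {
    private final double size;
    private int calls; // number of getSize() invocations so far

    CountingFoo(double size) {
        this.size = size;
    }

    double getSize() {
        calls++;
        return size;
    }

    int calls() {
        return calls;
    }
}

final class CacheDemo {
    // Getter in the loop condition: evaluated before every iteration.
    static int loopUncached(CountingFoo foo) {
        int iterations = 0;
        for (int i = 0; i < foo.getSize(); i++) {
            iterations++;
        }
        return iterations;
    }

    // Getter result stored in a local variable: evaluated once.
    static int loopCached(CountingFoo foo) {
        double size = foo.getSize();
        int iterations = 0;
        for (int i = 0; i < size; i++) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        CountingFoo uncached = new CountingFoo(3);
        loopUncached(uncached);
        System.out.println(uncached.calls()); // prints 4 (three passes plus the failing check)

        CountingFoo cached = new CountingFoo(3);
        loopCached(cached);
        System.out.println(cached.calls()); // prints 1
    }
}
```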
I don't know whether there is a "right" answer, but I would keep a local copy.
In your example, I can see that getSize() is trivial, but in real code, I don't always know whether it is trivial or not; and even if it is trivial today, I don't know that somebody won't come along and change the getSize() method to make it non-trivial sometime in the future.
The biggest factor would be performance. If it's a simple operation that doesn't require a whole lot of CPU cycles, I'd say don't cache it. But if you constantly need to execute an expensive operation on data that doesn't change, then definitely cache it. For example, in my app the currently logged in user is serialized on every page in JSON format, the serialization operation is pretty expensive, so in order to improve performance I now serialize the user once when he signs in and then use the serialized version for putting JSON on the page. Here is before and after, made a noticeable improvement in performance:
//Before
public User(Principal principal) {
super(principal.getUsername(), principal.getPassword(), principal.getAuthorities());
uuid = principal.getUuid();
id = principal.getId();
name = principal.getName();
isGymAdmin = hasAnyRole(Role.ROLE_ADMIN);
isCustomBranding= hasAnyRole(Role.ROLE_CUSTOM_BRANDING);
locations.addAll(principal.getLocations());
}
public String toJson() {
return JSONAdapter.getGenericSerializer().serialize(this);
}
// After
public User(Principal principal) {
super(principal.getUsername(), principal.getPassword(), principal.getAuthorities());
uuid = principal.getUuid();
id = principal.getId();
name = principal.getName();
isGymAdmin = hasAnyRole(Role.ROLE_ADMIN);
isCustomBranding= hasAnyRole(Role.ROLE_CUSTOM_BRANDING);
locations.addAll(principal.getLocations());
json = JSONAdapter.getGenericSerializer().serialize(this);
}
public String toJson() {
return json;
}
The User object has no setter methods, there is no way the data would ever change unless the user signs out and then back in, so in this case I'd say it is safe to cache the value.
If the value of size was calculated each time say by looping through an array and thus not O(1), caching the value would have obvious benefits performance-wise. However since size of Foo is not expected to change at any point and it is O(1), caching the value mainly aids in readability. I recommend continuing to cache the value simply because readability is often times more of a concern than performance in modern computing systems.
IMO, if you are really worried about performance, this is a bit overkill, but there are a couple of ways to ensure that the variable is "cached" by your VM.
First, you can create static final variables for the results (1 or 0, as per your example), so only one copy is stored for the whole class; the instance field is then only a boolean, while the getter still returns a double (you could also use int if the value is only ever 0 or 1):
private static final double D_ZERO = 0.0;
private static final double D_ONE = 1.0;
private boolean ZERO = false;
public double getSize(){
return (ZERO ? D_ZERO : D_ONE);
}
Or, if you are able to set the size when the object is initialized, you can make the field final and assign it in the constructor (a static final field would also work, but since this value belongs to the instance, the constructor is the natural place):
private final int SIZE;
public Foo(){
SIZE = 0;
}
public double getSize(){
return this.SIZE;
}
this can be accessed via foo.getSize()
In my code, I would cache it if either the getSize() method is time-consuming or (and this is more often the case) the result is used in more or less complex expressions.
For example if calculating an offset from the size
int offset = fooSize * count1 + fooSize * count2;
is easier to read (for me) than
int offset = foo.getSize() * count1 + foo.getSize() * count2;

What's the best way to handle coexistence of the "int enum" pattern with java enums as an API evolves?

Suppose you're maintaining an API that was originally released years ago (before java gained enum support) and it defines a class with enumeration values as ints:
public class VitaminType {
public static final int RETINOL = 0;
public static final int THIAMIN = 1;
public static final int RIBOFLAVIN = 2;
}
Over the years the API has evolved and gained Java 5-specific features (generified interfaces, etc). Now you're about to add a new enumeration:
public enum NutrientType {
AMINO_ACID, SATURATED_FAT, UNSATURATED_FAT, CARBOHYDRATE;
}
The 'old style' int-enum pattern has no type safety, no possibility of adding behaviour or data, etc, but it's published and in use. I'm concerned that mixing two styles of enumeration is inconsistent for users of the API.
I see three possible approaches:
Give up and define the new enum (NutrientType in my fictitious example) as a series of ints like the VitaminType class. You get consistency but you're not taking advantage of type safety and other modern features.
Decide to live with an inconsistency in a published API: keep VitaminType around as is, and add NutrientType as an enum. Methods that take a VitaminType are still declared as taking an int, methods that take a NutrientType are declared as taking such.
Deprecate the VitaminType class and introduce a new VitaminType2 enum. Define the new NutrientType as an enum. Congratulations, for the next 2-3 years until you can kill the deprecated type, you're going to deal with deprecated versions of every single method that took a VitaminType as an int and adding a new foo(VitaminType2 v) version of each. You also need to write tests for each deprecated foo(int v) method as well as its corresponding foo(VitaminType2 v) method, so you just multiplied your QA effort.
What is the best approach?
How likely is it that the API consumers are going to confuse VitaminType with NutrientType? If it is unlikely, then maybe it is better to maintain API design consistency, especially if the user base is established and you want to minimize the delta of work/learning required by customers. If confusion is likely, then NutrientType should probably become an enum.
This needn't be a wholesale overnight change; for example, you could expose the old int values via the enum:
public enum Vitamin {
RETINOL(0), THIAMIN(1), RIBOFLAVIN(2);
private final int intValue;
Vitamin(int n) {
intValue = n;
}
public int getVitaminType() {
return intValue;
}
public static Vitamin asVitamin(int intValue) {
for (Vitamin vitamin : Vitamin.values()) {
if (intValue == vitamin.getVitaminType()) {
return vitamin;
}
}
throw new IllegalArgumentException();
}
}
/** Use foo.Vitamin instead */
@Deprecated
public class VitaminType {
public static final int RETINOL = Vitamin.RETINOL.getVitaminType();
public static final int THIAMIN = Vitamin.THIAMIN.getVitaminType();
public static final int RIBOFLAVIN = Vitamin.RIBOFLAVIN.getVitaminType();
}
This allows you to update the API and gives you some control over when to deprecate the old type and scheduling the switch-over in any code that relies on the old type internally.
Some care is required to keep the literal values in sync with those that may have been in-lined with old consumer code.
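One way to guard that sync is a check that deliberately hard-codes the originally published values and verifies that each still maps to the expected enum constant. This is a sketch under assumptions: the Vitamin enum is repeated from above so the snippet compiles on its own, and LegacyValueCheck and its LEGACY_* fields are hypothetical names.

```java
// Repeated from the answer so the sketch is self-contained.
enum Vitamin {
    RETINOL(0), THIAMIN(1), RIBOFLAVIN(2);

    private final int intValue;

    Vitamin(int n) {
        intValue = n;
    }

    int getVitaminType() {
        return intValue;
    }

    static Vitamin asVitamin(int intValue) {
        for (Vitamin vitamin : values()) {
            if (intValue == vitamin.getVitaminType()) {
                return vitamin;
            }
        }
        throw new IllegalArgumentException("Unknown vitamin type: " + intValue);
    }
}

// Hypothetical guard against drifting away from the published literals.
final class LegacyValueCheck {
    // Hard-coded on purpose: old client code may have these literals compiled in.
    static final int LEGACY_RETINOL = 0;
    static final int LEGACY_THIAMIN = 1;
    static final int LEGACY_RIBOFLAVIN = 2;

    static boolean legacyValuesStillMatch() {
        return Vitamin.asVitamin(LEGACY_RETINOL) == Vitamin.RETINOL
                && Vitamin.asVitamin(LEGACY_THIAMIN) == Vitamin.THIAMIN
                && Vitamin.asVitamin(LEGACY_RIBOFLAVIN) == Vitamin.RIBOFLAVIN;
    }

    public static void main(String[] args) {
        System.out.println(legacyValuesStillMatch()); // prints true
    }
}
```

Here duplicating the numbers is the point of the check, since the published values are part of the API contract.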
Personal opinion is that it's probably not worth the effort of trying to convert. For one thing, the "public static final int" idiom isn't going away any time soon, given that it's sprinkled liberally all over the JDK. For another, tracking down usages of the original ints is likely to be really unpleasant, given that your classes will compile away the reference so you're likely not to know you've broken anything until it's too late
(by which I mean
class A
{
public static final int MY_CONSTANT=1;
}
class B
{
....
i+=A.MY_CONSTANT;
}
gets compiled into
i+=1
So if you rewrite A you may not ever realize that B is broken until you recompile B later.)
It's a pretty well known idiom, probably not so terrible to leave it in, certainly better than the alternative.
There is a rumor that the creator of "make" realized that the syntax of Makefiles was bad, but felt that he couldn't change it because he already had 10 users.
Backwards compatibility at all costs, even if it hurts your customers, is a bad thing. SO can't really give you a definitive answer on what to do in your case, but be sure and consider the cost to your users over the long term.
Also think about ways you can refactor the core of your code while keeping the old integer-based enums only at the outer layer.
Wait for the next major revision, change everything to enum and provide a script (sed, perl, Java, Groovy, ...) to convert existing source code to use the new syntax.
Obviously this has two drawbacks:
No binary compatibility. How important this one is depends on the use cases, but can be acceptable in the case of a new major release
Users have to do some work. If the work is simple enough, then this too may be acceptable.
In the meantime, add new types as enums and keep old types as ints.
The best would be if you could just fix the published versions, if possible. In my opinion consistency would be the best solution, so you would need to do some refactoring. I personally don't like deprecated things, because they get in the way. You might be able to wait until a bigger version release, use those ints until then, and refactor everything in one big project. If that is not possible, you might consider yourself stuck with the ints, unless you create some kind of wrappers or something.
If nothing helps but you still evolve the code, you end up losing consistency or living with the deprecated versions. In any case, at some point people usually become fed up with old stuff if it has lost its consistency and create something new from scratch... So you would have the refactoring in the future no matter what.
The customer might scrap the project and buy another product if something goes wrong. Whether you can afford refactoring or not is usually not the customer's problem; they just buy what is appropriate and usable for them. So in the end it is a tricky problem and care needs to be taken.
