Why are people so emphatic about making every variable within a class "final"? I don't believe that there is any true benefit to adding final to private local variables, or really to use final for anything other than constants and passing variables into anonymous inner classes.
I'm not looking to start any sort of flame war, I just honestly want to know why this is so important to some people. Am I missing something?
Intent. Other people modifying your code won't change values they aren't supposed to change.
Compiler optimizations can be made if the compiler knows a field's value will never change.
Also, if EVERY variable in a class is final (as you refer to in your post), then you have an immutable class (as long as you don't expose references to mutable properties) which is an excellent way to achieve thread-safety.
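As a minimal sketch of the immutability point, assume a hypothetical Roster class (the name and fields are made up for illustration). Every field is final, and the mutable list is copied on the way in and exposed read-only, so nothing can change the object after construction.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class: every field is final and no mutable state leaks out.
final class Roster {
    private final String team;
    private final List<String> members;

    Roster(String team, List<String> members) {
        this.team = team;
        // Defensive copy: later changes to the caller's list do not affect this object.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    String team() {
        return team;
    }

    // Returns a read-only view; attempts to modify it throw UnsupportedOperationException.
    List<String> members() {
        return members;
    }
}
```

Without the defensive copy the fields would still be final, but a caller holding the original list could change the contents, which is the "don't expose references to mutable properties" caveat.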
The downside, is that
annoy it is hard
annoy to read
annoy code or anything
annoy else when it all
annoy starts in the
annoy same way
Other than the obvious usage for creating constants and preventing subclassing/overriding, it is a personal preference in most cases since many believe the benefits of "showing programmer intent" are outweighed by the actual code readability. Many prefer a little less verbosity.
As for optimisations, that is a poor reason for using it (meaningless in many cases). It is the worst form of micro optimisation and in the days of JIT serves no purpose.
I would suggest using it if you prefer, and not using it if that is what you prefer. Since it will all come down to religious arguments in many cases, don't worry about it.
It marks that I'm not expecting that value to change, which is free documentation. The practice is because it clearly communicates the intent of that variable and forces the compiler to verify that. Beyond that, it allows the compiler to make optimizations.
It's important because immutability is important, particularly when dealing with a shared memory model. If something is immutable then it's thread safe, and that makes it a good enough argument to follow as a best practice.
http://www.artima.com/intv/blochP.html
One benefit for concurrent programming which hasn't been mentioned yet:
Final fields are guaranteed to be initialized when the execution of the constructor is completed.
A project I'm currently working on is set up in a way that whenever one presses "save" in Eclipse, the final modifier is added to every variable or field that is not changed in the code. And it hasn't yet hurt anybody.
There are many good reasons to use final, as noted elsewhere. One place where it is not worth it, IMO, is on parameters to a method. Strictly speaking, the keyword adds value here, but the value is not high enough to withstand the ugly syntax. I'd prefer to express that kind of information through unit tests.
I think use of final over values that are inner to a class is an overkill unless the class is probably going to be inherited. The only advantage is around the compiler optimizations, which surely may benefit.
I tend to declare all variables final unless necessary. I consider this to be a good practice because it allows the compiler to check that the identifier is used as I expect (e.g. it is not mutated). On the other hand it clutters up the code, and perhaps this is not "the Java way".
I am wondering if there is a generally accepted best practice regarding the non-required use of final variables, and if there are other tradeoffs or aspects to this discussion that I should be made aware of.
The "Java way" is inherently, intrinsically cluttery.
I say it's a good practice, but not one I follow.
Tests generally ensure I'm doing what I intend, and it's too cluttery for my aesthetics.
Some projects routinely apply final to all effectively final local variables. I personally find reading such code much easier, due to the lessened cognitive load. A non-final variable could be reassigned anywhere, and it's especially problematic in code with multiple levels of nested ifs and fors. You never know what code path may have reassigned it.
As for the concern of code clutter, when applied to local variables I don't find it damaging—in fact it makes me spot all declarations more easily due to syntax coloring.
Unfortunately, when final is used on parameters, catch blocks, enhanced-for loops and all other places except local variables in the narrow sense, the code does tend to become cluttered. This is quite unfortunate because a reassignment in these cases is even more confusing and they should really have been final by default.
There are code linting tools that will flag the reassignment of these variables, and that helps.
I consider it good practice, more for maintenance programmers (including me!) than for the compiler. It's easier to think about a method if I don't need to worry about which variables might be changing inside it.
Yes, it's a very good idea, because it clearly shows what fields must be provided at object construction.
I strongly disagree that it creates "code clutter"; it's a good and powerful aspect of the language.
As a design principle, you should make your classes immutable (all final fields) if you can, because they may be safely published (ie freely passed around without fear they will be corrupted). Although note that the fields themselves need to be immutable objects too.
It definitely gives you better code, since it is easy to see which variables are going to be changed.
It also informs the compiler that the value is not going to change, which can result in better optimization.
Alongside that, it allows your IDE to give you a compile-time notification if you make a mistake.
Some good analysis tools, like PMD, advise always using final unless reassignment is necessary. So the convention in those tools says it's a good practice.
But I think that so many final tokens in code may make it less human-friendly.
I would say yes, not so much for the compiler optimisation, but rather for readability.
But personally I don't use it. Java is quite verbose by itself, and if we followed everything considered "good practice", the code would be unreadable from all the boilerplate. It's a matter of preference, though.
You pretty much summed up the pros and cons...
I can just add another pro:
the reader of the code need not reason at all about the value of a final variable (except for rare bad-code cases).
So, yes, it's a good practice.
And the clutter isn't that bad, after you get used to it (like unix :-P). Plus, typical IDEs do it automatically for ya...
I was reading the ArrayBlockingQueue implementation code by Doug Lea the other day and noticed that a lot of methods (public, default, and private) have the following references:
final Object[] items = this.items;
final ReentrantLock lock = this.lock;
I have asked around to have a reasonable explanation but so far no satisfactory answers. I am not quite sure why we need to have such local variables in the first place? And what is the benefit(s) of coding this way?
Maybe I missed some important points in concurrent programming. Could you please help to shed some light on this?
A very good reason for a method to set a local variable to the value of an accessible class or instance variable, or a value accessible through one of those, is to thereafter be independent of any modifications to that variable by other threads. With some caveats, this allows the method that needs to access that value more than once to perform a consistent computation reflecting the state of the host object at some specific point in time, even if that state has changed by the time the result is returned. This is likely what's happening in the code you have been studying.
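A rough sketch of that snapshot idea, using a hypothetical Gauge class rather than the actual ArrayBlockingQueue code. It assumes writers replace the array wholesale and never mutate it in place, which is one of the caveats mentioned above.

```java
// Hypothetical class: the samples field may be replaced by another thread at any time.
final class Gauge {
    private volatile int[] samples = {10, 20, 30};

    // Writers swap in a whole new array rather than mutating the old one.
    void replace(int[] fresh) {
        samples = fresh.clone();
    }

    double average() {
        // Read the field once; every later access uses the same array,
        // even if another thread calls replace() in the meantime.
        final int[] snapshot = this.samples;
        if (snapshot.length == 0) {
            return 0.0;
        }
        long sum = 0;
        for (int value : snapshot) {
            sum += value;
        }
        return (double) sum / snapshot.length;
    }
}
```

If average() read this.samples separately for the sum and for the length, a concurrent replace() could make it divide the sum of one array by the length of another.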
I happened to come across this link, which explains some of the main arguments for coding this way: "In ArrayBlockingQueue, why copy final member field into local final variable?". Please read it to understand more; I hope it does not leave you more confused. I believe it helps to look at this practice from another angle. It at least satisfied some of my curiosity about this coding style.
After going through all relevant threads on the coding practice of assigning a final class variable to a local copy, i.e. a final class variable is never accessed directly from within a method, instead it is always referenced by a local variable reference:
final Object[] items = this.items;
final ReentrantLock lock = this.lock;
Typically you will find this code style in ArrayBlockingQueue and other concurrent classes
The following are my findings:
It is an idiomatic use, made popular by Doug Lea, the main author of the core Java library's multithreading/concurrency classes
The main consideration of this coding practice (or rather a hack) is a small performance optimization back in the Java 5 era
It is arguable whether such a trick yields a performance gain; some argue it has the opposite effect with modern compilers; others believe it is not needed
So my takeaway is that we should not be encouraged to adopt this practice, because in many applications you don't need it. Clean code may be more important than a small performance gain, let alone that no one is 100% certain whether there is a performance gain anymore.
If I look at the Java source code in the OpenJDK or Hibernate or Apache I have yet to see any local variables declared final.
This suggests that the developers of some of the most widely used Java software libraries:
do not believe the final keyword improves readability.
do not believe it significantly improves performance.
Why do the majority of contributors on Stack Overflow believe it should be used (based on the highest voted responses)?
Probably because it's a hassle to type in the five LONG letters in the word final... why would they go through the pain of writing
final int x;
when it's twice as much typing as
int x;
?
We developers are lazy, you know... :P
do not believe the final keyword improves readability.
Some people (such as myself) find that excessive finals decrease readability.
do not believe it significantly improves performance.
final local variables do not improve performance.
As far as I'm aware, the final keyword has no impact on the runtime performance of your variables.
I believe its primary purpose is to assist you in catching bugs. If you know something is never going to change, you mark it as such. Similar to why we use annotations where we can, any time we can trade a runtime bug for a compile-time error, we do. Finding an error when you're working on it, and it's fresh in your mind, and it hasn't gone and corrupted someone's data causing you to lose customers, yeah that's a very good thing. You get the compile error, you fix it, you move on, you don't break the nightly build, yeah those are good things.
The final keyword has two uses:
declare a class or method as final in order to prevent subclassing/overriding
declare a variable as final in order to prevent changing it (assigning a new value)
Case 2 is normally applied to member variables in order to make the object immutable (at least partly) or to method parameters in order to prevent accidental assignments.
In case of a local variable (i.e. method scoped and not a parameter), that's normally not necessary or wanted, since those variables are likely to be changed within the method (otherwise you might not need them, except to cache a reference for method scope).
I doubt declaring a local variable final ever improves performance. By virtue of the existence of final, Java compilers are already required to be able to tell if a variable might be assigned more than once, or might not be initialized. Therefore, actually declaring a local as final doesn't tell the compiler anything it didn't already know--it's only for the benefit of the reader.
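To illustrate the definite-assignment analysis mentioned here, a small sketch with a hypothetical Grades helper (names invented for the example): a final local declared without a value must be assigned exactly once on every path before it is read, and the compiler checks that.

```java
// Hypothetical helper showing a "blank final" local assigned exactly once per path.
final class Grades {
    static String label(int score) {
        final String result; // declared without a value
        if (score >= 50) {
            result = "pass";
        } else {
            result = "fail";
        }
        // result = "other"; // would not compile: result may already have been assigned
        return result;
    }
}
```

Dropping the else branch would also fail to compile, because result might then be read without having been assigned.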
Now whether it sometimes improves readability, that's more subjective. In a complicated piece of code it can be nice to promise (to yourself, or to future readers) that a variable is only written once. But it might be nicer to simplify the code so that is readily apparent anyway.
Before I ask my question can I please ask not to get a lecture about optimising for no reason.
Consider the following questions purely academic.
I've been thinking about the efficiency of accesses between root (ie often used and often accessing each other) classes in Java, but this applies to most OO languages/compilers. The fastest way (I'm guessing) that you could access something in Java would be a static final reference. Theoretically, since that reference is available during loading, a good JIT compiler would remove the need to do any reference lookup to access the variable and point any accesses to that variable straight to a constant address. Perhaps for security reasons it doesn't work that way anyway, but bear with me...
Say I've decided that there are some order of operations problems or some arguments to pass at startup that means I can't have a static final reference, even if I were to go to the trouble of having each class construct the other as is recommended to get Java classes to have static final references to each other. Another reason I might not want to do this would be... oh, say, just for example, that I was providing platform specific implementations of some of these classes. ;-)
Now I'm left with two obvious choices. I can have my classes know about each other with a static reference (on some system hub class), which is set after constructing all classes (during which I mandate that they cannot access each other yet, thus doing away with order of operations problems at least during construction). On the other hand, the classes could have instance final references to each other, were I now to decide that sorting out the order of operations was important or could be made the responsibility of the person passing the args - or more to the point, providing platform specific implementations of these classes we want to have referencing each other.
A static variable means you don't have to look up the location of the variable with respect to the class it belongs to, saving you one operation. A final variable means you don't have to look up the value at all but it does have to belong to your class, so you save 'one operation'. OK I know I'm really handwaving now!
Then something else occurred to me: I could have static final stub classes, kind of like a wacky interface where each call was relegated to an 'impl' which can just extend the stub. The performance hit then would be the double function call required to run the functions and possibly I guess you can't declare your methods final anymore. I hypothesised that perhaps those could be inlined if they were appropriately declared, then gave up as I realised I would then have to think about whether or not the references to the 'impl's could be made static, or final, or...
So which of the three would turn out fastest? :-)
Any other thoughts on lowering frequent-access overheads or even other ways of hinting performance to the JIT compiler?
UPDATE: After running several hours of test of various things and reading http://www.ibm.com/developerworks/java/library/j-jtp02225.html I've found that most things you would normally look at when tuning e.g. C++ go out the window completely with the JIT compiler. I've seen it run 30 seconds of calculations once, twice, and on the third (and subsequent) runs decide "Hey, you aren't reading the result of that calculation, so I'm not running it!".
FWIW you can test data structures and I was able to develop an arraylist implementation that was more performant for my needs using a microbenchmark. The access patterns must have been random enough to keep the compiler guessing, but it still worked out how to better implement a generic-ified growing array with my simpler and more tuned code.
As far as the test here was concerned, I simply could not get a benchmark result! My simple test of calling a function and reading a variable from a final vs non-final object reference revealed more about the JIT than the JVM's access patterns. Unbelievably, calling the same function on the same object at different places in the method changes the time taken by a factor of FOUR!
As the guy in the IBM article says, the only way to test an optimisation is in-situ.
Thanks to everyone who pointed me along the way.
It's worth noting that static fields are stored in a special per-class object which contains the static fields for that class. Using static fields instead of object fields is unlikely to be any faster.
See the update, I answered my own question by doing some benchmarking, and found that there are far greater gains in unexpected areas and that performance for simple operations like referencing members is comparable on most modern systems where performance is limited more by memory bandwidth than CPU cycles.
Assuming you found a way to reliably profile your application, keep in mind that it will all go out the window should you switch to another jdk impl (IBM to Sun to OpenJDK etc), or even upgrade version on your existing JVM.
The reason you are having trouble, and would likely have different results with different JVM impls, lies in the Java spec: it explicitly states that it does not define optimizations and leaves it to each implementation to optimize (or not) in any way so long as execution behavior is unchanged by the optimization.
DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
DD and DU sound familiar...I want to say in things like testing and analysis relating to weakest pre and post conditions, but I don't remember the specifics.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?
DD and DU anomalies (if I remember correctly—I use FindBugs and the messages are a little different) refer to assigning a value to a local variable that is never read, usually because it is reassigned another value before ever being read. A typical case would be initializing some variable with null when it is declared. Don't declare the variable until it's needed.
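A small sketch of the pattern described here, with a hypothetical Names class. I am assuming this is the kind of code the rule reports; the exact conditions PMD checks are not verified here.

```java
import java.util.List;

// Hypothetical example of a value that is assigned and then overwritten before any read.
final class Names {
    // The initial null is never read: it is replaced on the very next line.
    static String firstFlagged(List<String> names) {
        String first = null;
        first = names.isEmpty() ? "" : names.get(0);
        return first;
    }

    // Same behaviour, with the variable declared only where its value is known.
    static String first(List<String> names) {
        final String first = names.isEmpty() ? "" : names.get(0);
        return first;
    }
}
```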
Assigning null to a local variable in order to "assist" the garbage collector is a myth. PMD is letting you know this is just counter-productive clutter.
Specifying final on a local variable should be very useful to an optimizer, but I don't have any concrete examples of current JITs taking advantage of this hint. I have found it useful in reasoning about the correctness of my own code.
Specifying interfaces in terms of… well, interfaces is a great design practice. You can easily change implementations of the collection without impacting the caller at all. That's what interfaces are all about.
I can't think of many cases where a caller would require a LinkedList, since it doesn't expose any API that isn't declared by some interface. If the client relies on that API, it's available through the correct interface.
Block level synchronization allows the critical section to be smaller, which allows as much work to be done concurrently as possible. Perhaps more importantly, it allows the use of a lock object that is privately controlled by the enclosing object. This way, you can guarantee that no deadlock can occur. Using the instance itself as a lock, anyone can synchronize on it incorrectly, causing deadlock.
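A minimal sketch of both points, using a hypothetical HitCounter class: the critical section covers only the shared counter, and the lock is a private object that outside code cannot synchronize on.

```java
// Hypothetical counter guarded by a lock object that outside code cannot reach.
final class HitCounter {
    private final Object lock = new Object();
    private long hits;

    void record(String page) {
        // Work that touches no shared state stays outside the critical section.
        final String normalized = page.trim().toLowerCase();
        if (normalized.isEmpty()) {
            return;
        }
        synchronized (lock) {
            hits++;
        }
    }

    long hits() {
        synchronized (lock) {
            return hits;
        }
    }
}
```

Had record() been declared synchronized instead, the string handling would also run under the lock, and any caller could block the counter by synchronizing on the HitCounter instance.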
Operands of type short are promoted to int in any operations. This rule is letting you know that this promotion is occurring, and you might as well use an int. However, using the short type can save memory, so if it is an instance member, I'd probably ignore that rule.
DataflowAnomalyAnalysis: Found 'DD'-anomaly for variable 'variable' (lines 'n1'-'n2').
DataflowAnomalyAnalysis: Found 'DU'-anomaly for variable 'variable' (lines 'n1'-'n2').
No idea.
NullAssignment: Assigning an Object to null is a code smell. Consider refactoring.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
Objects referenced only by a method's local variables become eligible for garbage collection once the method returns. Setting them to null won't make any difference.
Since it would make less experienced developers wonder what that null assignment is all about, it may be considered a code smell.
MethodArgumentCouldBeFinal: Parameter 'param' is not assigned and could be declared final
LocalVariableCouldBeFinal: Local variable 'variable' could be declared final
Are there any advantages to using final parameters and variables?
It makes it clearer that the value won't change during the lifecycle of the object.
Also, if by any chance someone tries to assign a value, the compiler will prevent this coding error at compile time.
consider this:
public void businessRule( SomeImportantArgument important ) {
    if( important.xyz() ){
        doXyz();
    }
    // some fuzzy logic here
    important = new NotSoImportant();
    // add for/if's/while etc
    if( important.abc() ){ // <-- bug
        burnTheHouse();
    }
}
Suppose that you're assigned to solve some mysterious bug that from time to time burns the house.
You know what parameter was used; what you don't understand is WHY the burnTheHouse method is invoked if the conditions are not met (according to your findings).
It may take you a while to find out that at some point in the middle, someone changed the reference, and that you are using another object.
Using final helps to prevent this kind of thing.
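A compilable sketch of the same idea, with hypothetical Order and Rules types standing in for the ones in the example. With the parameter declared final, the reassignment that caused the bug is rejected by the compiler instead of showing up at runtime.

```java
// Hypothetical stand-in for the argument type in the example.
final class Order {
    private final boolean urgent;

    Order(boolean urgent) {
        this.urgent = urgent;
    }

    boolean isUrgent() {
        return urgent;
    }
}

final class Rules {
    // The final parameter cannot be pointed at a different object inside the method.
    static String route(final Order order) {
        // order = new Order(true); // would not compile: a final parameter may not be assigned
        return order.isUrgent() ? "express" : "standard";
    }
}
```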
LooseCoupling: Avoid using implementation types like 'LinkedList'; use the interface instead
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers? It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
There is no difference, in this case. I would think that since you are not using LinkedList specific functionality the suggestion is fair.
Today, LinkedList could make sense, but by using an interface you help yourself (or others) to change it easily when it no longer does.
For small, personal projects this may not make sense at all, but since you're using an analyzer already, I guess you care about the code quality already.
Also, it helps less experienced developers to create good habits. [ I'm not saying you're one but the analyzer does not know you ;) ]
AvoidSynchronizedAtMethodLevel: Use block level rather than method level synchronization
What advantages does block-level synchronization have over method-level synchronization?
The smaller the synchronized section the better. That's it.
Also, if you synchronize at the method level you'll block the whole object. When you synchronize at block level, you just synchronize that specific section, in some situations that's what you need.
AvoidUsingShortType: Do not use the short type
My first languages were C and C++, but in the Java world, why should I not use the type that best describes my data?
I've never heard of this, and I agree with you :) I've never used short though.
My guess is that by not using it, you'll be helping yourself to upgrade to int seamlessly.
Code smells are oriented more towards code quality than performance optimization. So the advice is given to help less experienced programmers avoid pitfalls, rather than to improve program speed.
This way, you could save a lot of time and frustration when trying to change the code to fit a better design.
If the advice doesn't make sense, just ignore it. Remember, you are the developer in charge, and the tool is just that, a tool. If something goes wrong, you can't blame the tool, right?
Just a note on the final question.
Putting "final" on a variable results in it only being assignable once. This does not necessarily mean that it is easier to write, but it most certainly means that it is easier to read for a future maintainer.
Please consider these points:
any variable with a final can be immediately classified in "will not change value while watching".
by implication it means that if all variables which will not change are marked with final, then the variables NOT marked with final actually WILL change.
This means that you can see already when reading through the definition part which variables to look out for, as they may change value during the code, and the maintainer can spend his/her efforts better as the code is more readable.
Wouldn't setting an object to null assist in garbage collection, if the object is a local object (not used outside of the method)? Or is that a myth?
The only thing it does is make it possible for the object to be GCd before the method's end, which is rarely ever necessary.
Are there any advantages to using final parameters and variables?
It makes the code somewhat clearer since you don't have to worry about the value being changed somewhere when you analyze the code. More often than not you don't need or want to change a variable's value once it's set anyway.
If I know that I specifically need a LinkedList, why would I not use one to make my intentions explicitly clear to future developers?
Can you think of any reason why you would specifically need a LinkedList?
It's one thing to return the class that's highest up the class path that makes sense, but why would I not declare my variables to be of the strictest sense?
I don't care much about local variables or fields, but if you declare a method parameter of type LinkedList, I will hunt you down and hurt you, because it makes it impossible for me to use things like Arrays.asList() and Collections.emptyList().
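A short sketch of why the parameter type matters, using a hypothetical Totals.sum method. Because the parameter is declared as List, callers can pass the results of Arrays.asList() and Collections.emptyList() directly; with a LinkedList parameter, those two calls would not compile.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

final class Totals {
    // Declaring the parameter as List lets callers pass any implementation.
    static int sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3)));                   // fixed-size list backed by an array
        System.out.println(sum(Collections.emptyList()));                   // shared immutable empty list
        System.out.println(sum(new LinkedList<>(Arrays.asList(4, 5))));     // a LinkedList is still accepted
    }
}
```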
What advantages does block-level synchronization have over method-level synchronization?
The biggest one is that it enables you to use a dedicated monitor object so that only those critical sections are mutually exclusive that need to be, rather than everything using the same monitor.
in the Java world, why should I not use the type that best describes my data?
Because types smaller than int are automatically promoted to int for all calculations and you have to cast down to assign anything to them. This leads to cluttered code and quite a lot of confusion (especially when autoboxing is involved).
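A tiny sketch of the promotion and the cast it forces, using a hypothetical ShortMath helper:

```java
// Hypothetical helper showing that short arithmetic is carried out as int.
final class ShortMath {
    static short add(short a, short b) {
        // a + b has type int, so the result must be cast back down explicitly.
        // short sum = a + b; // would not compile: int cannot be assigned to short without a cast
        return (short) (a + b);
    }
}
```

The cast also silently wraps on overflow: adding 1 to Short.MAX_VALUE this way yields Short.MIN_VALUE.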
AvoidUsingShortType: Do not use the short type
short is 16 bits, two's complement, in Java.
A short mathematical operation with anything in the integer family other than another short will require a runtime sign-extension conversion to the larger size. Operating against a floating point requires sign extension and a non-trivial conversion to IEEE-754.
I can't find proof, but with a 32-bit or 64-bit register, you're no longer saving on 'processor instructions' at the bytecode level. You're parking a compact car in a semi-trailer's parking spot as far as the processor register is concerned.
If you are optimizing your project at the bytecode level, wow. Just wow. ;P
I agree on the design side of ignoring this PMD warning; just weigh accurately describing your object with a 'short' against the incurred conversion costs.
In my opinion, the incurred performance hits are minuscule on most machines. Ignore the error.
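What advantages does block-level synchronization have over method-level synchronization?
Synchronizing an instance method is like wrapping its whole body in a synchronized(this) block, which locks the whole object; a static synchronized method locks on the class object instead, as synchronized(getClass()) would.
Maybe you don't want that.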