I've read (e.g. from Martin Fowler) that we should use guard clauses instead of a single return in a (short) method in OOP. I've also read (from somewhere I don't remember) that else clauses should be avoided when possible.
But my colleagues (I work in a small team of only three people) force me not to use multiple returns in a method, and to use else clauses as much as possible, even if there is only one comment line in the else block.
This makes it difficult for me to follow their coding style because, for example, I cannot view all the code of a method on one screen. And when I code, I have to write the guard clauses first and then try to convert them into the form without multiple returns.
Am I wrong, or what should I do about it?
This is arguable and a purely aesthetic question.
Early returns have historically been avoided in C and similar languages because, with an early return, it was possible to miss resource cleanup, which is usually placed at the end of the function.
Given that Java has exceptions and try/catch/finally, there's no need to fear early returns.
Personally, I agree with you, since I use early returns often - that usually means less code and a simpler code flow with less if/else nesting.
A guard clause is a good idea because it clearly indicates that the current method is not interested in certain cases. When you make clear at the very beginning of the method that it doesn't deal with some cases (e.g. when some value is less than zero), the rest of the method is a pure implementation of its responsibility.
There is one even stronger case for guard clauses - statements that validate input and throw exceptions when some argument is unacceptable, e.g. null. In that case you don't want to proceed with execution but wish to throw at the very beginning of the method. That is where guard clauses are the best solution, because you don't want to mix exception-throwing logic with the core of the method you're implementing.
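A minimal sketch of both kinds of guard clause (the method and its logic are invented for illustration):

public double averagePositive(double[] values) {
    // Validating guard: fail fast on unacceptable input.
    if (values == null) {
        throw new IllegalArgumentException("values must not be null");
    }
    // Uninterested-case guard: nothing to do for an empty array.
    if (values.length == 0) {
        return 0.0;
    }
    // From here on, the method is purely its own responsibility.
    double sum = 0.0;
    int count = 0;
    for (double v : values) {
        if (v > 0) {
            sum += v;
            count++;
        }
    }
    return count == 0 ? 0.0 : sum / count;
}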
When talking about guard clauses that throw exceptions, here is an article about how you can simplify them in C# using extension methods: How to Reduce Cyclomatic Complexity: Guard Clause. That technique is not available in Java, but it is useful in C#.
Have them read http://www.cis.temple.edu/~ingargio/cis71/software/roberts/documents/loopexit.txt and see if it will change their minds. (There is history to their idea, but I side with you.)
Edit: Here are the critical points from the article. The principle of a single exit from control structures was adopted on principle, not observational data. But observational data says that allowing multiple ways of exiting control structures makes certain problems easier to solve accurately and does not hurt readability, while disallowing it makes code harder to write correctly and more likely to be buggy. This holds across a wide variety of programmers, from students to textbook writers. Therefore we should allow and use multiple exits where appropriate.
I'm in the multiple-return/return-early camp and I would lobby to convince other engineers of this. You can have great arguments and cite great sources, but in the end, all you can do is make your pitch, suggest compromises, come to a decision, and then work as a team, whichever way it works out. (Although revisiting the topic from time to time isn't out of the question either.)
This really just comes down to style and, in the grand scheme of things, a relatively minor point. Overall, you're a more effective developer if you can adapt to either style. If this really "makes it difficult ... to follow their coding style", then I suggest you work on that, because in the end you'll come out the better engineer.
I had an engineer once come to me and insist he be given dispensation to follow his own coding style (and we had a pretty minimal set of guidelines). He said the established coding style hurt his eyes and made it difficult for him to concentrate (I think he may have even said "nauseous"). I told him that he was going to work on a lot of other people's code, not just code he wrote, and vice versa. If he couldn't adapt to the agreed-upon style, I couldn't use him, and maybe this type of collaborative project wasn't the right place for him. Coincidentally, it was less of an issue after that (although every code review was still a battle).
My issue with guard clauses is that 1) they can easily be dispersed through the code and be easy to miss (this has happened to me on multiple occasions), 2) I have to remember which cases have been "ejected" as I trace code blocks, which can become complex, and 3) by setting code within if/else you have a contained block of code that you know executes for a given set of criteria. With guard clauses, the criteria is EVERYTHING minus what the guards have ejected. It is much more difficult for me to get my head around that.
This may sound simple, but I have a method with a for loop inside; inside the for loop, the method createPrints needs a map of parameters, which it gets from getParameters. There are two types of reports: one has a general set of parameters, and the other has that general set plus a set of its own.
I have two options:
Either have two getParameters methods: one that is general, and another that is for rp2 but also calls the general one. If I do this, it would make sense to add the conditional before the for loop, like this:
theMethod() {
    if (rp1) {
        for loop {
            createPrints(getgenParameters());
            // do general for-loop stuff
        }
    } else {
        for loop {
            createPrints(getParameters());
            // do general for-loop stuff
        }
    }
}
This way it only checks once which parameters method to call, instead of having the if statement inside the loop where it would be checked at every iteration (which is bad because the report type never changes during the loop). But repeating the for loop looks ugly and not clean at all. Is there a cleaner way to design this?
The other option is to pass a boolean into the getParameters method and check inside which type of report it is, creating the map based on that; this, however, also adds a conditional at each iteration.
From a performance perspective, it makes sense to have the conditional outside the loop so that it's not redundantly checked at each iteration, but it doesn't look clean, and my boss really cares about how clean code looks. He didn't like me using an if/else block instead of a ternary operator, since the ternary only uses one line (I think performance is still the same, no?).
Forgot to mention: I am using Java, so I cannot assign functions to variables or use callbacks.
Inside the method, there was an if/else block before the for loop, something like:
String aVariable;
if (condition) {
    aVariable = value1;
} else {
    aVariable = value2;
}
So I initially wanted to create a boolean variable like isReport1 and assign it inside that if/else block, since it uses the same condition, and then, as mentioned before, pass it in as the parameter. But my boss again said not to use booleans as parameters. Is this case the same - should I not do it here?
Branch prediction is a property of the CPU, not of the compiler. All modern CPUs have it; don't worry.
Most compilers can pull a constant condition out of a loop, so don't worry there either. Java does it, too. Details: the javac compiler does not do it; its job is to transform source code to byte code and nothing else. But at runtime, the time-critical parts of the byte code get compiled to machine code, and many optimizations happen there.
Forgot to mention: I am using Java, so I cannot assign functions to variables or use callbacks.
You sort of can: it's done with anonymous classes, and since Java 8 there are lambdas and method references. Surely worth learning, though not strictly necessary for your case.
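For illustration, a callback version of the loop might look like this (a sketch: it requires Java 8, and the Map parameter type and the reports collection are my assumptions, not from the question):

import java.util.Map;
import java.util.function.Supplier;

// Pick the parameter source once, outside the loop.
Supplier<Map<String, Object>> parameterSource =
        rp1 ? this::getgenParameters : this::getParameters;

for (Report report : reports) {           // 'reports' stands in for the real loop
    createPrints(parameterSource.get());  // the choice is deferred to the callback
    // do general for-loop stuff
}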
I'd simply go for
for loop {
    createPrints(rp1 ? getgenParameters() : getParameters());
    // do general for-loop stuff
}
as it's the shortest and cleanest way to do it.
There are tons of alternatives, like defining Parameters getSomeParameters(boolean rp1), which you can create easily using "extract method" on the ternary above.
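For instance, the extracted method could be as small as this (a sketch; Parameters stands for whatever type getParameters already returns):

Parameters getSomeParameters(boolean rp1) {
    return rp1 ? getgenParameters() : getParameters();
}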
From a performance perspective, it makes sense to have the conditional outside the loop so that it's not redundantly checked at each iteration, but it doesn't look clean, and my boss really cares about how clean code looks
From a performance perspective, it all doesn't matter. The compiler is damn smart and knows tons of optimizations. Just write clean code with short methods, so it can do its job properly.
he didn't like me using an if/else block instead of doing it with a ternary operator since the ternary only uses one line
A simple ternary one-liner makes the code much easier to understand, so you should go for it (complicated ternaries may be hard to grasp, but that's not the case here).
(I think performance is still the same, no?)
Definitely. Note that:
- most program parts are irrelevant for performance (the Pareto principle)
- low-level optimizations rarely matter (the compiler knows them better than you do)
- clean code using proper data structures matters
Most "debates" about whether one way of writing some chunk of code is more efficient than another are missing the point.
The JIT compiler performs a lot of optimizations on your code behind the scenes. It (or more precisely the people who wrote it) knows a lot more about how to optimize than you do, with the advantage of extensive benchmarking and (machine-level) code walk-throughs to figure out the best way to optimize.
The compiler can also optimize more reliably than a human can. And its developers are constantly improving it.
And the compiler does not need to be paid a salary to optimize code: every hour you spend on optimizing is time that you could be spending on something else.
So the best strategy for you when optimizing is as follows:
First get the code implemented. Time spent on optimizing while coding is probably wasted. That is classic "premature optimization" behavior. Your intuition is probably not good enough to make the right call at this stage. (Few people are genuinely smart enough ...)
Next get the code working and passing its tests. Optimizing buggy code is missing the point.
Set some realistic, quantifiable goals for performance. ("As fast as possible" is neither realistic nor quantifiable.) Your code just needs to be fast enough to meet the business requirements. If it is already fast enough, then you are wasting developer time by optimizing it further.
Create a realistic benchmark so that you can measure your application's performance. If you can't (or don't) measure, you won't know if you have met your goals, or if a particular attempted optimization has improved things.
Use profiling to determine which parts of your code are worth the effort optimizing. Profiling tells you the hotspots where most of the time is spent. Focus there ... not on stuff that is only executed occasionally. This is where the 80-20 rule comes in.
I would talk to my boss about this. If the boss is not on board with the idea of being methodical and scientific about performance and optimization, then don't fight it; just do what the boss says. (But be willing to push back if you are blamed for missed deadlines caused by time wasted on unnecessary optimization.)
There are two kinds of efficiency in Software Engineering:
The efficiency of the program
The efficiency of the programmer
These two efficiencies are in conflict.
How about this? (A sketch using a Java 8 Supplier; pre-8, an anonymous class would do the same job. The Map parameter type is an assumption based on the question.)

void theMethod(Supplier<Map<String, Object>> parametersCallback) {
    for loop {
        createPrints(parametersCallback.get());
        // do general for-loop stuff
    }
}

. . . . .

if (rp1) {
    theMethod(this::getgenParameters);
} else {
    theMethod(this::getParameters);
}
I currently have a disagreement going on with my second-year Java professor that I'm hoping y'all can help solve:
The code we started with was this:
public T peek()
{
    if (isEmpty())
        .........
}

public boolean isEmpty()
{
    return topIndex < 0;
}
And she wants us to remove the isEmpty() reference and place its code directly into the if statement (i.e. change the peek method contents to if (topIndex < 0).......) to "make the code more efficient". I have argued that a) the runtime/compile-time optimizer would most likely inline the isEmpty() call, b) even if it didn't, the 5-10 machine operations would be negligible in nearly every situation, and c) it's just bad style because it makes the program less readable and less changeable.
So, I guess my question is:
Is there any runtime efficiency gained by inlining logic as opposed to just calling a method?
I have tried simple profiling techniques (a long loop and a stopwatch), but the tests have been inconclusive.
EDIT:
Thank you everyone for the responses! I appreciate you all taking the time. Also, I appreciate those of you who commented on the pragmatism of arguing with my professor, especially doing so without data. @Mike Dunlavey, I appreciate your insight as a former professor and your advice on the appropriate coding sequence. @ya_pulser, I especially appreciate the profiling advice and links you took the time to share.
You are correct in your assumptions about Java code behaviour, but it is impolite to argue with your professor without data :). Arguing without data is pointless; prove your assumptions with measurements and graphs.
You can use JMH ( http://openjdk.java.net/projects/code-tools/jmh/ ) to create a small benchmark and measure the difference between:
inlined by hand (remove the isEmpty method and place its code at the call site)
inlined by the Java JIT compiler (HotSpot, after 100k (?) invocations - see the JIT PrintCompilation output)
HotSpot inlining disabled entirely
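A minimal JMH sketch of such a benchmark (the class and its topIndex field are invented to mirror the question):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class PeekBenchmark {
    private int topIndex = 42; // stand-in for the stack's state

    @Benchmark
    public boolean callingIsEmpty() {
        return isEmpty(); // the JIT is free to inline this call
    }

    @Benchmark
    public boolean inlinedByHand() {
        return topIndex < 0; // the comparison pasted in directly
    }

    private boolean isEmpty() {
        return topIndex < 0;
    }
}

Run it as usual, then again with JIT inlining disabled (for example with -XX:-Inline) and compare the scores.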
Please read http://www.oracle.com/technetwork/java/whitepaper-135217.html#method
Useful parameters could be:
-Djava.compiler=NONE
-XX:+PrintCompilation
Plus each JDK version has its own set of parameters to control the JIT.
If you create a set of graphs from the results of your research and politely present them to the professor, I think it will benefit you in the future.
I think that https://stackoverflow.com/users/2613885/aleksey-shipilev can help with jmh related questions.
BTW: I had great success when I inlined plenty of methods into a single huge loop to achieve maximum speed for a neural network backpropagation routine, because Java is (was?) too reluctant to inline methods within methods within methods. It was unmaintainable and fast :(.
Sad...
I agree with your intuitions about it, especially "the 5-10 machine operations would be negligible in nearly every situation".
I was a C.S. professor a long time ago.
On one hand, professors need all the slack you can give them.
Teaching is very demanding. You can't have a bad day.
If you show up for a class and you're not fully prepared, you're in for a rough ride.
If you give a test on Friday and don't have the grades on Monday the students will say "But you had all weekend!"
You can get satisfaction from seeing your students learn, but you yourself don't learn much, except how to teach.
On the other hand, few professors have much practical experience with real software.
So their opinions tend to be founded on various dogmatic certitudes rather than solid pragmatism.
Performance is a perfect example of this.
They tend to say "Don't do X. Do Y because it performs better." which completely misses the point about performance problems - you have to deal in fractions, not absolutes. Everything depends on what else is going on.
The way to approach performance is, as someone said "First make it right. Then make it fast."
And the way you make it fast is not by eyeballing the code (and wondering "should I do this, or should I do that"), but by running it and letting it tell you how it's spending time.
The basic idea of profiling is how you do this.
Now there is such a thing as bad profiling and good profiling, as explained in the second answer here (and usually when professors do teach profiling, they teach the bad kind), but that's the way to go.
As you say, the difference will be small, and in most circumstances readability should be the higher priority. In this case, though, since the extra method consists of a single line, I'm not sure it adds any real readability benefit unless you're calling the same method from elsewhere.
That said, remember that your lecturer's goal is to help you learn computer science, and that is a different priority than writing production code. In particular, she won't want you leaving optimization to automated tools, since that doesn't help your learning.
Also, just a practical note - in school and in professional development, we all have to adhere to coding standards we personally disagree with. It's an important skill, and really is necessary for team working, even if it does chafe.
Calling isEmpty is idiomatic and nicely readable. Manually inlining it would be a micro-optimization, something best done in performance-critical situations, and only after a bottleneck has been confirmed by benchmarking in the intended production environment.

Is there a real performance benefit in manually inlining? Theoretically yes, and maybe that's what the lecturer wanted to emphasize. In practice, I don't think you'll find an absolute answer: the automatic inlining behavior may be implementation-dependent, and benchmark results will depend on the JVM implementation, version, and platform.

For that reason, this kind of optimization can be useful in rare extreme situations, but in general it is detrimental to portability and maintainability. By the same logic, should we inline all methods, eliminating all indirection at the expense of duplicating large blocks of code? Definitely not. Where exactly you draw the line between decomposition and inlining may also depend on personal taste, to some degree.
Another form of data to look at could be the code that is generated. See the -XX:+PrintAssembly option and friends. See How to see JIT-compiled code in JVM? for more information.
I'm confident that in this particular case the Hotspot JVM will inline the call to isEmpty and there would be no performance difference.
I am writing some code to generate call graphs for a particular intermediate representation by statically scanning the IR code, without executing it. The IR itself is not too complex, and I have a good understanding of what function call sequences look like, so all I need to do is trace the calls. I am currently doing it the obvious way:
Keep track of where we are
If we encounter a function call, branch to that location, execute and come back
While branching put an edge between the caller and the callee
I am satisfied with where I am getting, but I want to make sure that I am not reinventing the wheel here and that I won't run into corner cases. Are there any accepted good algorithms (and/or design patterns) that do this efficiently?
UPDATE:
The IR code is a byte-code disassembly from a home-brewed Java-like language and looks like the Jasmine specification.
From an academic perspective, here are some considerations:
Do you care about being conservative / correct? For example, suppose the code you're analyzing contains a call through a function pointer. If you're just generating documentation, then it's not necessary to deal with this. If you're doing a code optimization that might go wrong, you will need to assume that 'call through pointer' means 'could be anything.'
Beware of exceptional execution paths. Your IR may or may not abstract this away from you, but keep in mind that many operations can throw both language-level exceptions as well as hardware interrupts. Again, it depends on what you want to do with the call graph later.
Consider how you'll deal with cycles (e.g. recursion, mutual recursion). This may affect how you write code for traversing the graphs later on: it will need some sort of 'visited' set to avoid traversing cycles forever, as in the sketch below.
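A minimal sketch of such a traversal with a visited set (the Node type is hypothetical):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Node {
    final String name;
    final List<Node> callees = new ArrayList<>();
    Node(String name) { this.name = name; }
}

class CallGraphWalker {
    static void traverse(Node root) {
        Set<Node> visited = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node current = stack.pop();
            if (!visited.add(current)) {
                continue; // already seen: this edge closes a cycle
            }
            // ... process 'current' here ...
            for (Node callee : current.callees) {
                stack.push(callee);
            }
        }
    }
}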
Cheers.
Update March 6:
Based on extra information added to the original post:
Be careful about virtual method invocations. Keep in mind that, in general, it is unknowable which method will execute. You may have to assume that the call will go to any of the subclasses of a particular class. The standard example goes a bit like this: suppose you have an ArrayList<A>, and you have class B extends A. Based on a random number generator, you will add instances of A and B to the list. Now you call x.foo() for all x in the list, where foo() is a virtual method in A with an override in B. So, by just looking at the source code, there is no way of knowing whether the loop calls A.foo, B.foo, or both at run time.
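Here is that standard example as a runnable sketch (class names as in the description above):

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class A { void foo() { System.out.println("A.foo"); } }
class B extends A { @Override void foo() { System.out.println("B.foo"); } }

public class DispatchDemo {
    public static void main(String[] args) {
        List<A> list = new ArrayList<>();
        Random rng = new Random();
        for (int i = 0; i < 5; i++) {
            list.add(rng.nextBoolean() ? new A() : new B());
        }
        for (A x : list) {
            x.foo(); // statically this is just a call to foo(); the receiver's
                     // actual class decides at run time which override executes
        }
    }
}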
I don't know the algorithm, but pycallgraph does a decent job. It is worth checking out the source for it. It is not long and should be good for checking out existing design patterns.
When I receive code I have not seen before in order to refactor it into some sane state, I normally fix "cosmetic" things (like converting StringTokenizers to String#split(), replacing pre-1.2 collections with newer collections, making fields final, converting C-style arrays to Java-style arrays, ...) while reading the source code I have to get familiar with.
Do many people use this strategy (maybe it is some kind of "best practice" I don't know about), or is this considered too dangerous, with not touching old code unless absolutely necessary generally preferred? Or is it more common to combine the "cosmetic cleanup" step with the more invasive "general refactoring" step?
What are the common "low-hanging fruits" when doing "cosmetic clean-up" (vs. refactoring with more invasive changes)?
In my opinion, "cosmetic cleanup" is "general refactoring." You're just changing the code to make it more understandable without changing its behavior.
I always refactor by attacking the minor changes first. The more readable you can make the code quickly, the easier it will be to do the structural changes later - especially since it helps you look for repeated code, etc.
I typically start by looking at code that is used frequently and will need to be changed often, first. (This has the biggest impact in the least time...) Variable naming is probably the easiest and safest "low hanging fruit" to attack first, followed by framework updates (collection changes, updated methods, etc). Once those are done, breaking up large methods is usually my next step, followed by other typical refactorings.
There is no right or wrong answer here, as this depends largely on circumstances.
If the code is live, working, undocumented, and contains no testing infrastructure, then I wouldn't touch it. If someone comes back in the future and wants new features, I will try to work them into the existing code while changing as little as possible.
If the code is buggy, problematic, missing features, and was written by a programmer that no longer works with the company, then I would probably redesign and rewrite the whole thing. I could always still reference that programmer's code for a specific solution to a specific problem, but it would help me reorganize everything in my mind and in source. In this situation, the whole thing is probably poorly designed and it could use a complete re-think.
For everything in between, I would take the approach you outlined. I would start by cleaning up everything cosmetically so that I can see what's going on. Then I'd start working on whatever code stood out as needing the most work. I would add documentation as I come to understand how it works, to help me remember what's going on.
Ultimately, remember that if you're going to be maintaining the code now, it should be up to your standards. Where it's not, you should take the time to bring it up to your standards - whatever that takes. This will save you a lot of time, effort, and frustration down the road.
The lowest-hanging cosmetic fruit is (in Eclipse, anyway) shift-control-F. Automatic formatting is your friend.
The first thing I do is try to hide as much as possible from the outside world. If the code is crappy, most of the time the person who implemented it did not know much about data hiding and the like.
So my advice, first thing to do:
Turn as many members and methods private as you can without breaking the compilation.
As a second step, I try to identify the interfaces. I replace the concrete classes with the interfaces in all methods of related classes. This way you decouple the classes a bit.
Further refactoring can then be done more safely and locally.
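A sketch of what that step looks like (all types invented for illustration):

// Before: the method is welded to one concrete class.
class ReportService {
    void export(FileStore store) {
        store.save("report", "...");
    }
}

// After: an interface is extracted and used in the signature instead.
interface Store {
    void save(String key, String value);
}

class FileStore implements Store {
    @Override
    public void save(String key, String value) {
        // ... write to disk ...
    }
}

class ReportServiceRefactored {
    // Any Store implementation may now be substituted (e.g. an in-memory
    // fake in tests), which makes further refactoring safer and more local.
    void export(Store store) {
        store.save("report", "...");
    }
}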
You can buy a copy of Refactoring: Improving the Design of Existing Code by Martin Fowler; you'll find a lot of things you can do during your refactoring operation.
Plus you can use the tools provided by your IDE and other code analyzers such as FindBugs or PMD to detect problems in your code.
Resources :
www.refactoring.com
wikipedia - List of tools for static code analysis in java
On the same topic :
How do you refactor a large messy codebase?
Code analyzers: PMD & FindBugs
By starting with "cosmetic cleanup" you get a good overview of how messy the code is, and this, combined with better readability, is a good beginning.
I always (yeah, right... sometimes there's something called a deadline that messes with me) start with this approach, and it has served me very well so far.
You're on the right track. By doing the small fixes you'll be more familiar with the code and the bigger fixes will be easier to do with all the detritus out of the way.
Run a tool like JDepend, Checkstyle or PMD on the source. They can automatically find loads of changes that are cosmetic but based on general refactoring rules.
I do not change old code except to reformat it using the IDE. There is too much risk of introducing a bug - or removing a bug that other code now depends upon! Or introducing a dependency that didn't exist before, such as using the heap instead of the stack.
Beyond the IDE reformat, I don't change code that the boss hasn't asked me to change. If something is egregious, I ask the boss if I can make changes and state a case of why this is good for the company.
If the boss asks me to fix a bug in the code, I make as few changes as possible. Say the bug is in a simple for loop. I'd refactor the loop into a new method. Then I'd write a test case for that method to demonstrate I have located the bug. Then I'd fix the new method. Then I'd make sure the test cases pass.
Yeah, I'm a contractor. Contracting gives you a different point of view. I recommend it.
There is one thing you should be aware of: the code you are starting with has been TESTED and approved, and your changes automatically mean that retesting must happen, as you may have inadvertently broken some behaviour elsewhere.
Besides, everybody makes errors. Every non-trivial change you make (changing StringTokenizer to split is not an automatic feature in e.g. Eclipse, so you write it yourself) is an opportunity for errors to creep in. Did you get the exact behaviour of a conditional right, or did you by mistake forget a !?
Hence, your changes imply retesting. That work may be quite substantial and severely outweigh the small changes you have made.
I don't normally bother going through old code looking for problems. However, if I'm reading it, as you appear to be doing, and it makes my brain glitch, I fix it.
Common low-hanging fruits for me tend to be more about renaming classes, methods, fields etc., and writing examples of behaviour (a.k.a. unit tests) when I can't be sure of what a class is doing by inspection - generally making the code more readable as I read it. None of these are what I'd call "invasive" but they're more than just cosmetic.
From experience it depends on two things: time and risk.
If you have plenty of time then you can do a lot more, if not then the scope of whatever changes you make is reduced accordingly. As much as I hate doing it I have had to create some horrible shameful hacks because I simply didn't have enough time to do it right...
If the code you are working on has lots of dependencies or is critical to the application then make as few changes as possible - you never know what your fix might break... :)
It sounds like you have a solid idea of what things should look like so I am not going to say what specific changes to make in what order 'cause that will vary from person to person. Just make small localized changes first, test, expand the scope of your changes, test. Expand. Test. Expand. Test. Until you either run out of time or there is no more room for improvement!
BTW, when testing you are likely to see where things break most often - create test cases for them (JUnit or whatever).
EXCEPTION:
Two things that I always find myself doing are reformatting (Ctrl+Shift+F in Eclipse) and commenting code that is not obvious. After that I just hammer the most obvious nail first...
While poking around the questions, I recently discovered the assert keyword in Java. At first, I was excited. Something useful I didn't already know! A more efficient way for me to check the validity of input parameters! Yay learning!
But then I took a closer look, and my enthusiasm was not so much "tempered" as "snuffed out completely" by one simple fact: you can turn assertions off.*
This sounds like a nightmare. If I'm asserting that I don't want the code to keep going if the input listOfStuff is null, why on earth would I want that assertion ignored? It sounds like if I'm debugging a piece of production code and suspect that listOfStuff may have been erroneously passed a null but don't see any logfile evidence of that assertion being triggered, I can't trust that listOfStuff actually got sent a valid value; I also have to account for the possibility that assertions may have been turned off entirely.
And this assumes that I'm the one debugging the code. Somebody unfamiliar with assertions might see that and assume (quite reasonably) that if the assertion message doesn't appear in the log, listOfStuff couldn't be the problem. If your first encounter with assert was in the wild, would it even occur to you that it could be turned-off entirely? It's not like there's a command-line option that lets you disable try/catch blocks, after all.
All of which brings me to my question (and this is a question, not an excuse for a rant! I promise!):
What am I missing?
Is there some nuance that renders Java's implementation of assert far more useful than I'm giving it credit for? Is the ability to enable/disable it from the command line actually incredibly valuable in some contexts? Am I misconceptualizing it somehow when I envision using it in production code in lieu of statements like if (listOfStuff == null) barf();?
I just feel like there's something important here that I'm not getting.
*Okay, technically speaking, they're actually off by default; you have to go out of your way to turn them on. But still, you can knock them out entirely.
Edit: Enlightenment requested, enlightenment received.
The notion that assert is first and foremost a debugging tool goes a long, long way towards making it make sense to me.
I still take issue with the notion that input checks for non-trivial private methods should be disabled in a production environment because the developer thinks the bad inputs are impossible. In my experience, mature production code is a mad, sprawling thing, developed over the course of years by people with varying degrees of skill targeted to rapidly changing requirements of varying degrees of sanity. And even if the bad input really is impossible, a piece of sloppy maintenance coding six months from now can change that. The link gustafc provided (thanks!) includes this as an example:
assert interval > 0 && interval <= 1000/MAX_REFRESH_RATE : interval;
Disabling such a simple check in production strikes me as foolishly optimistic. However, this is a difference in coding philosophy, not a broken feature.
In addition, I can definitely see the value of something like this:
assert reallyExpensiveSanityCheck(someObject) : someObject;
My thanks to everybody who took the time to help me understand this feature; it is very much appreciated.
assert is a useful piece of Design by Contract. In that context, assertions can be used in:
Precondition checks.
Postcondition checks.
Intermediate result checks.
Class invariant checks.
Assertions can be expensive to evaluate (take, for example, the class invariant, which must hold before and after calling any public method of your class). Assertions are typically wanted only in debug builds and for testing purposes; you assert things that can't happen - things which are synonymous with having a bug. Assertions verify your code against its own semantics.
Assertions are not an input validation mechanism. When input could really be correct or wrong in the production environment, i.e. for input-output layers, use other methods, such as exceptions or good old conditional checks.
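A minimal sketch of that distinction (the class and its logic are invented for illustration):

public final class Account {
    private long balanceCents;

    public void withdraw(long amountCents) {
        // Input validation: this can legitimately fail in production,
        // so it is a real check, always on.
        if (amountCents <= 0 || amountCents > balanceCents) {
            throw new IllegalArgumentException("bad amount: " + amountCents);
        }
        balanceCents -= amountCents;
        // Invariant/postcondition: if this ever fires, the class itself is
        // buggy. It may be disabled in production (and is off by default).
        assert balanceCents >= 0 : balanceCents;
    }
}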
Java's assertions aren't really made for argument validation - it's specifically stated that assertions are not to be used instead of dear old IllegalArgumentException (and neither is that how they are used in C-ish languages). They are more there for internal validation, to let you make an assumption about the code which isn't obvious from looking at it.
As for turning them off, you do that in C(++) too, except that if someone has an assert-less build, they have no way to turn assertions back on. In Java, you just restart the app with the appropriate VM parameters.
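For reference, the standard launcher switches (the main class name here is made up):

java -ea com.example.Main    // run with assertions enabled
java -da com.example.Main    // run with assertions disabled (the default)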
Every language I've ever seen with assertions comes with the capability of shutting them off. When you write an assertion you should be thinking "this is silly, there's no way in the universe this could ever be false" - if you think it could be false, it should be an error check. The assertion is just to help you during development if something goes horribly wrong; when you build the code for production, you disable them to save time and avoid (hopefully) superfluous checks.
Assertions are meant to ensure things you are sure that your code fulfills really are fulfilled. It's an aid in debugging, in the development phase of the product, and is usually omitted when the code is released.
What am I missing?
You're not using assertions the way they were meant to be used. You said "check the validity of input parameters" - that's precisely the sort of things you do not want to verify with assertions.
The idea is that if an assertion fails, you 100% have a bug in your code. Assertions are often used for identifying the bug earlier than it would have surfaced otherwise.
I think it's about the way assert usage is interpreted and envisioned.
If you really want the check in your actual production code, why not use an if directly, or any other conditional statement?
Those are already present in the language; the idea of assert is that developers add assertions only where they don't really expect the condition ever to happen.
E.g. checking an object for null: say a developer wrote a private method and called it from two places in the class (this is not an ideal example, but it may work for private methods) where he knows he passes a non-null object. Instead of adding an unnecessary if check - since, as of today, there is no way the object could be null - he adds an assertion.
But if someone calls this method with a null argument tomorrow, it will be caught in the developer's unit testing thanks to the assertion, and the final code still doesn't need an if check.
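A sketch of that scenario (all names hypothetical):

class ReportBuilder {
    void buildSummary(Data data) { render(data); }   // never passes null
    void buildDetail(Data data)  { render(data); }   // never passes null

    // Private helper: both current callers guarantee a non-null argument,
    // so an assert documents the assumption without a production-time check.
    private void render(Data data) {
        assert data != null : "render() called with null data";
        // ... rendering logic ...
    }
}

class Data { }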
Assertions are really a great and concise documentation tool for a code maintainer.
For example I can write:
foo should be non-null and greater than 0
or put this into the body of the program:
assert foo != null;
assert foo.value > 0;
They are extremely valuable for documenting private/package private methods to express original programmer invariants.
As an added bonus, when the subsystem starts to behave flaky, you can turn asserts on and get extra validation instantly.
This sounds about right. Assertions are just a tool that is useful for debugging code - they should not be turned on all the time, especially in production code.
For example, in C or C++, assertions are disabled in release builds.
If asserts could not be turned off, then why would they even exist?
If you want to perform a validity check on an input, you can easily write

if (foobar <= 0) throw new BadFoobarException();

or pop up a message box or whatever is useful in context.
The whole point of asserts is that they can be turned on for debugging and off for production.
Assertions aren't for the end user to see. They're for the programmer, so you can make sure the code is doing the right thing while it's being developed. Once the testing's done, assertions are usually turned off for performance reasons.
If you're anticipating that something bad is going to happen in production, like listOfStuff being null, then either your code isn't tested enough, or you're not sanitizing your input before you let your code have at it. Either way, an "if (bad stuff) { throw an exception }" would be better. Assertions are for test/development time, not for production.
Use an assert if you're willing to pay $1 to your end-user whenever the assertion fails.
An assertion failure should be an indication of a design error in the program.
An assertion states that I have engineered the program in such a way that I know and guarantee that the specified predicate always holds.
An assertion is useful to readers of my code, since they see that (1) I'm willing to bet some money on that property; and (2) in previous executions and test cases the property did indeed hold.
My bet assumes that the client of my code sticks to the rules, and adheres to the contract he and I agreed upon. This contract can be tolerant (all input values allowed and checked for validity) or demanding (client and I agreed that he'll never supply certain input values [described as preconditions], and that he doesn't want me to check for these values over and over again).
If the client sticks to the rules, and my assertions nevertheless fail, the client is entitled to some compensation.
Assertions are there to indicate a potentially recoverable problem in the code, or to act as an aid in debugging. You should use a more destructive mechanism, such as stopping the program, for more serious errors.
They can also be used to catch an unrecoverable error before the application fails later in debugging and testing scenarios to help you narrow down a problem. Part of the reason for this is so that integrity checking does not reduce the performance of well-tested code in production.
Also, in certain cases, such as a resource leak, the situation may not be desirable, but the consequences of stopping the program are worse than the consequences of continuing on.
This doesn't directly answer your question about assert, but I'd recommend checking out the Preconditions class in guava/google-collections. It allows you to write nice stuff like this (using static imports):
import static com.google.common.base.Preconditions.checkNotNull;

// throws NPE if listOfStuff is null
this.listOfStuff = checkNotNull(listOfStuff);

// same, but the NPE will have "listOfStuff" as its message
this.listOfStuff = checkNotNull(listOfStuff, "listOfStuff");
It seems like something like this might be what you want (and it can't be turned off).