Java counter-intuitive code - java

I remember reading a book named:
Java Puzzlers Traps, Pitfalls, and Corner Cases
that described odd behavior in Java code: stuff that looks completely innocent but actually does something completely different from the obvious. One example was:
(EDIT: This post is NOT a discussion of this particular example. This was the first example in the book I mentioned. I am asking for other oddities you might have encountered.)
Can this code be used to identify if a number is odd or not?
public static boolean isOdd(int i) {
return i % 2 == 1;
}
And the answer is of course NO. If you plug a negative number into it you get the wrong answer when the number is odd. The correct answer was:
public static boolean isOdd(int i) {
return i % 2 != 0;
}
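To see why the first version fails: Java's % takes the sign of the dividend, so -3 % 2 is -1 rather than 1. A minimal sketch of both versions side by side (the class name OddCheck and the method name isOddBroken are made up for illustration):

```java
class OddCheck {
    // Broken: for negative odd numbers i % 2 is -1, never 1.
    static boolean isOddBroken(int i) {
        return i % 2 == 1;
    }

    // Correct: any nonzero remainder means the number is odd.
    static boolean isOdd(int i) {
        return i % 2 != 0;
    }

    public static void main(String[] args) {
        System.out.println(-3 % 2);          // -1
        System.out.println(isOddBroken(-3)); // false
        System.out.println(isOdd(-3));       // true
    }
}
```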
Now my question is: what is the weirdest, most counterintuitive piece of Java code you have come across? (I know it's not really a question; maybe I should post this as a community wiki. Please advise on that as well.)

One I blogged about recently, given the following two classes:
public class Base
{
Base() {
preProcess();
}
void preProcess() {}
}
public class Derived extends Base
{
public String whenAmISet = "set when declared";
@Override void preProcess()
{
whenAmISet = "set in preProcess()";
}
}
what do you think the value of whenAmISet will be when a new Derived object is created?
Given the following simple main class:
public class Main
{
public static void main(String[] args)
{
Derived d = new Derived();
System.out.println( d.whenAmISet );
}
}
most people said it looks like the output should be "set in preProcess()" because the Base constructor calls that method, but it isn't. The Derived class members are initialized after the call to the preProcess() method in the Base constructor, which overwrites the value set in preProcess().
The Creation of New Class Instances section of the JLS gives a very detailed explanation of the sequence of events that takes place when objects are created.

The most counterintuitive concept I came across is the PECS (Producer Extends, Consumer Super) from Josh Bloch. The concept is excellent, but what do you consider the consumer/producer in a situation - the method itself I would think at first. But no, the parameter collection is the P/C in this concept:
public <T> void consumeTs(Collection<? extends T> producer);
public <T> void produceTs(Collection<? super T> consumer);
Very confusing sometimes.
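A small sketch may make the naming less confusing. Assuming a hypothetical copy helper (PecsDemo and copy are illustrative names, not from Bloch's book), the collection we read from is the producer and the collection we write into is the consumer:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class PecsDemo {
    // src produces Ts for us to read (extends); dst consumes the Ts we write (super).
    static <T> void copy(Collection<? extends T> src, Collection<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        copy(ints, numbers); // compiles because Integer is a subtype of Number
        System.out.println(numbers); // [1, 2, 3]
    }
}
```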

In fact, Java Puzzlers has 94 more puzzles which exhibit sometimes strange and sometimes deceiving behaviors by mostly innocent-looking code.

We once stumbled upon something like this in a legacy code base (originally camouflaged by a 5 level inheritance structure and several indirections):
public abstract class A {
public A() {
create();
}
protected abstract void create();
}
public class B extends A {
private Object bMember=null;
protected void create() {
bMember=getNewObject();
}
}
When you call B's constructor, it calls A's default constructor, which calls B's create() method, through which bMember gets initialized.
Or so we naively thought. After super() returns, the next step in the initialization process is to run B's explicit field initializers, which in effect resets bMember to null.
The program was actually working fine anyway, because bMember got assigned again through another route later on.
At some point we removed the apparently useless null initializer for bMember, and suddenly the program's behavior changed.
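The difference between the two variants can be reproduced in a few lines. This is only a sketch with made-up class names (Parent, WithInitializer, WithoutInitializer), not the legacy code itself:

```java
abstract class Parent {
    Parent() {
        create(); // runs before the subclass's field initializers
    }

    protected abstract void create();
}

class WithInitializer extends Parent {
    Object member = null; // executed after super() returns, wiping the value

    @Override
    protected void create() {
        member = "created";
    }
}

class WithoutInitializer extends Parent {
    Object member; // no initializer, so nothing overwrites the value

    @Override
    protected void create() {
        member = "created";
    }
}

class InitOrderDemo {
    public static void main(String[] args) {
        System.out.println(new WithInitializer().member);    // null
        System.out.println(new WithoutInitializer().member); // created
    }
}
```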

I recently discovered that Math.abs(i) does not always produce a non-negative number.
Math.abs(Integer.MIN_VALUE) yields -2^31.
Why? Because there is one fewer positive int than negative. 2^31 is one more than Integer.MAX_VALUE and thus overflows to -2^31.
I guess the situation is similar in most other languages, but I encountered this in Java.
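A quick way to see this, plus one possible workaround: widening to long before taking the absolute value leaves room for 2^31. The class and method names (AbsDemo, safeAbs) are just for illustration:

```java
class AbsDemo {
    // Widening to long first leaves room for 2^31, so the result is never negative.
    static long safeAbs(int i) {
        return Math.abs((long) i);
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(Integer.MIN_VALUE)); // -2147483648
        System.out.println(safeAbs(Integer.MIN_VALUE));  // 2147483648
    }
}
```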

I tend to agree with you, but I have read (and hopefully someone can supply a link or two to) articles that explain that Java's '%' is consistent with its '/', and that is enough for anyone. From my own experience:
Java's '%' operator is a little different from some other languages' in its handling of negative inputs. I personally prefer "modulo" operators that return non-negatives, e.g.
-5 % 2 == 1
Which would make your example work. I think there is an official name for this operation, but I can't think of it now so I'll stick with "modulo". The difference between the two forms is that the Java variant of 'a % b' computes 'a/b' rounded towards zero (and subtracts that quotient times 'b' from 'a'), while the preferred operation rounds the quotient down instead.
Every practical use of applying % to negative 'a' and positive 'b' that I've seen works more easily if the result 'r' is '0 <= r < b' (one example is finding the offset from the left-most edge of a tile, when mapping points onto tiles on a plane that may extend '< 0'). The one exception to this experience was in a university assignment that performed static analysis of integer arithmetic in Java programs. It was during this assignment that the subtleties of Java's '%' came to light, and I went out of my way to replace it with the "fixed" version. This all backfired, because the point was to simulate how Java does arithmetic, not to implement my own preferred kind.
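For what it's worth, Java 8 added Math.floorMod, which gives the rounds-down behavior described above. A minimal sketch of the tile example (ModDemo and offsetInTile are hypothetical names, and a tile size of 16 is an arbitrary assumption):

```java
class ModDemo {
    // Offset from the left edge of a tile; stays in [0, tileSize) even for negative x.
    static int offsetInTile(int x, int tileSize) {
        return Math.floorMod(x, tileSize);
    }

    public static void main(String[] args) {
        System.out.println(-5 % 2);               // -1
        System.out.println(Math.floorMod(-5, 2)); // 1
        System.out.println(offsetInTile(-3, 16)); // 13
    }
}
```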

Have a look at the methods defined in java.util.concurrent.SynchronousQueue:
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/SynchronousQueue.html
Half of the methods always return null / true / false / zero:
Not exactly obvious when you first start working with it, without reading the docs first.
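A short sketch of what that looks like in practice, using only methods documented on SynchronousQueue (the demo class name is made up):

```java
import java.util.concurrent.SynchronousQueue;

class SynchronousQueueDemo {
    public static void main(String[] args) {
        SynchronousQueue<String> queue = new SynchronousQueue<>();
        // No consumer is waiting, so the non-blocking offer is rejected.
        System.out.println(queue.offer("x"));    // false
        System.out.println(queue.peek());        // null
        System.out.println(queue.size());        // 0
        System.out.println(queue.isEmpty());     // true
        System.out.println(queue.contains("x")); // false
    }
}
```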

Odd and even numbers are usually thought of as positive numbers, so I don't think it's useful to think of negative numbers as odd or even. The test is often really to see if the lowest bit is set.
These are both fine for negative numbers as well.
if ((i & 1) == 0) // lowest bit not set.
if ((i & 1) == 1) // lowest bit set.
or
if ((i & 1) != 0) // lowest bit set.

Related

Correct syntax for making a functional interface instance automatically call its method

I've been watching Douglas Schmidt's classes on Parallel Java. He introduces a discussion of lambda versus method reference syntax, highlighting how the latter is preferable, as it makes clearer what the code is actually doing, not what the programmer is trying to do with the code, even more than the forEach approach.
String[] names = {"first", "Second", "third"};
Arrays.sort(names, (n1,n2) -> n1.compareToIgnoreCase(n2));
Arrays.sort(names, String::compareToIgnoreCase); //preferable
For example, that approach reduces the chances of the programmer making mistakes inside the lambda: passing the wrong argument, inverting the argument order, adding side effects, etc.
Then he introduces functional interfaces (interfaces that contain only one abstract method), writing his own method runTest that takes a Function and passing it a factorial() method:
private static <T> void runTest(Function<T,T> factorial, T n) {
System.out.println(n+ " factorial = " + factorial.apply(n));
}
private static class ParallelStreamFactorial{
static BigInteger factorial(BigInteger n) {
return LongStream
.rangeClosed(1, n.longValue())
.parallel()
.mapToObj(BigInteger::valueOf)
.reduce(BigInteger.ONE, BigInteger::multiply);
}
}
Calling it with the following syntax:
import java.math.BigInteger;
import java.util.function.Function;
import java.util.stream.LongStream;
public static void main(String[] args) {
BigInteger n = BigInteger.valueOf(3);
runTest(ParallelStreamFactorial::factorial, n);
}
The code works and prints
3 factorial = 6
As I'm studying lambdas, I tried to swap the method reference syntax for lambda syntax, and managed to do so using:
public static void main(String[] args) {
BigInteger n = BigInteger.valueOf(3);
runTest((number)->ParallelStreamFactorial.factorial(number), n);
}
Which also worked.
Then he proceeds to explain built-in interfaces, such as Predicate<T>{boolean test(T t);}, and that's where I got stuck.
I managed to implement a Predicate<Integer> that tests if the integer is bigger than 0 using the three syntaxes:
Instantiating an object myPredicate from a class that implements Predicate<Integer>
Instantiating an object lambdaPredicate from a lambda
Instantiating an object methodReferencePredicate from a method reference:
import java.util.function.Function;
import java.util.function.Predicate;
public class MyPredicates {
public static void main(String[] args) {
Predicate<Integer> constructorPredicate = new myPredicate();
System.out.println(constructorPredicate.test(4));
Predicate<Integer> lambdaPredicate = (number)-> number > 0;
System.out.println(lambdaPredicate.test(4));
Predicate<Integer> methodReferencePredicate = myMethodReference::myTest;
System.out.println(methodReferencePredicate.test(4));
}
private static class myPredicate implements Predicate<Integer>{
public boolean test(Integer t) {
return t>0;
}
}
private static class myMethodReference{
public static boolean myTest(Integer t) {
return t>0;
}
}
}
And then calling their .test() methods. They're all three working and printing true.
However I would like to "instantiate and call" everything in a single line, as he did in his example. It seems like his code is inferring the type of the argument passed (I may be wrong) but it's definitely running automatically.
I tried different things:
Predicate<Integer>(myMethodReference::myTest, 4);
Predicate(myMethodReference::myTest, 4);
Predicate<Integer>((number) -> myMethodReference.myTest(number), 4);
Predicate((number) -> myMethodReference.myTest(number), 4);
But none of them work.
They throw:
Syntax error, insert ";" to complete LocalVariableDeclarationStatement
and
The method Predicate(myMethodReference::myTest, int) is undefined for the type MyPredicates
Errors. I also don't even know the name of what he's doing in that single line to properly search better on internet for references.
What's the correct syntax for that, whether by method reference or lambdas?
You've made things far too complicated.
There is no point in lambdas if you want to 'execute them immediately'.
Here is how you run your test code 'immediately':
System.out.println(4 > 0);
Why mess with lambdas? They just make matters confusing here.
The very point of a lambda is two-fold:
A way to transmit code itself to other contexts.
Control flow abstraction.
In java in particular, option 2 is an evil - it makes code ugly, harder to reason about, introduces pointless distractions, and in general should be avoided... unless you're employing it to avoid an even greater evil. That happens plenty - for example, a reasonable 'stream chain' is generally better even though it's control flow abstraction. I'd say this:
int total = list.stream()
.filter(x -> x.length() < 5)
.mapToInt(Integer::valueOf)
.sum();
is the lesser evil compared to:
int total = 0;
for (var x : list) {
if (x.length() >= 5) continue;
total += Integer.parseInt(x);
}
but it is a pretty close call.
Why is it 'evil'? Because lambdas in java are non transparent in 3 important ways, and this non-transparency is a good thing in the first case, but a bad thing in the second. Specifically, lambdas are not transparent in these ways:
Lambdas cannot change or even read local variables from outer scope unless they are (effectively) final.
Lambdas cannot throw checked exceptions even if the outer scope would handle them (because they catch them or the method you're in declared throws ThatException).
Lambdas cannot do control flow. You can't break, continue, or return from within a lambda to outside of it.
These 3 things are all useful and important things to be doing when you're dealing with basic control flow. Therefore, lambdas should be avoided as you create a bunch of problems and inflexibility by using them... unless you've avoided more complexity and inflexibility of course. It's programming: Nothing is ever easy.
The notion of bundling up code is therefore much more useful, because those non-transparencies turn into upside:
If you take the lambda code and export it to someplace that runs that code much later and in another thread, what does it even mean to modify a local variable at that point? The local variable is long gone (local vars are ordinarily declared on stack and disappear when the method that made them ends. That method has ended; your lambda survived this, and is now running in another context). Do we now start marking local vars as volatile to avoid thread issues? Oof.
The fact that the outer code deals with a checked exception is irrelevant: The lexical scope that was available when you declared the lambda is no longer there, we've long ago moved past it.
Control flow - breaking out of or restarting a loop, or returning from a method. What loop? What method? They have already ended. The code makes no sense.
See? The lambda's lack of transparency is in all ways great if your lambda is 'travelling', because the things it forbids would make no sense anyway. Hence, lambdas are best used for this; they have no downsides at that point.
Thus, let's talk about travelling lambdas: The very notion is to take code and not execute it. Instead, you hand it off to other code that does whatever it wants. It may run it 2 days from now when someone connects to your web server, using path /foobar. It may run every time someone adds a new entry to a TreeSet in order to figure out where in the tree the item should be placed (that's precisely the fate of the lambda you pass to new TreeSet<X>((a, b) -> compare-a-and-b-here)).
Even in control flow situations (which are to be avoided if possible), your lambda still travels; it just travels to a place that immediately ends up using it, but the point of the lambda remains control flow abstraction: You don't run the code in it, you hand your lambda off to something else which will then immediately run it 0 to many times. That's exactly what is happening here:
list.forEach(System.out::println);
I'm taking the code notion of System.out.println(someString), and I don't run it - no, I bundle up that idea in a lambda and then pass this notion to list's forEach method which will then invoke it for me, on every item in the list. As mentioned, this is bad code, because it needlessly uses lambdas in control flow abstraction mode which is inferior to just for (var item : list) System.out.println(item);, but it gets the point across.
It just doesn't make sense to want to write a lambda and immediately execute it. Why not just... execute it?
In your example from the course, you don't actually execute the lambda as you make it. You just.. make it, and hand it off to the runTest method, and it runs it. The clue is, runTest is a method (vs your attempts - Predicate is not a method), it's not magical or weird, just.. a method, that so happens to take a Function<A, B> as argument, and the lambda you write so happens to 'fit' - it can be interpreted as an implementation of Function<A, B>, and thus that code compiles and does what it does.
You'd have to do the same thing.
But, if that code is a single-use helper method, then there's no point to the lambda in the first place.
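If you do want the one-line call shape from the course anyway, a sketch of "doing the same thing" could look like this. The helper below is hypothetical (PredicateRunner, runTest and isPositive are names I made up); it just mirrors the Function-based runTest with a Predicate:

```java
import java.util.function.Predicate;

class PredicateRunner {
    // Hypothetical helper mirroring runTest from the question, but taking a Predicate.
    static <T> boolean runTest(Predicate<T> predicate, T value) {
        boolean result = predicate.test(value);
        System.out.println(value + " passes = " + result);
        return result;
    }

    static boolean isPositive(Integer t) {
        return t > 0;
    }

    public static void main(String[] args) {
        runTest(PredicateRunner::isPositive, 4); // prints "4 passes = true"
        runTest(number -> number > 0, -1);       // prints "-1 passes = false"
    }
}
```

Here runTest is an ordinary method, so the method reference or lambda is handed to it as an argument and only runs when runTest calls test().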

About Recursion for Newbies in Java

So, I have this code, which is just the way I solved an exercise that was given to me, which consisted of creating a recursive function that received a number, and then gave you the sum of 1, all the numbers in between, and your number. I know I made it sound confusing, but here's an example:
If I inserted the number 5, then the returned value would have to be 15, because: 1+2+3+4+5 = 15.
public class Exercise {
public static void main(String[] args) {
int returnedValue = addNumbers(6);
System.out.print(returnedValue);
}
public static int addNumbers(int value) {
if (value == 1) return value;
return value = value + addNumbers(value-1);
}
}
Technically speaking, my code works just fine, but I still don't get why Eclipse made me write two returns, that's all I would like to know.
Is there a way I could only write "return" once?
Sure, you can write it with just one return:
public static int addNumbers(int value) {
if (value > 1) {
value += addNumbers(value - 1);
}
return value;
}
As you can see, it's done by having some variable retain the running result until you get to the end. In this case I was able to do it in-place in value, in other cases you may need to create a local variable, but the idea of storing your intermediate result somewhere until you get to the return point is a general one.
There should be two returns. Your first return says
if at 1: stop recursion
and the second one says
continue recursion by returning my value plus computing the value less than me
You could combine them by using a ternary:
return value == 1 ? value : value + addNumbers(value - 1);
But it is not as readable.
Recursive functions, like the Fibonacci sequence, fractals, etc., use themselves multiple times because they contain themselves.
Feel free to correct me if I'm wrong, but I don't think there is a way to eliminate one of those returns unless you decide to put a variable outside of the method or change the method from being recursive.
In Java, a method that declares a return type must return a value on every path through it, no matter what the code inside does. The reason Eclipse requires you to add the second return is that the first return only runs if your if statement evaluates to true. If you didn't have the second return, and the if statement did not end up being true, execution would reach the end of the method with no value to return, so Eclipse requires a return statement after the if statement.
These types of errors are called compile-time errors. This means that Eclipse literally cannot convert your code into a runnable file, because it does not know how: there is a syntax error, or you are missing a return, etc.
Recursive functions always have at least 2 paths, the normal ones that will recurse and the "end" paths that just return (Usually a constant).
You could, however, do something like this:
public static int addNumbers(int value) {
if (value != 1)
value = value + addNumbers(value-1);
return value;
}
But I can't say I think it's much better (Some people get as annoyed at modifying parameters as they do at multiple returns). You could, of course, create a new variable and set it to one value or the other, but then someone would get upset because you used too many lines of code and an unnecessary variable. Welcome to programming :) Your original code is probably as good as you're likely to get.
As for why "Eclipse" did that to you, it's actually Java--Java is better than most languages at making sure you didn't do something clearly wrong as soon as possible (In this case while you are typing instead of waiting for you to compile). It detected that one branch of your if returned a value and the other did not--which is clearly wrong.
Java is also very explicit, forcing you to use a "return" statement where another language might let you get away with less. In Groovy you'd be tempted to eliminate the return and write something like:
def addNumbers(value){value + (value-1?addNumbers(value-1):0)}
just for fun but I certainly wouldn't call THAT more readable! Java just figures it's better to force you to be explicit in most cases.
From the wikipedia on recursion:
In mathematics and computer science, a class of objects or methods
exhibit recursive behavior when they can be defined by two properties:
A simple base case (or cases)—a terminating scenario that does not use recursion to produce an answer
A set of rules that reduce all other cases toward the base case
There are two returns because you have to handle the two cases above. In your example:
The base case is value == 1.
The case to reduce all other cases toward the base case is value + addNumbers(value-1);.
Source: https://en.wikipedia.org/wiki/Recursion#Formal_definitions
Of course there are other ways to write this, including some that do not require multiple returns, but in general multiple returns are a clear and normal way to express recursion because of the way recursion naturally falls into multiple cases.
Like the robots of Asimov, all recursive algorithms must obey three important laws:
A recursive algorithm must have a base case.
A recursive algorithm must change its state and move toward the base case.
A recursive algorithm must call itself, recursively.
Your if (value == 1) return value; statement is the base case. That's when the recursion (calling itself) stops. When a function call happens, the runtime pushes the current state onto the stack and then makes the call. When that call returns a value, the caller's state is restored from the stack, the calculation is made, and the result is returned to the level above. That's what the other return statement is for.
Think of this like breaking up your problem:
addNumbers(3) = 3 + addNumbers(2) (this is returned by second one)
-> 2 + addNumbers(1) (this is returned by second one)
-> 1 (this is returned by base case)
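The same breakdown can be printed by the program itself. This is a sketch, not the asker's code: the indent parameter and the SumTrace class are added for tracing, and the base case is widened to value <= 1 (my assumption) so that 0 or a negative input does not recurse forever:

```java
class SumTrace {
    // Same recursion as addNumbers, with indentation showing the call depth.
    static int addNumbers(int value, String indent) {
        System.out.println(indent + "addNumbers(" + value + ")");
        if (value <= 1) {
            return value; // base case: nothing left to add
        }
        int result = value + addNumbers(value - 1, indent + "  ");
        System.out.println(indent + "returns " + result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(addNumbers(3, "")); // the last line printed is 6
    }
}
```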

Java How to define a fundamental operation in a recursive method?

In a recursive method, we have to set up some base cases. Can we define these base cases as the fundamental operations of the method? As far as I know, fundamental operations are the core operations of the method, meaning that every time the method runs it must pass through them. (Please let me know if I am wrong.)
Another question: if I have an if statement in the base cases, for example,
if (a != b && b != c){}
will it count as 1 or 2 fundamental operations? It checks two things, a != b and b != c, in a single if statement.
This is quite confusing.
One more thing: I am not sure whether this kind of code is suitable or not:
Recursive method()
Base case:
XXXXXXX
XXXXXXX
XXXXXXX
Recursive case:
// Should I move this part to the base case?
if(conditions){
return (Recursive method() || Recursive method());
// ********************************************************
else if(conditions){
do something;
return Recursive method();
I think base cases are just used to define an exact value when the conditions are met, so I left this part in the recursive case. I am just not so sure about this.
I am not asking for an answer to coursework, just checking whether the concept is correct. That's why I didn't put my algorithm here. Sorry if this makes my question hard to understand. I will try my best to explain.
Thanks.
Based on the definitions provided, a base case or terminating case is a condition which stops the recursive calls.
The definition of a fundamental operation is a bit unclear from the question and I am honestly getting lost here. But from my understanding it is, or should be, a set of operations which are performed in a function regardless of the base case. A link to the definition would help!
Let's have a short example:
/**
* Returns the i-th triangular number. Let's assume the result is not obvious here.
* A System.out call is added to illustrate the point.
*/
public int calc(int i) {
System.out.println(i);
if (i == 0)
return 0;
return i + calc(i - 1);
}
The only way to stop the recursion is for the first condition, if (i == 0), to evaluate to true, meaning that this condition represents the base case. When i has any value other than zero, the recursion continues.
In the example, the only operation which is done regardless of the outcome of the method is printing the value i. Thus that's the only fundamental operation of the method. (Depending on the definition, the evaluation of a condition may or may not be considered an operation, since it neither changes any value nor has any side effects, like printing.)
Can we define these base cases as the fundamental operations of the method?
Generally no, as these represent different situations. The first defines the case in which recursion stops, and the second what is always done. Thus if you combine them, you end up with a function which always stops the recursion.
For the following case they could be the same though.
public int calc(int i) {
return i + 1;
}

Why does compareTo return an integer

I recently saw a discussion in an SO chat but with no clear conclusions, so I ended up asking here.
Is this for historical reasons or consistency with other languages? When looking at the signatures of compareTo of various languages, it returns an int.
Why doesn't it return an enum instead? For example, in C# we could do:
enum CompareResult {LessThan, Equals, GreaterThan};
and :
public CompareResult CompareTo(Employee other) {
if (this.Salary < other.Salary) {
return CompareResult.LessThan;
}
if (this.Salary == other.Salary){
return CompareResult.Equals;
}
return CompareResult.GreaterThan;
}
In Java, enums were introduced after this concept (I don't remember about C#) but it could have been solved by an extra class such as:
public final class CompareResult {
public static final CompareResult LESS_THAN = new CompareResult();
public static final CompareResult EQUALS = new CompareResult();
public static final CompareResult GREATER_THAN = new CompareResult();
private CompareResult() {}
}
and
interface Comparable<T> {
CompareResult compareTo(T obj);
}
I'm asking this because I don't think an int represents well the semantics of the data.
For example in C#,
l.Sort(delegate(int x, int y)
{
return Math.Min(x, y);
});
and its twin in Java 8,
l.sort(Integer::min);
both compile because Min/min respect the contract of the comparator interface (take two ints and return an int).
Obviously the results in both cases are not the ones expected. If the return type were CompareResult, it would have caused a compile error, thus forcing you to implement a "correct" behavior (or at least making you aware of what you are doing).
A lot of semantics is lost with this return type (which can potentially cause bugs that are difficult to find), so why design it like this?
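To make the idea concrete in Java, here is a rough sketch of what such a typed comparison could look like with a real enum. TypedComparator, toComparator and BY_VALUE are hypothetical names of mine, not anything in the JDK; the adapter exists only so the standard sort APIs can still be used:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

enum CompareResult { LESS_THAN, EQUALS, GREATER_THAN }

// Hypothetical typed counterpart of java.util.Comparator.
interface TypedComparator<T> {
    CompareResult compare(T a, T b);

    // Adapts to the int-based Comparator expected by the standard sort methods.
    default Comparator<T> toComparator() {
        return (a, b) -> {
            switch (compare(a, b)) {
                case LESS_THAN: return -1;
                case EQUALS: return 0;
                default: return 1;
            }
        };
    }
}

class TypedCompareDemo {
    static final TypedComparator<Integer> BY_VALUE = (a, b) ->
            a < b ? CompareResult.LESS_THAN
                  : a > b ? CompareResult.GREATER_THAN : CompareResult.EQUALS;

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(3, 1, 2));
        list.sort(BY_VALUE.toComparator());
        System.out.println(list); // [1, 2, 3]
        // TypedComparator<Integer> bad = Integer::min; // would not compile: int is not a CompareResult
    }
}
```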
[This answer is for C#, but it probably also applies to Java to some extent.]
This is for historical, performance and readability reasons. It potentially increases performance in two places:
Where the comparison is implemented. Often you can just return "(lhs - rhs)" (if the values are numeric types). But this can be dangerous: See below!
The calling code can use <= and >= to naturally represent the corresponding comparison. This will use a single IL (and hence processor) instruction compared to using the enum (although there is a way to avoid the overhead of the enum, as described below).
For example, we can check if a lhs value is less than or equal to a rhs value as follows:
if (lhs.CompareTo(rhs) <= 0)
...
Using an enum, that would look like this:
if (lhs.CompareTo(rhs) == CompareResult.LessThan ||
lhs.CompareTo(rhs) == CompareResult.Equals)
...
That is clearly less readable and is also inefficient since it is doing the comparison twice. You might fix the inefficiency by using a temporary result:
var compareResult = lhs.CompareTo(rhs);
if (compareResult == CompareResult.LessThan || compareResult == CompareResult.Equals)
...
It's still a lot less readable IMO - and it's still less efficient since it's doing two comparison operations instead of one (although I freely admit that it is likely that such a performance difference will rarely matter).
As raznagul points out below, you can actually do it with just one comparison:
if (lhs.CompareTo(rhs) != CompareResult.GreaterThan)
...
So you can make it fairly efficient - but of course, readability still suffers. ... != GreaterThan is not as clear as ... <=
(And if you use the enum, you can't avoid the overhead of turning the result of a comparison into an enum value, of course.)
So this is primarily done for reasons of readability, but also to some extent for reasons of efficiency.
Finally, as others have mentioned, this is also done for historical reasons. Functions like C's strcmp() and memcmp() have always returned ints.
Assembler compare instructions also tend to be used in a similar way.
For example, to compare two integers in x86 assembler, you can do something like this:
CMP AX, BX ;
JLE lessThanOrEqual ; jump to lessThanOrEqual if AX <= BX
or
CMP AX, BX
JG greaterThan ; jump to greaterThan if AX > BX
or
CMP AX, BX
JE equal ; jump to equal if AX == BX
You can see the obvious comparisons with the return value from CompareTo().
Addendum:
Here's an example which shows that it's not always safe to use the trick of subtracting the rhs from the lhs to get the comparison result:
int lhs = int.MaxValue - 10;
int rhs = int.MinValue + 10;
// Since lhs > rhs, we expect (lhs-rhs) to be +ve, but:
Console.WriteLine(lhs - rhs); // Prints -21: WRONG!
Obviously this is because the arithmetic has overflowed. If you had checked arithmetic turned on for the build, the code above would in fact throw an exception.
For this reason, the optimization of using subtraction to implement comparison is best avoided. (See comments from Eric Lippert below.)
Let's stick to bare facts, with absolute minimum of handwaving and/or unnecessary/irrelevant/implementation dependent details.
As you already figured out yourself, compareTo is as old as Java (Since: JDK1.0 from Integer JavaDoc); Java 1.0 was designed to be familiar to C/C++ developers, and mimicked a lot of its design choices, for better or worse. Also, Java has a backwards compatibility policy - thus, once implemented in core lib, the method is almost bound to stay in it forever.
As to C/C++ - strcmp/memcmp, which existed for as long as string.h, so essentially as long as C standard library, return exactly the same values (or rather, compareTo returns the same values as strcmp/memcmp) - see e.g. C ref - strcmp. At the time of Java's inception going that way was the logical thing to do. There weren't any enums in Java at that time, no generics etc. (all that came in >= 1.5)
The very decision of return values of strcmp is quite obvious - first and foremost, you can get 3 basic results in comparison, so selecting +1 for "bigger", -1 for "smaller" and 0 for "equal" was the logical thing to do. Also, as pointed out, you can get the value easily by subtraction, and returning int allows to easily use it in further calculations (in a traditional C type-unsafe way), while also allowing efficient single-op implementation.
If you need/want to use your enum based typesafe comparison interface - you're free to do so, but since the convention of strcmp returning +1/0/-1 is as old as contemporary programming, it actually does convey semantic meaning, in the same way null can be interpreted as unknown/invalid value or an out-of-bounds int value (e.g. negative number supplied for positive-only quantity) can be interpreted as error code. Maybe it's not the best coding practice, but it certainly has its pros, and is still commonly used e.g. in C.
On the other hand, asking "why does the standard library of language XYZ conform to legacy standards of language ABC" is itself moot, as it can only be accurately answered by the very language designers who implemented it.
TL;DR it's that way mainly because it was done that way in legacy versions for legacy reasons and POLA for C programmers, and is kept that way for backwards-compatibility & POLA, again.
As a side note, I consider this question (in its current form) too broad to be answered precisely, highly opinion-based, and borderline off-topic on SO due to directly asking about Design Patterns & Language Architecture.
This practice comes from comparing integers this way, and using a subtract between first non-matching chars of a string.
Note that this practice is dangerous with things that are partially comparable while using a -1 to mean that a pair of things was incomparable. This is because it could create a situation of a < b and b < a (which the application might use to define "incomparable"). Such a situation can lead to loops that don't terminate correctly.
An enumeration with values {lt,eq,gt,incomparable} would be more correct.
My understanding is that this is done because you can order the results (i.e., the operation is reflexive and transitive). For example, if you have three objects (A,B,C) you can compare A->B and B->C, and use the resulting values to order them properly. There is an implied assumption that if A.compareTo(B) == A.compareTo(C) then B==C.
See Java's Comparator documentation.
Simply put, this is for performance reasons.
If you need to compare ints, as often happens, you can return the following:
In fact, comparisons are often returned as subtractions.
As an example:
public class MyComparable implements Comparable<MyComparable> {
public int num;
public int compareTo(MyComparable x) {
return num - x.num;
}
}
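One caveat worth adding to the subtraction idiom: it is only safe when the difference cannot overflow. A small sketch of the failure case, with Integer.compare (available since Java 7) as the overflow-free alternative; the class and method names here are made up for the demo:

```java
final class SubtractionCompareDemo {
    // The subtraction idiom from the example above.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        // Integer.MIN_VALUE - 1 wraps around to Integer.MAX_VALUE,
        // so the smaller value is reported as the larger one.
        System.out.println(bySubtraction(Integer.MIN_VALUE, 1) > 0);   // prints "true"
        // Integer.compare only looks at the ordering and cannot overflow.
        System.out.println(Integer.compare(Integer.MIN_VALUE, 1) < 0); // prints "true"
    }
}
```

For values known to stay in a small range (ages, array indexes, and so on) the subtraction works as intended.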

Reordering arguments using recursion (pro, cons, alternatives)

I find that I often make a recursive call just to reorder arguments.
For example, here's my solution for endOther from codingbat.com:
Given two strings, return true if either of the strings appears at the very end of the other string, ignoring upper/lower case differences (in other words, the computation should not be "case sensitive"). Note: str.toLowerCase() returns the lowercase version of a string.
public boolean endOther(String a, String b) {
return a.length() < b.length() ? endOther(b, a)
: a.toLowerCase().endsWith(b.toLowerCase());
}
I'm very comfortable with recursion, but I can certainly understand why some would perhaps object to it.
There are two obvious alternatives to this recursion technique:
Swap a and b traditionally
public boolean endOther(String a, String b) {
if (a.length() < b.length()) {
String t = a;
a = b;
b = t;
}
return a.toLowerCase().endsWith(b.toLowerCase());
}
Not convenient in a language like Java that doesn't pass by reference
Lots of code just to do a simple operation
An extra if statement breaks the "flow"
Repeat code
public boolean endOther(String a, String b) {
return (a.length() < b.length())
? b.toLowerCase().endsWith(a.toLowerCase())
: a.toLowerCase().endsWith(b.toLowerCase());
}
Explicit symmetry may be a nice thing (or not?)
Bad idea unless the repeated code is very simple
...though in this case you can get rid of the ternary and just || the two expressions
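For completeness, the || variant mentioned in the last point could look roughly like this (the class name EndOtherOr is only there to make the sketch compile on its own):

```java
final class EndOtherOr {
    static boolean endOther(String a, String b) {
        String la = a.toLowerCase();
        String lb = b.toLowerCase();
        // A longer string can never be a suffix of a shorter one,
        // so no length check is needed: just try both directions.
        return la.endsWith(lb) || lb.endsWith(la);
    }

    public static void main(String[] args) {
        System.out.println(endOther("Hiabc", "abc"));  // prints "true"
        System.out.println(endOther("abc", "HiaBc"));  // prints "true"
        System.out.println(endOther("abc", "ab"));     // prints "false"
    }
}
```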
So my questions are:
Is there a name for these 3 techniques? (Are there more?)
Is there a name for what they achieve? (e.g. "parameter normalization", perhaps?)
Are there official recommendations on which technique to use (when)?
What are other pros/cons that I may have missed?
Another example
To focus the discussion more on the technique rather than the particular codingbat problem, here's another example where I feel that the recursion is much more elegant than a bunch of if-else's, swaps, or repetitive code.
// sorts 3 values and return as array
static int[] sort3(int a, int b, int c) {
return
(a > b) ? sort3(b, a, c) :
(b > c) ? sort3(a, c, b) :
new int[] { a, b, c };
}
Recursion and ternary operators don't bother me as much as they bother some people; I honestly believe the above code is the best pure Java solution one can possibly write. Feel free to show me otherwise.
Let’s first establish that code duplication is usually a bad idea.
So whatever solution we take, the logic of the method should only be written once, and we need a means of swapping the arguments around that does not interfere with the logic.
I see three general solutions to that:
Your first recursion (either using if or the conditional operator).
swap – which, in Java, is a problem, but might be appropriate in other languages.
Two separate methods (as in @Ha’s solution) where one acts as the implementation of the logic and the other as the interface, in this case to sort out the parameters.
I don’t know which of these solutions is objectively the best. However, I have noticed that there are certain algorithms for which (1) is generally accepted as the idiomatic solution, e.g. Euclid’s algorithm for calculating the GCD of two numbers.
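To illustrate the GCD case, here is a minimal sketch of the usual recursive form, assuming non-negative inputs (the class name GcdDemo is made up). Note that there is no explicit swap: when a < b, a % b is just a, so the first recursive call does nothing but reorder the arguments.

```java
final class GcdDemo {
    // Euclid's algorithm for non-negative a and b.
    // If a < b, then a % b == a, so gcd(12, 18) first becomes gcd(18, 12):
    // the recursion itself puts the larger argument first.
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    public static void main(String[] args) {
        System.out.println(gcd(12, 18)); // prints "6"
        System.out.println(gcd(18, 12)); // prints "6"
    }
}
```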
I am generally averse to the swap solution (2) since it adds an extra call which doesn’t really do anything in connection with the algorithm. Now, technically this isn’t a problem – I doubt that it would be less efficient than (1) or (3) using any decent compiler. But it adds a mental speed-bump.
Solution (3) strikes me as over-engineered although I cannot think of any criticism except that it’s more text to read. Generally, I don’t like the extra indirection introduced by any method suffixed with “Impl”.
In conclusion, I would probably prefer (1) for most cases although I have in fact used (3) in similar circumstances.
Another +1 for "In any case, my recommendation would be to do as little in each statement as possible. The more things that you do in a single statement, the more confusing it will be for others who need to maintain your code."
Sorry but your code:
// sorts 3 values and return as array
static int[] sort3(int a, int b, int c) {
return
(a > b) ? sort3(b, a, c) :
(b > c) ? sort3(a, c, b) :
new int[] { a, b, c };
}
For you it is perhaps the best "pure Java code", but for me it's the worst: unreadable code. Without the method name or the comment, we just can't know at first sight what it's doing...
Hard-to-read code should only be used when high performance is needed (and anyway, many performance problems are due to bad architecture...). If you HAVE TO write such code, the least you can do is write good Javadoc and unit tests... We developers often don't care about the implementation of such methods if we just have to use them and not rework them... but since a first look doesn't tell us what it does, we have to trust that it works the way we expect, and we can lose time...
Recursive methods are OK when the method is short, but I think a recursive method should be avoided if the algorithm is complex and there's another way to do it in almost the same computation time... particularly if other people will probably work on this method.
For your example it's OK since it's a short method, but anyway, if you're just not concerned with performance you could have used something like this:
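// sorts int values; needs java.util.ArrayList, java.util.Collections and java.util.List
public static int[] sort(Integer... intValues) {
List<Integer> list = new ArrayList<Integer>();
for ( Integer i : intValues ) {
list.add(i);
}
Collections.sort(list);
// copy back into a primitive array, since list.toArray() would return Object[]
int[] sorted = new int[list.size()];
for ( int k = 0; k < sorted.length; k++ ) {
sorted[k] = list.get(k);
}
return sorted;
}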
A simple way to implement your method, easily readable by any Java >= 1.5 developer, and it works for 1 to n integers...
Not the fastest, but anyway, if it's just about speed, use C++ or asm :)
For this particular example, I wouldn't use anything you suggested.. I would instead write:
public boolean endOther(String a, String b){
String alower=a.toLowerCase();
String blower=b.toLowerCase();
if ( a.length() < b.length() ){
return blower.endsWith(alower);
} else {
return alower.endsWith(blower);
}
}
While the ternary operator does have its place, the if statement is often more intelligible, especially when the operands are fairly complex. In addition, if you repeat code in different branches of an if statement, they will only be evaluated in the branch that is taken (in many programming languages, both operands of the ternary operator are evaluated no matter which branch is selected). While, as you have pointed out, this is not a concern in Java, many programmers have used a variety of languages and might not remember this level of detail, and so it is best to use the ternary operator only with simple operands.
One frequently hears of "recursive" vs. "iterative"/"non-recursive" implementations. I have not heard of any particular names for the various options that you have given.
In any case, my recommendation would be to do as little in each statement as possible. The more things that you do in a single statement, the more confusing it will be for others who need to maintain your code.
In terms of your complaint about repetition... if there are several lines that are being repeated, then it is time to create a "helper" function that does that part. Function composition is there to reduce repetition. Swapping just doesn't make sense, since it takes more effort to swap than to simply repeat... also, if code later in the function uses the parameters, the parameters now mean different things than they used to.
EDIT
My argument vis-a-vis the ternary operator was not a valid one... the vast majority of programming languages use lazy evaluation with the ternary operator (I was thinking of Verilog at the time of writing, which is a hardware description language (HDL) in which both branches are evaluated in parallel). That said, there are valid reasons to avoid using complicated expressions in ternary operators; for example, with an if...else statement, it is possible to set a breakpoint on one of the conditional branches whereas, with the ternary operator, both branches are part of the same statement, so most debuggers won't split on them.
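A quick way to convince yourself that Java only evaluates the selected operand is a sketch like the following, where a counter records how many branch expressions actually ran (all names here are made up for the demo):

```java
final class TernaryLazyDemo {
    static int calls = 0;

    // Counts every evaluation so we can see which operands ran.
    static String branch(String name) {
        calls++;
        return name;
    }

    static String pick(boolean cond) {
        // Only the selected operand of ?: is evaluated in Java.
        return cond ? branch("then") : branch("else");
    }

    public static void main(String[] args) {
        String picked = pick(true);
        System.out.println(picked + " " + calls); // prints "then 1"
    }
}
```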
It is slightly better to use another method instead of recursion:
public boolean endOther(String a, String b) {
return a.length() < b.length() ? endOtherImpl(b,a):endOtherImpl(a,b);
}
protected boolean endOtherImpl(String longStr,String shortStr)
{
return longStr.toLowerCase().endsWith(shortStr.toLowerCase());
}
