I'm trying to delete the middle node from a linked list given access to that node. I'm wondering if there is a difference between the two methods below, or do they accomplish the same thing?
public boolean deleteMiddle(Node middle){
Node next = middle.next; //line 2
middle.data = next.data;
middle.next = next.next;
return true;
}
public boolean deleteMiddle(Node middle){
middle.data = middle.next.data;
middle.next = middle.next.next;
return true;
}
The first method is what the textbook recommended but it seems like creating the Node "next" in the first method(line 2) is an unnecessary line of code.
I think you're probably right that they are equivalent (or certainly look that way)
In both cases it looks like there is a null pointer exception (on next.next) if the item you're deleting is the last in the list (e.g. if next is null then next.next is an error).
And if you get passed null of course that's also going to be a NPE.
creating the Node "next" in the first method(line 2) is an unnecessary line of code.
Both snippets achieve the same result.
But what the extra variable assignment (it does not "create a Node", it just assigns a new name to an existing one) does is avoiding to call middle.next twice, avoiding doing the same "calculation" twice.
In this example that won't make any real difference, but in general, it can be a usual performance optimization to avoid redundant work (especially if method calls are involved that do some "heavy lifting"). Before engaging in these adventures, though, one should think about if it is worth making the code potentially less legible, especially given that the JVM will try to do all kinds of optimization automatically. So apply this mindset only to simple patterns or real bottlenecks.
On top of that, giving names to intermediate results can also lead to more self-explanatory code (also not much difference here).
Final remarks: Sometimes (but not here) it is necessary to introduce extra variables to store things temporarily that would otherwise be overwritten in the course of the operation, such as the famous int h = x; x = y; y = h; example.
Yes, they're equivalent. I prefer the first one because it avoids the repetition of the expression middle.next.
As #Rick pointed out, when middle is the last element, you'll get an NPE in both cases.
Related
down bellow you can see two example methods, which are structured in the same way, but have to work with completely different integers.
You can guess if the code gets longer, it is pretty anoying to have a second long method which is doing the same.
Do you have any idea, how i can combine those two methods without using "if" or "switch" statements at every spot?
Thanks for your help
public List<> firstTestMethod(){
if(blabla != null){
if(blabla.getChildren().size() > 1){
return blabla.getChildren().subList(2, blabla.getChildren().size());
}
}
return null;
}
And:
public List<> secondTestMethod(){
if(blabla != null){
if(blabla.getChildren().size() > 4){
return blabla.getChildren().subList(0, 2);
}
}
return null;
}
Attempting to isolate common ground from 2 or more places into its own Helper method is not a good idea if you're just looking at what the code does without any context.
The right approach is first to define what you're actually isolating. It's not so much about the how (the fact that these methods look vaguely similar suggests that the how is the same, yes), but the why. What do these methods attempt to accomplish?
Usually, the why is also mostly the same. Rarely, the why is completely different, and the fact that the methods look similar is a pure coincidence.
Here's a key takeaway: If the why is completely different but the methods look somewhat similar, you do not want to turn them into a single method. DRY is a rule of thumb, not a commandment!
Thus, your question isn't directly answerable, because the 2 snippets are so abstractly named (blabla isn't all that informative), it's not possible to determine with the little context the question provides what the why might be.
Thus, answer the why question first, and usually the strategy on making a single method that can cater to both snippets here becomes trivial.
Here is an example answer: If list is 'valid', return the first, or last, X elements inside it. Validity is defined as follows: The list is not null, and contains at least Z entries. Otherwise, return null.
That's still pretty vague, and dangerously close to a 'how', but it sounds like it might describe what you have here.
An even better answer would be: blabla represents a family; determine the subset of children who are eligible for inheriting the property.
The reason you want this is twofold:
It makes it much easier to describe a method. A method that seems to do a few completely unrelated things and is incapable of describing the rhyme or reason of any of it cannot be understood without reading the whole thing through, which takes a long time and is error-prone. A large part of why you want methods in the first place is to let the programmer (the human) abstract ideas away. Instead of remembering what these 45 lines do, all you need to remember is 'fetch the eligible kids'.
Code changes over time. Bugs are found and need fixing. External influences change around you (APIs change, libraries change, standards change). Feature requests are a thing. Without the why part it is likely that one of the callers of this method grows needs that this method cannot provide, and then the 'easiest' (but not best!) solution is to just add the functionality to this method. The method will eventually grow into a 20 page monstrosity doing completely unrelated things, and having 50 parameters. To guard against this growth, define what the purpose of this method is in a way that is unlikely to spiral into 'read this book to understand what all this method is supposed to do'.
Thus, your question is not really answerable, as the 2 snippets do not make it obvious what the common thread might be, here.
Why do these methods abuse null? You seem to think null means empty list. It does not. Empty list means empty list. Shouldn't this be returning e.g. List.of instead of null? Once you fix that up, this method appears to simply be: "Give me a sublist consisting of everything except the first two elements. If the list is smaller than that or null, return an empty list", which is starting to move away from the 'how' and slowly towards a 'what' and 'why'. There are only 2 parameters to this generalized concept: The list, and the # of items from the start that need to be omitted.
The second snippet, on the other hand, makes no sense. Why return the first 3 elements, but only if the list has 5 or more items in it? What's the link between 3 and 5? If the answer is: "Nothing, it's a parameter", then this conundrum has far more parameters than the first snippet, and we see that whilst the code looks perhaps similar, once you start describing the why/what instead of the how, these two jobs aren't similar at all, and trying to shoehorn these 2 unrelated jobs into a single method is just going to lead to bad code now, and worse code later on as changes occur.
Let's say instead that this last snippet is trying to return all elements except the X elements at the end, returning an empty list if there are fewer than X. This matches much better with the first snippet (which does the same thing, except replace 'at the end' with 'at the start'). Then you could write:
// document somewhere that `blabla.getChildren()` is guaranteed to be sorted by age.
/** Returns the {#code numEldest} children. */
public List<Child> getEldest(int numEldest) {
if (numEldest < 0) throw new IllegalArgumentException();
return getChildren(numEldest, true);
}
/** Returns all children except the {#code numEldest} ones. */
public List<Child> getAllButEldest(int numEldest) {
if (numEldest < 0) throw new IllegalArgumentException();
return getChildren(numEldest, false);
}
private List<Child> getChildren(int numEldest, boolean include) {
if (blabla == null) return List.of();
List<Child> children = blabla.getChildren();
if (numEldest >= children.size()) return include ? children : List.of();
int startIdx = include ? 0 : numEldest;
int endIdx = include ? numEldest : children.size();
return children.subList(startIdx, endIdx);
}
Note a few stylistic tricks here:
boolean parameters are bad, because why would you know 'true' matches up with 'I want the eldest' and 'false' matches up with 'I want the youngest'? Names are good. This snippet has 2 methods that make very clear what they do, by using names.
That 'when extracting common ground, define the why, not the how' is a hierarchical idea - apply it all the way down, and as you get further away from the thousand-mile view, the what and how become more and more technical. That's okay. The more down to the details you get, the more private things should be.
By having defined what this all actually means, note that the behaviour is subtly different: If you ask for the 5 eldest children and there are only 4 children, this returns those 4 children instead of null. That shows off some of the power of defining the 'why': Now it's a consistent idea. Returning all 4 when you ask for 'give me the 5 eldest', is no doubt 90%+ of all those who get near this code would assume happens.
Preconditions, such as what comprises sane inputs, should always be checked. Here, we check if the numEldest param is negative and just crash out, as that makes no sense. Checks should be as early as they can reasonably be made: That way the stack traces are more useful.
You can pass objects that encapsulate the desired behavior differences at various points in your method. Often you can use a predefined interface for behavior encapsulation (Runnable, Callable, Predicate, etc.) or you may need to define your own.
public List<> testMethod(Predicate<BlaBlaType> test,
Function<BlaBlaType, List<>> extractor)
{
if(blabla != null){
if(test.test(blabla)){
return extractor.apply(blabla);
}
}
return null;
}
You could then call it with a couple of lambdas:
testMethod(
blabla -> blabla.getChildren().size() > 1,
blabla -> blabla.getChildren().subList(2, blabla.getChildren().size())
);
testMethod(
blabla -> blabla.getChildren().size() > 4,
blabla -> blabla.getChildren().subList(0, 2)
);
Here is one approach. Pass a named boolean to indicate which version you want. This also allows the list of children to be retrieved independent of the return. For lack of more meaningful names I choose START and END to indicate which parts of the list to return.
static boolean START = true;
static boolean END = false;
public List<Children> TestMethod(boolean type) {
if (blabla != null) {
List<Children> list = blabla.getChildren();
int size = list.size();
return START ?
(size > 1 ? list.subList(0, 2) : null) :
(size > 4 ? list.subList(2, size) :
null);
}
return null;
}
So, I have this code, which is just the way I solved an exercise that was given to me, which consisted of creating a recursive function that received a number, and then gave you the sum of 1, all the numbers in between, and your number. I know I made it sound confusing, but here's an example:
If I inserted the number 5, then the returned value would have to be 15, because: 1+2+3+4+5 = 15.
public class Exercise {
public static void main(String[] args) {
int returnedValue = addNumbers(6);
System.out.print(returnedValue);
}
public static int addNumbers(int value) {
if (value == 1) return value;
return value = value + addNumbers(value-1);
}
}
Technically speaking, my code works just fine, but I still don't get why Eclipse made me write two returns, that's all I would like to know.
Is there a way I could only write "return" once?
Sure, you can write it with just one return:
public static int addNumbers(int value) {
if (value > 1) {
value += addNumbers(value - 1);
}
return value;
}
As you can see, it's done by having some variable retain the running result until you get to the end. In this case I was able to do it in-place in value, in other cases you may need to create a local variable, but the idea of storing your intermediate result somewhere until you get to the return point is a general one.
There should be two returns. Your first return says
if at 1: stop recurstion
and the second one says
continue recursion by returning my value plus computing the value less than me
You could combine them by using a ternary:
return value == 1 ? value : value + addNumbers(value - 1)
But it is not as readable.
Recursive funtions like
Fibbonacci's sequence
Fractals
Etc.
Use themselves multiple times because they contain themselves.
Feel free to correct me if I'm wrong, but I don't think there is a way to eliminate one of those returns unless you decide to put a variable outside of the method or change the method from being recursive.
In java, a method that returns a value, MUST return a value at some point, no matter what code inside of it does. The reason eclipse requires you to add the second return, is because the first return is only run if your if statement evaluates to true. If you didn't have the second return, and that if statement did not end up being true, java would not be able to leave that method, and would have no idea what to do, thus, eclipse will require you to add a return statement after that if statement.
These types of errors are called checked errors or compile time errors. This means that eclipse literally can not convert your code into a runnable file, because it does not know how; there is a syntax error, or you are missing a return, etc.
Recursive functions always have at least 2 paths, the normal ones that will recurse and the "end" paths that just return (Usually a constant).
You could, however, do something like this:
public static int addNumbers(int value) {
if (value != 1)
value = value + addNumbers(value-1);
return value;
}
But I can't say I think it's much better (Some people get as annoyed at modifying parameters as they do at multiple returns). You could, of course, create a new variable and set it to one value or the other, but then someone would get upset because you used too many lines of code and an unnecessary variable. Welcome to programming :) Your original code is probably as good as you're likely to get.
As for why "Eclipse" did that to you, it's actually Java--Java is better than most languages at making sure you didn't do something clearly wrong as soon as possible (In this case while you are typing instead of waiting for you to compile). It detected that one branch of your if returned a value and the other did not--which is clearly wrong.
Java is also very explicit forcing you to use a "return" statement where another language might let you get away with less. In Groovy You'd be tempted to eliminate the return and write something like:
def addNumbers(value){value + (value-1?0:addNumbers(value-1))}
just for fun but I certainly wouldn't call THAT more readable! Java just figures it's better to force you to be explicit in most cases.
From the wikipedia on recursion:
In mathematics and computer science, a class of objects or methods
exhibit recursive behavior when they can be defined by two properties:
A simple base case (or cases)—a terminating scenario that does not use recursion to produce an answer
A set of rules that reduce all other cases toward the base case
There are two returns because you have to handle the two cases above. In your example:
The base case is value == 1.
The case to reduce all other cases toward the base case is value + addNumbers(value-1);.
Source: https://en.wikipedia.org/wiki/Recursion#Formal_definitions
Of course there are other ways to write this, including some that do not require multiple returns, but in general multiple returns are a clear and normal way to express recursion because of the way recursion naturally falls into multiple cases.
Like the robots of Asimov, all recursive algorithms must obey three important laws:
A recursive algorithm must have a base case.
A recursive algorithm must change its state and move toward the base case.
A recursive algorithm must call itself, recursively.
Your if (value == 1) return value; return statement is the base case. That's when the recursion(calling itself) stops. When a function call happen, the compiler pushes the current state to stack then make the call. So, when that call has returned some value, it pulls the value from stack, makes calculation and return the result to upper level. That's why the other return statement is for.
Think of this like breaking up your problem:
addNumbers(3) = 3 + addNumbers(2) (this is returned by second one)
-> 2 + addNumbers(1) (this is returned by second one)
-> 1 (this is returned by base case)
Lets say I have the following code:
private Rule getRuleFromResult(Fact result){
Rule output=null;
for (int i = 0; i < rules.size(); i++) {
if(rules.get(i).getRuleSize()==1){output=rules.get(i);return output;}
if(rules.get(i).getResultFact().getFactName().equals(result.getFactName())) output=rules.get(i);
}
return output;
}
Is it better to leave it as it is or to change it as follows:
private Rule getRuleFromResult(Fact result){
Rule output=null;
Rule current==null;
for (int i = 0; i < rules.size(); i++) {
current=rules.get(i);
if(current.getRuleSize()==1){return current;}
if(current.getResultFact().getFactName().equals(result.getFactName())) output=rules.get(i);
}
return output;
}
When executing, program goes each time through rules.get(i) as if it was the first time, and I think it, that in much more advanced example (let's say as in the second if) it takes more time and slows execution. Am I right?
Edit: To answer few comments at once: I know that in this particular example time gain will be super tiny, but it was just to get the general idea. I noticed I tend to have very long lines object.get.set.change.compareTo... etc and many of them repeat. In scope of whole code that time gain can be significant.
Your instinct is correct--saving intermediate results in a variable rather than re-invoking a method multiple times is faster. Often the performance difference will be too small to measure, but there's an even better reason to do this--clarity. By saving the value into a variable, you make it clear that you are intending to use the same value everywhere; if you re-invoke the method multiple times, it's unclear if you are doing so because you are expecting it to return different results on different invocations. (For instance, list.size() will return a different result if you've added items to list in between calls.) Additionally, using an intermediate variable gives you an opportunity to name the value, which can make the intention of the code clearer.
The only different between the two codes, is that in the first you may call twice rules.get(i) if the value is different one one.
So the second version is a little bit faster in general, but you will not feel any difference if the list is not bit.
It depends on the type of the data structure that "rules" object is. If it is a list then yes the second one is much faster as it does not need to search for rules(i) through rules.get(i). If it is a data type that allows you to know immediately rules.get(i) ( like an array) then it is the same..
In general yes it's probably a tiny bit faster (nano seconds I guess), if called the first time. Later on it will be probably be improved by the JIT compiler either way.
But what you are doing is so called premature optimization. Usually should not think about things that only provide a insignificant performance improvement.
What is more important is the readability to maintain the code later on.
You could even do more premature optimization like saving the length in a local variable, which is done by the for each loop internally. But again in 99% of cases it doesn't make sense to do it.
Someone designed code that relied on full data; the XML always had every element. The data source is now sending sparse XML; if it would have been empty before, it's missing now. So, it's time to refactor while fixing bugs.
There's 100+ lines of code like this:
functionDoSomething(foo, bar, getRoot().getChild("1").getChild("A").
getChild("oo").getContent());
Except now, getChild("A") might return null. Or any of the getChild(xxx) methods might.
As one additional twist, instead of getChild(), there are actually four separate methods, which can only occur in certain orders. Someone suggested a varargs call, which isn't a bad idea, but won't work as cleanly as I might like.
What's the quickest way to clean this one up? The best? "try/catch" around every line was suggested, but man, that's ugly. Breaking out the third argument to the above method into it's own function could work... but that would entail 100+ new methods, which feels ugly, albeit less so.
The number of getChild(xxx) calls is somewhere between six and ten per line, with no fixed depth. There's also no possible way to get a correct DTD for this; things will be added later without a prior heads up, and where I'd prefer a warning in the logs when that happens, extra lines in the XML need to be handled gracefully.
Ideas?
getChild() is a convenience method, actually. The cleanest way I have in mind is to have the convenience methods return a valid Child object, but have that "empty" Child's getContent() always return "".
Please consider using XPATH instead of this mess.
What you described (returning a special child object) is a form of the NullObject pattern, which is probably the best solution here.
The solution is to use a DTD file for XML. It validates your XML file so getChild("A") won't return null when A is mandatory.
How about:
private Content getChildContent(Node root, String... path) {
Node target = root;
for ( String pathElement : path ) {
Node child = target.getChild(pathElement);
if ( child == null )
return null; // or whatever you should do
target = child;
}
return target.getContent();
}
to be used as
functionDoSomething(foo, bar, getChildContent(root, "1", "A", "oo"));
Your problem could be a design problem: Law of Demeter.
If not you can use something like an Option type changing the return type of getChild to Option<Node>:
for(Node r : getRoot())
for(Node c1 : r.getChild("1"))
for(Node c2: c1.getChild("A"))
return c2.getChild("oo")
This works because Option implements Iterable it will abort when a return value is not defined. This is similary to Scala where it can be expressed in a single for expression.
One additional advantage is that you can define interfaces that will never return a null value. With an Option type you can state in the interface definition that the return value may be undefined and the client can decide how to handle this.
If it always drills down to roughly the same same level, you can probably refactor the code using Eclipse, for example, and it'll automatically change every line that looks the same.
That way you can modify the method to be smarter, rather than modifying each line individually
This is more a gotcha I wanted to share than a question: when printing with toString(), Java will detect direct cycles in a Collection (where the Collection refers to itself), but not indirect cycles (where a Collection refers to another Collection which refers to the first one - or with more steps).
import java.util.*;
public class ShonkyCycle {
static public void main(String[] args) {
List a = new LinkedList();
a.add(a); // direct cycle
System.out.println(a); // works: [(this Collection)]
List b = new LinkedList();
a.add(b);
b.add(a); // indirect cycle
System.out.println(a); // shonky: causes infinite loop!
}
}
This was a real gotcha for me, because it occurred in debugging code to print out the Collection (I was surprised when it caught a direct cycle, so I assumed incorrectly that they had implemented the check in general). There is a question: why?
The explanation I can think of is that it is very inexpensive to check for a collection that refers to itself, as you only need to store the collection (which you have already), but for longer cycles, you need to store all the collections you encounter, starting from the root. Additionally, you might not be able to tell for sure what the root is, and so you'd have to store every collection in the system - which you do anyway - but you'd also have to do a hash lookup on every collection element. It's very expensive for the relatively rare case of cycles (in most programming). (I think) the only reason it checks for direct cycles is because it so cheap (one reference comparison).
OK... I've kinda answered my own question - but have I missed anything important? Anyone want to add anything?
Clarification: I now realize the problem I saw is specific to printing a Collection (i.e. the toString() method). There's no problem with cycles per se (I use them myself and need to have them); the problem is that Java can't print them. Edit Andrzej Doyle points out it's not just collections, but any object whose toString is called.
Given that it's constrained to this method, here's an algorithm to check for it:
the root is the object that the first toString() is invoked on (to determine this, you need to maintain state on whether a toString is currently in progress or not; so this is inconvenient).
as you traverse each object, you add it to an IdentityHashMap, along with a unique identifier (e.g. an incremented index).
but if this object is already in the Map, write out its identifier instead.
This approach also correctly renders multirefs (a node that is referred to more than once).
The memory cost is the IdentityHashMap (one reference and index per object); the complexity cost is a hash lookup for every node in the directed graph (i.e. each object that is printed).
I think fundamentally it's because while the language tries to stop you from shooting yourself in the foot, it shouldn't really do so in a way that's expensive. So while it's almost free to compare object pointers (e.g. does obj == this) anything beyond that involves invoking methods on the object you're passing in.
And at this point the library code doesn't know anything about the objects you're passing in. For one, the generics implementation doesn't know if they're instances of Collection (or Iterable) themselves, and while it could find this out via instanceof, who's to say whether it's a "collection-like" object that isn't actually a collection, but still contains a deferred circular reference? Secondly, even if it is a collection there's no telling what it's actual implementation and thus behaviour is like. Theoretically one could have a collection containing all the Longs which is going to be used lazily; but since the library doesn't know this it would be hideously expensive to iterate over every entry. Or in fact one could even design a collection with an Iterator that never terminated (though this would be difficult to use in practice because so many constructs/library classes assume that hasNext will eventually return false).
So it basically comes down to an unknown, possibly infinite cost in order to stop you from doing something that might not actually be an issue anyway.
I'd just like to point out that this statement:
when printing with toString(), Java will detect direct cycles in a collection
is misleading.
Java (the JVM, the language itself, etc) is not detecting the self-reference. Rather this is a property of the toString() method/override of java.util.AbstractCollection.
If you were to create your own Collection implementation, the language/platform wouldn't automatically safe you from a self-reference like this - unless you extend AbstractCollection, you would have to make sure you cover this logic yourself.
I might be splitting hairs here but I think this is an important distinction to make. Just because one of the foundation classes in the JDK does something doesn't mean that "Java" as an overall umbrella does it.
Here is the relevant source code in AbstractCollection.toString(), with the key line commented:
public String toString() {
Iterator<E> i = iterator();
if (! i.hasNext())
return "[]";
StringBuilder sb = new StringBuilder();
sb.append('[');
for (;;) {
E e = i.next();
// self-reference check:
sb.append(e == this ? "(this Collection)" : e);
if (! i.hasNext())
return sb.append(']').toString();
sb.append(", ");
}
}
The problem with the algorithm that you propose is that you need to pass the IdentityHashMap to all Collections involved. This is not possible using the published Collection APIs. The Collection interface does not define a toString(IdentityHashMap) method.
I imagine that whoever at Sun put the self reference check into the AbstractCollection.toString() method thought of all of this, and (in conjunction with his colleagues) decided that a "total solution" is over the top. I think that the current design / implementation is correct.
It is not a requirement that Object.toString implementations be bomb-proof.
You are right, you already answered your own question. Checking for longer cycles (especially really long ones like period length 1000) would be too much overhead and is not needed in most cases. If someone wants it, he has to check it himself.
The direct cycle case, however, is easy to check and will occur more often, so it's done by Java.
You can't really detect indirect cycles; it's a typical example of the halting problem.