I'm missing some kind of collection functionality for a specific problem.
I'd like to start with some background on the problem - maybe there's a more elegant way to solve it that avoids the specific problem I'm stuck with:
I'm modelling a volume mesh made of tetrahedral cells (the 2D-analog would be a triangle mesh). Two tetrahedrons are considered to be adjacent if they share one triangle-face (which occupies three vertices). My application has to be able to navigate from cell to cell via their common face.
To meet some other requirements I had to split the faces into two so-called half-faces, which share the same vertices but belong to different cells and have opposite orientation.
The application needs to be able to do calls like this (where Face models a half-face):
Cell getAdjacentCell(Cell cell, int faceIndex) {
Face face = cell.getFace(faceIndex);
Face partnerFace = face.getPartner();
if (partnerFace == null) return null; // no adjacent cell present
Cell adjacentCell = partnerFace.getCell();
return adjacentCell;
}
The implementation of the getPartner()-method is the method in question. My approach is as follows:
Face-objects can create an immutable Signature-object containing merely the vertex-configuration, the orientation (clockwise (cw) or counter-clockwise (ccw)) and a back-reference to the originating Face-object. Face.Signature-objects are considered to be equal (@Override equals()) if they occupy the same three vertices - regardless of their orientation and their associated cell.
I created two sets in the Mesh-objects to contain all half-faces grouped by their orientation:
Set<Face.Signature> faceSignatureCcw = new HashSet<Face.Signature>();
Set<Face.Signature> faceSignatureCw = new HashSet<Face.Signature>();
Now I'm able to determine if a partner exists ...
class Face {
public Face getPartner() {
if (this.getSignature().isCcw()) {
boolean partnerExists = this.getMesh().faceSignatureCw.contains(this);
} else {
boolean partnerExists = this.getMesh().faceSignatureCcw.contains(this);
}
}
}
... but Set does not allow retrieving the specific object it contains! It merely confirms that it contains an object that matches via .equals().
(end of background information)
I need a Collection-concept which provides the following functionality:
add a Face-Object to the Collection (duplicates are prohibited by the application and thus cannot occur)
retrieve the partner from the Collection for a given Face-Object that .equals() but has the opposite orientation
A possible (but way too slow) solution would be:
class PartnerCollection {
List<Face.Signature> faceSignatureCcw = new ArrayList<Face.Signature>();
List<Face.Signature> faceSignatureCw = new ArrayList<Face.Signature>();
void add(Face.Signature faceSignature) {
(faceSignature.isCcw() ? faceSignatureCcw : faceSignatureCw).add(faceSignature);
}
Face.Signature getPartner(Face.Signature faceSignature) {
List<Face.Signature> partnerList = faceSignature.isCcw() ? faceSignatureCw : faceSignatureCcw;
for (Face.Signature partnerSignature : partnerList) {
if (faceSignature.equals(partnerSignature)) return partnerSignature;
}
return null;
}
}
To be complete: The final application will have to handle hundreds of thousands of Face-Objects in a real-time environment. So performance is an issue.
Thanks in advance to anyone who at least tried to follow me up to this point :)
I hope there's someone out there with the right idea to solve this.
Anything wrong with using two Map<Face.Signature, Face.Signature>?
One for each direction?
That's what I'd do. There's practically no code to it.
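As a minimal sketch of the two-map idea, assuming a simplified Signature class (the vertex ids, the constructor, and the omission of the back-reference to Face are illustrative choices, not the asker's actual code): each signature is stored as both key and value, so the lookup returns the stored partner object itself rather than just confirming that one exists.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for Face.Signature: three vertex ids plus an orientation flag. */
final class Signature {
    private final int[] sortedVertices; // sorted copy, so vertex order does not affect equality
    private final boolean ccw;

    Signature(int v0, int v1, int v2, boolean ccw) {
        this.sortedVertices = new int[] {v0, v1, v2};
        Arrays.sort(this.sortedVertices);
        this.ccw = ccw;
    }

    boolean isCcw() { return ccw; }

    // Equality ignores orientation, as described in the question.
    @Override public boolean equals(Object o) {
        return o instanceof Signature
                && Arrays.equals(sortedVertices, ((Signature) o).sortedVertices);
    }

    // hashCode must agree with equals, otherwise hash-based lookup breaks.
    @Override public int hashCode() { return Arrays.hashCode(sortedVertices); }
}

/** Keeps one map per orientation; each signature maps to itself so it can be retrieved. */
final class PartnerCollection {
    private final Map<Signature, Signature> ccw = new HashMap<>();
    private final Map<Signature, Signature> cw = new HashMap<>();

    void add(Signature s) {
        (s.isCcw() ? ccw : cw).put(s, s);
    }

    /** Returns the stored signature with the opposite orientation, or null if none exists. */
    Signature getPartner(Signature s) {
        return (s.isCcw() ? cw : ccw).get(s);
    }
}
```

Both add and getPartner are expected constant time with a reasonable hashCode, compared with the linear scan of the list-based version.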
It's late at night here and I haven't read your question completely. So, I apologize if this doesn't make any sense, but have you considered using a graph data structure? If a graph data structure is indeed a possible solution, you might want to check out JGraphT.
Have you considered just giving each Face a partner data member? As in,
public class Face
{
Face partner;
//whatever else
}
The Face.Signature construct is a bit hairy and really shouldn't be needed. If every face has a partner (or enough Face objects can have a partner that it makes sense to think that there is a has-a relationship between a Face and a partner Face), the connection should just be an instance variable. If you can use this approach, it should vastly simplify your code. If not, post back the reason this doesn't work for you so that I can keep trying to help.
Using the design you have now, there is no way around something needing to iterate somewhere. The question is, where you want that iteration to occur? I suggest you do this:
List<Face.Signature> partnerList = faceSignature.isCcw() ? faceSignatureCw : faceSignatureCcw;
int idx = partnerList.indexOf(faceSignature);
if(idx == -1)
return null;
return partnerList.get(idx);
Also, as long as you are using Lists, and know that the initial size will have to be pretty big, you might as well say, new ArrayList(100000) or so.
Of course, this isn't the only method, just one that ensures the iteration will be optimal.
EDIT: After some thought, I believe the ideal data-structure for this would be an Octuply Linked List, which can make things confusing, but also very fast (comparatively).
Below you can see two example methods, which are structured in the same way but have to work with completely different integers.
As you can guess, once the code gets longer it is pretty annoying to have a second long method that does the same thing.
Do you have any idea how I can combine those two methods without using "if" or "switch" statements at every spot?
Thanks for your help
public List<> firstTestMethod(){
if(blabla != null){
if(blabla.getChildren().size() > 1){
return blabla.getChildren().subList(2, blabla.getChildren().size());
}
}
return null;
}
And:
public List<> secondTestMethod(){
if(blabla != null){
if(blabla.getChildren().size() > 4){
return blabla.getChildren().subList(0, 2);
}
}
return null;
}
Attempting to isolate common ground from 2 or more places into its own Helper method is not a good idea if you're just looking at what the code does without any context.
The right approach is first to define what you're actually isolating. It's not so much about the how (the fact that these methods look vaguely similar suggests that the how is the same, yes), but the why. What do these methods attempt to accomplish?
Usually, the why is also mostly the same. Rarely, the why is completely different, and the fact that the methods look similar is a pure coincidence.
Here's a key takeaway: If the why is completely different but the methods look somewhat similar, you do not want to turn them into a single method. DRY is a rule of thumb, not a commandment!
Thus, your question isn't directly answerable, because the 2 snippets are so abstractly named (blabla isn't all that informative), it's not possible to determine with the little context the question provides what the why might be.
Thus, answer the why question first, and usually the strategy on making a single method that can cater to both snippets here becomes trivial.
Here is an example answer: If list is 'valid', return the first, or last, X elements inside it. Validity is defined as follows: The list is not null, and contains at least Z entries. Otherwise, return null.
That's still pretty vague, and dangerously close to a 'how', but it sounds like it might describe what you have here.
An even better answer would be: blabla represents a family; determine the subset of children who are eligible for inheriting the property.
The reason you want this is twofold:
It makes it much easier to describe a method. A method that seems to do a few completely unrelated things and is incapable of describing the rhyme or reason of any of it cannot be understood without reading the whole thing through, which takes a long time and is error-prone. A large part of why you want methods in the first place is to let the programmer (the human) abstract ideas away. Instead of remembering what these 45 lines do, all you need to remember is 'fetch the eligible kids'.
Code changes over time. Bugs are found and need fixing. External influences change around you (APIs change, libraries change, standards change). Feature requests are a thing. Without the why part it is likely that one of the callers of this method grows needs that this method cannot provide, and then the 'easiest' (but not best!) solution is to just add the functionality to this method. The method will eventually grow into a 20 page monstrosity doing completely unrelated things, and having 50 parameters. To guard against this growth, define what the purpose of this method is in a way that is unlikely to spiral into 'read this book to understand what all this method is supposed to do'.
Thus, your question is not really answerable, as the 2 snippets do not make it obvious what the common thread might be, here.
Why do these methods abuse null? You seem to think null means empty list. It does not. Empty list means empty list. Shouldn't this be returning e.g. List.of instead of null? Once you fix that up, this method appears to simply be: "Give me a sublist consisting of everything except the first two elements. If the list is smaller than that or null, return an empty list", which is starting to move away from the 'how' and slowly towards a 'what' and 'why'. There are only 2 parameters to this generalized concept: The list, and the # of items from the start that need to be omitted.
The second snippet, on the other hand, makes no sense. Why return the first 2 elements, but only if the list has 5 or more items in it? What's the link between 2 and 5? If the answer is: "Nothing, it's a parameter", then this conundrum has far more parameters than the first snippet, and we see that whilst the code looks perhaps similar, once you start describing the why/what instead of the how, these two jobs aren't similar at all, and trying to shoehorn these 2 unrelated jobs into a single method is just going to lead to bad code now, and worse code later on as changes occur.
Let's say instead that this last snippet is trying to return all elements except the X elements at the end, returning an empty list if there are fewer than X. This matches much better with the first snippet (which does the same thing, except replace 'at the end' with 'at the start'). Then you could write:
// document somewhere that `blabla.getChildren()` is guaranteed to be sorted by age.
/** Returns the {@code numEldest} children. */
public List<Child> getEldest(int numEldest) {
if (numEldest < 0) throw new IllegalArgumentException();
return getChildren(numEldest, true);
}
/** Returns all children except the {@code numEldest} ones. */
public List<Child> getAllButEldest(int numEldest) {
if (numEldest < 0) throw new IllegalArgumentException();
return getChildren(numEldest, false);
}
private List<Child> getChildren(int numEldest, boolean include) {
if (blabla == null) return List.of();
List<Child> children = blabla.getChildren();
if (numEldest >= children.size()) return include ? children : List.of();
int startIdx = include ? 0 : numEldest;
int endIdx = include ? numEldest : children.size();
return children.subList(startIdx, endIdx);
}
Note a few stylistic tricks here:
boolean parameters are bad, because why would you know 'true' matches up with 'I want the eldest' and 'false' matches up with 'I want the youngest'? Names are good. This snippet has 2 methods that make very clear what they do, by using names.
That 'when extracting common ground, define the why, not the how' is a hierarchical idea - apply it all the way down, and as you get further away from the thousand-mile view, the what and how become more and more technical. That's okay. The more down to the details you get, the more private things should be.
By having defined what this all actually means, note that the behaviour is subtly different: If you ask for the 5 eldest children and there are only 4 children, this returns those 4 children instead of null. That shows off some of the power of defining the 'why': Now it's a consistent idea. Returning all 4 when you ask for 'give me the 5 eldest' is no doubt what 90%+ of those who get near this code would assume happens.
Preconditions, such as what comprises sane inputs, should always be checked. Here, we check if the numEldest param is negative and just crash out, as that makes no sense. Checks should be as early as they can reasonably be made: That way the stack traces are more useful.
You can pass objects that encapsulate the desired behavior differences at various points in your method. Often you can use a predefined interface for behavior encapsulation (Runnable, Callable, Predicate, etc.) or you may need to define your own.
public List<> testMethod(Predicate<BlaBlaType> test,
Function<BlaBlaType, List<>> extractor)
{
if(blabla != null){
if(test.test(blabla)){
return extractor.apply(blabla);
}
}
return null;
}
You could then call it with a couple of lambdas:
testMethod(
blabla -> blabla.getChildren().size() > 1,
blabla -> blabla.getChildren().subList(2, blabla.getChildren().size())
);
testMethod(
blabla -> blabla.getChildren().size() > 4,
blabla -> blabla.getChildren().subList(0, 2)
);
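For reference, here is a compilable sketch of the same idea. The Family class, the String element type, and the ChildSelector wrapper are assumptions made only so the example is self-contained; the question never names the real types.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical stand-in for the type of `blabla`; the question never names it. */
final class Family {
    private final List<String> children;
    Family(List<String> children) { this.children = children; }
    List<String> getChildren() { return children; }
}

final class ChildSelector {
    private final Family blabla; // may be null, as in the question

    ChildSelector(Family blabla) { this.blabla = blabla; }

    /** Applies extractor to blabla if it is non-null and passes test; otherwise returns null. */
    List<String> select(Predicate<Family> test, Function<Family, List<String>> extractor) {
        if (blabla != null && test.test(blabla)) {
            return extractor.apply(blabla);
        }
        return null; // mirrors the behaviour of the question's methods
    }

    List<String> firstTestMethod() {
        return select(f -> f.getChildren().size() > 1,
                      f -> f.getChildren().subList(2, f.getChildren().size()));
    }

    List<String> secondTestMethod() {
        return select(f -> f.getChildren().size() > 4,
                      f -> f.getChildren().subList(0, 2));
    }
}
```

The two public methods keep their original names and behaviour, so existing callers would not need to change; only the shared null check and condition handling live in select.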
Here is one approach. Pass a named boolean to indicate which version you want. This also allows the list of children to be retrieved independently of the return. For lack of more meaningful names I chose START and END to indicate which part of the list to return.
static boolean START = true;
static boolean END = false;
public List<Children> TestMethod(boolean type) {
if (blabla != null) {
List<Children> list = blabla.getChildren();
int size = list.size();
return type ?
(size > 1 ? list.subList(0, 2) : null) :
(size > 4 ? list.subList(2, size) :
null);
}
return null;
}
My question is about scalable logic branching.
Is there an elegant way to do branching logic trees in java (although I've always thought that they look more like root systems, but that's beside the point). I'm trying to develop a very simple text based adventure game as a side project to my studies, but I'm not sure what the best way to go about navigating these large logic systems is.
What I'm trying currently is an array that holds four values: stage, location, step, choice.
[EDIT - added choice variable to store user choice, changed name to reflect actual name in my code so that I don't get confused later]
int[] decisionPoint = {stage, location, step, choice};
A stage is supposed to represent a single major part of the tree.
A location is supposed to represent my location within the tree.
A step is supposed to represent my progress through a given location.
Choice is the user input
At the moment, since I'm only dealing with a single tree, stage isn't being used much. Location and step are working well, but any time I get into a decision within a step the system breaks down.
I could keep creating more and more variables to represent deeper and deeper layers into the tree, but I feel like Java probably provides a better solution somewhere.
Currently, I'm using switch statements to figure out where in the program I am based on the values stored in decisionPoint. Is there something better? Or is there a way to extend the array beyond what I'm using here to make it a bit more polymorphic? (In the methods for the individual questions/text/whatever, could I just have it create a larger array from a smaller one? Could I pass a smaller array as an argument but define the parameter as a larger array?)
//Switch example
switch(LocationTracker.getLocation()) { //start location finding switch
case 1 : //Location 1
switch (LocationTracker.getStep()) {//start location1 switch
case 1 :
location1s1(graphicsStuff);
break;
case 2 :
location1s2(graphicsStuff);
break;
} break; //end location1 switch
case 2 : //Location 2
switch (LocationTracker.getStep()) {
//same stuff that happened above
} break;
Everything I find online just brings me to irrelevant pages about different online survey creators that I can use. If I could view their source-code that'd be kind of nice, but since I can't, I'm hoping you guys can help. :)
[EDIT]
Wow, what a nice response in such a short time at such an early hour!
I'll try to go into very explicit detail about how I'm solving the problem right now. It's worth mentioning that this does technically work, it's just that every time I need a branch inside a branch I have to create another variable inside a string array to keep track of my position, and really I'm just fishing for a solution that doesn't need an infinitely expanding string as the program becomes more and more complex.
Right now I have a program with 5 classes:
The Main Class which starts the GUI
The GUI class which provides three services: userInput, userOptions, and outputArea.
The DecisionTreeStage1 class which handles the logic of my problem at the moment (using switch statements).
The LocationTracker class which is designed to track my location within the DecisionTreeStage1 class
The DialogueToOutput class which changes the options that the users have, and also updates the output fields with the results of their actions.
Special point of interest:
I want to have multiple decision branches at some point, and one main tree (maybe call it Yggdrasil? :D). For now, the DecisionTreeStage1 represents a very isolated system that isn't planning to go anywhere. I hope to use the stage variable stored in my array to move from one major branch to the next (climbing the tree if you will). My current implementation for this just uses nested switch statements to decide where I'm going. This imposes an annoying limitation: every time my path gets deeper and deeper I need another variable in my array to store that data. For example:
//Switch example deeper
switch(LocationTracker.getLocation()) { //start location finding switch
case 1 : //Location 1
switch (LocationTracker.getStep()) {//start location1 switch
case 1 :
switch(LocationTracker.getChoice()) {//Make a decision at this step based on the user choice
Given this example, what if the user choice doesn't just lead to some logic? (In this case, just an update to the outputArea) What if it leads to ANOTHER branching path? And that leads to another branching path? Eventually, I would want all paths to converge on the same spot so that I could move to the next "stage."
My real hope is to make this problem infinitely scalable. I want to be able to go as deep into one branch as I need to, without having to create a static and arbitrary number of variable declarations in my decisionPoint array every time.
I haven't been able to find much information about this, like I said.
Let me try presenting this question: Are there any other branching logic statements other than:
if(something)
stuff;
else
otherStuff;
and
switch(something) {
case 1:
stuff;
break;
case 2:
otherStuff;
break;
And if so, what are they?
PS - I know about the ternary if statement in Java, but it doesn't seem useful to what I'm doing. :)
You can build normal tree structures in Java, similar to the trees that can be built in C. Regardless if object references are theoretically pointers or not, they substitute pointers nicely in the tree constructions:
class Node {
Node left;
Node right;
Node parent;
}
You can also build graphs (including cyclic graphs) and linked lists without any problem. There is no obvious reason why large structures should cause problems (apart from each object reference using some memory).
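Applied to the adventure game, a node might hold its text and a map from user choice to the next node. This is only a sketch; StoryNode and its methods are made-up names, not part of the asker's code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** One decision point in the story. All names here are illustrative. */
final class StoryNode {
    private final String text;
    // Maps a user choice (e.g. 1, 2, 3) to the node it leads to.
    private final Map<Integer, StoryNode> choices = new LinkedHashMap<>();

    StoryNode(String text) { this.text = text; }

    String getText() { return text; }

    /** Links a choice to a follow-up node; returns this so calls can be chained. */
    StoryNode addChoice(int choice, StoryNode next) {
        choices.put(choice, next);
        return this;
    }

    /** Returns the node for the given choice, or this node if the choice is unknown. */
    StoryNode choose(int choice) {
        return choices.getOrDefault(choice, this);
    }

    /** A node without outgoing choices ends the current stage. */
    boolean isEnd() { return choices.isEmpty(); }
}
```

With this shape the whole position in the story is a single StoryNode reference, however deep the branching goes, and branches converge simply by pointing at the same node.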
Instead of returning a value, you could return a Callable which just needs to be executed. This can then be chained (theoretically infinitely).
You can have a LocationEvaluation for example which could return a SpecificLocationEvaluator which in turns returns one of StepEvaluation or ChoiceEvaluator or somesuch. All of these would implement the Callable interface.
Depending on how you do it, you could have strict type checking so that a LocationEvaluation always returns a SpecificLocationEvaluator, or it can be generic and then you can chain any of them in any order.
Once you build the structure out, you would essentially have a tree which would be traversed to solve it.
I don't understand the problem adequately to be able to provide more concrete implementation details - and apologies if I misunderstood some of the branching (i.e. the names of the classes / steps above)
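One possible reading of the chaining idea, sketched with made-up names (Step and StepRunner are assumptions, not classes from the question): every step is a Callable that does its work and returns the next step, and a small loop keeps calling until a step returns null.

```java
import java.util.concurrent.Callable;

/** A step that, when called, does its work and returns the next step (or null to stop). */
interface Step extends Callable<Step> {}

final class StepRunner {
    /** Runs steps until one returns null; returns how many steps were executed. */
    static int run(Step first) {
        int executed = 0;
        Step current = first;
        while (current != null) {
            try {
                current = current.call(); // the step decides what comes next
            } catch (Exception e) {
                // Callable.call() declares a checked exception; wrap it for simplicity.
                throw new IllegalStateException("step failed", e);
            }
            executed++;
        }
        return executed;
    }
}
```

Because each step chooses its own successor, the depth of the branching never has to be known in advance by the code that drives the loop.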
I'm using lists on a jsf web server to e.g. access the data model from web pages. The access to these lists is done from various other places as well though (web services, tools).
There is a piece of code which gets broken by someone re-sorting the list I return. By someone I mean someone in my developer team - we are the only ones using this code. I have roughly 300 references to this function, and the fix could be performance relevant, so I want to do it properly:
The list can be anywhere from 1 - 10'000 entries and commonly I will have maybe 10-100 such lists. In reality I will probably have very often around 20 lists with each having 8 entries - so not such a big deal. But I can have more sometimes
I am btw talking about a function something like this:
public List<MyObject> getMyObjectList() {
if (this.myObjects== null) {
myObjects = new ArrayList<MyObject>(myObjectsMap.values());
}
return myObjects;
}
Now I could of course do the return like this:
public List<MyObject> getMyObjectList() {
if (this.myObjects== null) {
myObjects = new ArrayList<MyObject>(myObjectsMap.values());
}
return Collections.unmodifiableList(myObjects );
}
But this will break things, possibly in several places in different projects / applications.
It would imho be the cleanest to return unmodifiable, add javadoc - and fix everything which breaks. But :-D this is work. I'd probably have to test roughly 10 applications.
On the other hand I could just return a new list, e.g.
public List<MyObject> getMyObjectList() {
return new ArrayList<MyObject>(myObjectsMap.values());
}
Which is little to no work - but what about the performance issues with this? Other than that - if someone is deleting stuff from the list I returned, it will silently break the application.
So:
What is the performance issue? Is it an issue?
What would you do?
What would you do?
If I understand you correctly, this is a production library used in several applications. And, like it or not, the de-facto contract for getMyObjectList() is that the user can sort the list without getting an error or exception.
I would immediately change this method and return a defensive copy:
// good idea
public List<MyObject> getMyObjectList() {
return new ArrayList<MyObject>(myObjectsMap.values());
}
You have now fixed the problem where someone is sorting your internal collection and you have not broken your contract. In fact you can even update the Javadoc and tell the user that they can do whatever they want with the copy.
This may or may not cause a performance problem. Remember, the objects in the collection are not being copied - they are still being shared. You are just creating a new array list and whatever internal objects it needs to keep track of the objects.
If it turns out that these copies are causing a performance problem, then you can consider enhancing your class to include a read-only cache of your internal collection. To access this you must give the method a new name - ex getMySharedObjectList and you can gradually update client code to use this new method as performance needs require.
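A rough sketch of how the two methods could sit side by side. The MyObjectRegistry class, the put method, and the use of String in place of MyObject are assumptions for illustration only, and the sketch is not thread-safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder class; String stands in for MyObject. */
final class MyObjectRegistry {
    private final Map<String, String> myObjectsMap = new LinkedHashMap<>();
    private List<String> sharedView; // cached read-only snapshot, rebuilt after changes

    void put(String key, String value) {
        myObjectsMap.put(key, value);
        sharedView = null; // invalidate the cache so it cannot go stale
    }

    /** Existing contract: callers get their own list and may sort or modify it. */
    List<String> getMyObjectList() {
        return new ArrayList<>(myObjectsMap.values());
    }

    /** New opt-in method: a shared, read-only snapshot for performance-sensitive callers. */
    List<String> getMySharedObjectList() {
        if (sharedView == null) {
            sharedView = Collections.unmodifiableList(new ArrayList<>(myObjectsMap.values()));
        }
        return sharedView;
    }
}
```

Invalidating the snapshot on every write is what keeps the cached list from drifting away from the map; callers of the old method are unaffected.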
But don't do it this way. I think this method is particularly bad:
// bad idea
public List<MyObject> getMyObjectList() {
if (this.myObjects== null) {
myObjects = new ArrayList<MyObject>(myObjectsMap.values());
}
return Collections.unmodifiableList(myObjects );
}
You have created a situation where it is very easy for myObjects to get out of sync with myObjectsMap. (What happens when an item is added to myObjectsMap after someone called getMyObjectList?) At the same time, you are making a copy of the list every time someone calls the method. So you just gave up whatever theoretical performance gains you had in the first place.
Anyway, good luck. Hope this helps.
If you can afford to test the applications, I'd go with the unmodifiableList. It will save you from other related issues in the future.
I am often in a situation where I have a method where something can go wrong but an exception would not be right to use because it is not exceptional.
For example:
I am designing a Monopoly game. The class Bank has a method buyHouse and a field which counts the number of houses left (there are 32 houses in Monopoly). Something that could go wrong is a player buying a house when there are 0 left. How should I handle this? Here are 3 approaches I can come up with.
1. public void buyHouse(Player player, PropertyValue propertyValue)
{
if(houseCount <= 0) throw new SomeException();
....
//Not really an exceptional situation
}
2. public boolean buyHouse(Player player, PropertyValue propertyValue)
{
if(houseCount <= 0) return false;
....
//This I think is the most normal approach but changing something
//and returning if it was a success seems bad practice to me.
}
3. public boolean housesLeft()
{
if(houseCount > 0) return true;
return false;
//Introducing a new method. But now I expect the client to call this method
//first before calling buyHouse().
}
What would you do?
I would do 3 and 1 together. The proper usage of the API is to check if there are houses left before buying one. If, however, the developer forgot to do so, then throw a runtime exception.
If it is a multi-threaded situation (where many people are buying a house simultaneously) it gets more complicated. In that case, I would consider a checked exception, or else a tryToBuyAHouse method that returns a boolean, while keeping a runtime exception on the buyHouse method.
This is very similar to the idea of popping an item off of an empty stack... it is exceptional. You are doing something that should fail.
Think of exceptional situations as cases where you want to notify the programmer that something has gone wrong and you do not want them to ignore it. Using a simple boolean return value isn't "right" since the programmer can just ignore it. Also the idea of having a method that should be called to check that there are houses available is a good idea. But remember that programmers will, in some cases, forget to call it. In that case the exception serves to remind them that they need to call the method to check that a house exists before acquiring it.
So, I would provide the method to check that there are houses, and expect that people will call it and use the true/false return value. In the event that they do not call that method, or ignore the return value, I would throw an exception so that the game does not get put into a bad state.
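A minimal sketch of that combination, assuming a stripped-down Bank (the player and property parameters from the question are left out, and IllegalStateException is just one reasonable choice of runtime exception):

```java
/** Minimal Bank sketch; the player and property parameters from the question are omitted. */
final class Bank {
    private int houseCount = 32;

    /** Callers are expected to check this before buying. */
    boolean housesLeft() {
        return houseCount > 0;
    }

    /** Throws if no houses remain, so a forgotten check cannot corrupt the game state. */
    void buyHouse() {
        if (houseCount <= 0) {
            throw new IllegalStateException("No houses left in the bank");
        }
        houseCount--;
    }
}
```

A well-behaved caller writes `if (bank.housesLeft()) bank.buyHouse();` and never sees the exception; it only fires when the check was skipped.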
I find the meaning of "exceptional" to be quite subjective. It means whatever you want it to mean. You're designing the interface to the function, you get to decide what is exceptional and what is not.
If you don't expect buyHouse to be invoked when houseCount is <= 0, then an exception here is fine. Even if you do expect it to be invoked, you can catch the exception in the caller to handle that situation.
If something works as expected 32 times in a row, and then fails to function as expected, I think that you could justify making it an exceptional condition if it was an isolated case.
Given the situation you describe, I think that using exceptions is not appropriate, as once 32 houses are sold the bank will continue to be out of them (this is the new "normal" state), and exception processing is actually pretty slow in Java compared to normal processing.
One thing you could do is more closely mirror the actual interaction. In Monopoly, the banker will simply tell you that you cannot buy a house if there are none left.
A potential model for this is as follows:
public House buy(Player player, PropertyValue propertyValue) {
House propertyHouse = null;
if (houseCount > 0) {
propertyHouse = new House(player, propertyValue);
houseCount--;
}
return propertyHouse;
}
That would also allow you to add behavior to the House object, and make the flow for requesting/buying a house a little more natural. If there are no houses available, you don't get one.
Either (1) or (2) is acceptable, depending on whether or not you consider "no houses to buy" a routine result or an exceptional condition.
(3) is a bad idea, for several reasons:
it breaks encapsulation (client has to know too much about internals of the Bank)
you'd still have to check for errors and do (1) or (2) in case the client screws up
it's problematic in multi-threaded situations
A couple of other options:
Your method could accept a number of houses requested parameter, and return the number of houses actually purchased, after checking the player's available balance and the number of houses available. Returning zero would be a perfectly acceptable possibility. You rely on calling code to check how many houses it actually got back, of course. (this is a variation on returning a boolean, where true/false indicate 1 or zero houses purchased, of course)
A variation on that theme would be to return a collection of House objects corresponding to the number of houses successfully purchased, which could be an empty collection. Presumably calling code would then be unable to act as if it had more House objects than you'd given it. (this is a variation on returning a House object, with null representing no houses purchased, and an object representing 1 house, and is often part of a general coding approach of preferring empty collections to null references)
Your method could return a HousePurchaseTransaction object which was itself queryable to determine the success or failure of the transaction, its actual cost, and so on.
A richer variation on that theme might be to make HousePurchaseTransaction abstract and derive two child classes: SuccessfulHousePurchase and FailedHousePurchase, so you could associate different behaviour with the two result conditions. Installing a house into a street might require you to pass a 'SuccessfulHousePurchase' object in order to proceed. (this avoids the danger of returning a null being the root of a later null reference error, and is a variant on the null object pattern)
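The last variation could look roughly like this. Everything beyond the three class names mentioned above (HouseBank, Street, the price parameter, the accessor names) is invented for the sketch.

```java
/** Result of a purchase attempt; all method names here are illustrative. */
abstract class HousePurchaseTransaction {
    abstract boolean isSuccessful();
}

final class SuccessfulHousePurchase extends HousePurchaseTransaction {
    private final int cost;
    SuccessfulHousePurchase(int cost) { this.cost = cost; }
    int getCost() { return cost; }
    @Override boolean isSuccessful() { return true; }
}

final class FailedHousePurchase extends HousePurchaseTransaction {
    private final String reason;
    FailedHousePurchase(String reason) { this.reason = reason; }
    String getReason() { return reason; }
    @Override boolean isSuccessful() { return false; }
}

final class HouseBank {
    private int houseCount;
    HouseBank(int houseCount) { this.houseCount = houseCount; }

    /** Never returns null: the caller always gets an object describing what happened. */
    HousePurchaseTransaction buyHouse(int price) {
        if (houseCount <= 0) {
            return new FailedHousePurchase("no houses left");
        }
        houseCount--;
        return new SuccessfulHousePurchase(price);
    }
}

final class Street {
    private int houses;

    /** Requires proof of a successful purchase, so a failed one cannot be installed. */
    void installHouse(SuccessfulHousePurchase purchase) { houses++; }

    int getHouses() { return houses; }
}
```

Because installHouse only accepts a SuccessfulHousePurchase, the compiler forces the caller to look at the result type before a house can be placed.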
In reality, I suspect the approach taken would depend on where you ended up allocating responsibility for placing the houses on the board, upgrading to hotels, enforcing even-build rules, limiting the number of houses purchased on a given street, and so on.
I would do something like this:
public boolean BuyHouse(Player player, PropertyValue propertyValue) {
// Get houseCount
if(houseCount <= 0) {
// Log this to your message queue that you want to show
// to the user (if it has a UI)
return false;
}
// Do other stuff if houses are left
}
PS: I am not familiar with java, I use C#
This question is hard to answer without the context of which entity has-a house. From a general design perspective, there is little semantic difference for the caller between (1) and (2) - both are try and check - but you are correct that (1) is to be shunned for wholly expectable state.
You define the rule and the exception here for the users of your API/methods:
housesLeft() can be called to check the number of houses left before buyHouse() is called. Calling buyHouse() when the number of houses left is zero is an exception.
It is similar to accessing an array element: you check the array length before you try to access it, otherwise an exception will be thrown.
So it should look like this:
if (housesLeft() > 0) buyHouse(...);
similar to
for (int i=0; i < arrayList.length; i++) System.out.println(arrayList[i]);
Remember that you can use
return houseCount > 0;
rather than
if(houseCount > 0) return true;
return false;
I'm in my first programming class in high school. We're doing our end of the first semester project.
This project only involves one class, but many methods. My question is about best practice with instance variables and local variables. It seems that it would be much easier for me to code using almost only instance variables. But I'm not sure if this is how I should be doing it or if I should be using local variables more (I would just have to have methods take in the values of local variables a lot more).
My reasoning for this is also because a lot of times I'll want to have a method return two or three values, but this is of course not possible. Thus it just seems easier to simply use instance variables and never having to worry since they are universal in the class.
I haven't seen anyone discuss this so I'll throw in more food for thought. The short answer/advice is don't use instance variables over local variables just because you think they are easier to return values. You are going to make working with your code very very hard if you don't use local variables and instance variables appropriately. You will produce some serious bugs that are really hard to track down. If you want to understand what I mean by serious bugs, and what that might look like read on.
Let's try to use only instance variables, as you suggest, to write two functions. I'll create a very simple class:
public class BadIdea {
public enum Color { GREEN, RED, BLUE, PURPLE };
public Color[] map = new Color[] {
Color.GREEN,
Color.GREEN,
Color.RED,
Color.BLUE,
Color.PURPLE,
Color.RED,
Color.PURPLE };
List<Integer> indexes = new ArrayList<Integer>();
public int counter = 0;
public int index = 0;
public void findColor( Color value ) {
indexes.clear();
for( index = 0; index < map.length; index++ ) {
if( map[index] == value ) {
indexes.add( index );
counter++;
}
}
}
public void findOppositeColors( Color value ) {
indexes.clear();
for( index = 0; index < map.length; index++ ) {
if( map[index] != value ) {
indexes.add( index );
counter++;
}
}
}
}
This is a silly program, I know, but we can use it to illustrate that using instance variables for things like this is a tremendously bad idea. The biggest thing you'll find is that those methods use all of the instance variables we have, and they modify indexes, counter, and index every time they are called. The first problem you'll find is that calling those methods one after the other can modify the answers from prior runs. So for example, if you wrote the following code:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
idea.findColor( Color.GREEN ); // whoops we just lost the results from finding all Color.RED
Since findColor uses instance variables to track returned values we can only return one result at a time. Let's try and save off a reference to those results before we call it again:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findColor( Color.GREEN ); // this causes red positions to be lost! (i.e. idea.indexes.clear())
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
In this second example we saved the red positions on the 3rd line, but the same thing happened!? Why did we lose them?! Because idea.indexes was cleared instead of allocated, there can only be one answer in use at a time. You have to completely finish using that result before calling the method again. Once you call a method again, the results are cleared and you lose everything. In order to fix this you'll have to allocate a new result each time so the red and green answers are separate. So let's clone our answers to create new copies of things:
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = new ArrayList<Integer>( idea.indexes ); // copy; List has no public clone()
int redCount = idea.counter;
idea.findColor( Color.GREEN );
List<Integer> greenPositions = new ArrayList<Integer>( idea.indexes );
int greenCount = idea.counter;
OK, finally we have two separate results. The results of red and green are now separate. But we had to know a lot about how BadIdea operated internally before the program worked, didn't we? We need to remember to clone the returns every time we call it to make sure our results don't get clobbered. Why is the caller forced to remember these details? Wouldn't it be easier if we didn't have to do that?
Also notice that the caller has to use local variables to remember the results, so while you didn't use local variables in the methods of BadIdea, the caller has to use them to remember results. So what did you really accomplish? You really just moved the problem to the caller, forcing them to do more. And the work you pushed onto the caller is not an easy rule to follow because there are so many exceptions to the rule.
Now let's try doing that with two different methods. Notice how I've been "smart" and I reused those same instance variables to "save memory" and kept the code compact. ;-)
BadIdea idea = new BadIdea();
idea.findColor( Color.RED );
List<Integer> redPositions = idea.indexes;
int redCount = idea.counter;
idea.findOppositeColors( Color.RED ); // this causes red positions to be lost again!!
List<Integer> greenPositions = idea.indexes;
int greenCount = idea.counter;
Same thing happened! Damn, but I was being so "smart" and saving memory, and the code uses fewer resources!!! The real peril of using instance variables like this is that calling methods is now order dependent. If I change the order of the method calls, the results are different even though I haven't really changed the underlying state of BadIdea. I didn't change the contents of the map. Why does the program yield different results when I call the methods in a different order?
idea.findColor( Color.RED );
idea.findOppositeColors( Color.RED );
Produces a different result than if I swapped those two methods:
idea.findOppositeColors( Color.RED );
idea.findColor( Color.RED );
These types of errors are really hard to track down, especially when those lines aren't right next to each other. You can completely break your program by just adding a new call anywhere between those two lines and get wildly different results. Sure, when we're dealing with a small number of lines it's easy to spot errors. But in a larger program you can waste days trying to reproduce them even though the data in the program hasn't changed.
And this only looks at single-threaded problems. If BadIdea were used in a multi-threaded situation, the errors could get really bizarre. What happens if findColor() and findOppositeColors() are called at the same time? Crash, all your hair falls out, death, space and time collapse into a singularity and the universe is swallowed up? Probably at least two of those. Threads are probably over your head right now, but hopefully we can steer you away from doing bad things now so that when you do get to threads those bad practices don't cause you real heartache.
Did you notice how careful you had to be when calling the methods? They overwrote each other, they shared memory possibly randomly, you had to remember the details of how it worked on the inside to make it work on the outside, changing the order in which things were called produced very big changes in the next lines down, and it could only work in a single-threaded situation. Doing things like this will produce really brittle code that seems to fall apart whenever you touch it. The practices I showed contributed directly to the code being brittle.
While this might look like encapsulation, it is the exact opposite because the technical details of how you wrote it have to be known to the caller. The caller has to write their code in a very particular way to make it work, and they can't do it without knowing about the technical details of your code. This is often called a Leaky Abstraction because the class is supposed to hide the technical details behind an abstraction/interface, but the technical details leak out, forcing the caller to change their behavior. Every solution has some degree of leakiness, but using any of the above techniques guarantees that, no matter what problem you are trying to solve, it will be terribly leaky. So let's look at the GoodIdea now.
Let's rewrite using local variables:
public class GoodIdea {
...
public List<Integer> findColor( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[i] == value ) {
results.add( i );
}
}
return results;
}
public List<Integer> findOppositeColors( Color value ) {
List<Integer> results = new ArrayList<Integer>();
for( int i = 0; i < map.length; i++ ) {
if( map[i] != value ) {
results.add( i );
}
}
return results;
}
}
This fixes every problem we discussed above. I know I'm not keeping track of counter or returning it, but if I did, I could create a new class and return that instead of List. Sometimes I use the following object to return multiple results quickly:
public class Pair<K,T> {
public K first;
public T second;
public Pair( K first, T second ) {
this.first = first;
this.second = second;
}
}
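For the counter case mentioned above, a dedicated result class is usually clearer than a generic Pair. What follows is only a sketch: the names ColorSearchResult and ColorFinder are made up for this example, and the map contents are copied from BadIdea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical result type that carries both the positions and the count.
final class ColorSearchResult {
    final List<Integer> positions;
    final int count;

    ColorSearchResult(List<Integer> positions) {
        // Defensive copy, so later changes by the caller cannot affect this result.
        this.positions = Collections.unmodifiableList(new ArrayList<Integer>(positions));
        this.count = this.positions.size();
    }
}

// Hypothetical stand-in for GoodIdea that returns the result type above.
class ColorFinder {
    enum Color { GREEN, RED, BLUE, PURPLE }

    private final Color[] map = {
        Color.GREEN, Color.GREEN, Color.RED, Color.BLUE,
        Color.PURPLE, Color.RED, Color.PURPLE
    };

    ColorSearchResult findColor(Color value) {
        List<Integer> results = new ArrayList<Integer>(); // local, fresh on every call
        for (int i = 0; i < map.length; i++) {
            if (map[i] == value) {
                results.add(i);
            }
        }
        return new ColorSearchResult(results);
    }
}
```

Each call hands back its own object, so asking for green afterwards cannot disturb a red result the caller is still holding.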
Long answer, but a very important topic.
Use instance variables when it's a core concept of your class. If you're iterating, recursing or doing some processing, then use local variables.
When you need to use two (or more) variables in the same places, it's time to create a new class with those attributes (and appropriate means to set them). This will make your code cleaner and help you think about problems (each class is a new term in your vocabulary).
One variable may be made a class when it is a core concept. For example real-world identifiers: these could be represented as Strings, but often, if you encapsulate them into their own object they suddenly start "attracting" functionality (validation, association to other objects, etc.)
Also (not entirely related) is object consistency - an object is able to ensure that its state makes sense. Setting one property may alter another. It also makes it far easier to alter your program to be thread-safe later (if required).
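As a rough illustration of an identifier "attracting" functionality and guarding its own consistency, here is a small sketch. The StudentId name and the six-digit rule are invented for the example; substitute whatever your real identifier requires.

```java
// Hypothetical identifier class; the six-digit format is an assumption.
final class StudentId {
    private final String value;

    StudentId(String value) {
        // Validation lives with the concept, so an invalid id can never exist.
        if (value == null || !value.matches("\\d{6}")) {
            throw new IllegalArgumentException("expected exactly six digits");
        }
        this.value = value;
    }

    String value() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof StudentId && ((StudentId) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}
```

Compared with passing a bare String around, every method that receives a StudentId can rely on it already being well formed.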
Local variables internal to methods are always preferred, since you want to keep each variable's scope as small as possible. But if more than one method needs to access a variable, then it's going to have to be an instance variable.
Local variables are more like intermediate values used to reach a result or compute something on the fly. Instance variables are more like attributes of a class, like your age or name.
The easy way: if the variable must be shared by more than one method, use instance variable, otherwise use local variable.
However, good practice is to use as many local variables as possible. Why? For your simple project with only one class, there is no difference. For a project that includes a lot of classes, there is a big difference. Instance variables represent the state of your class. The more instance variables your class has, the more states it can be in, and the more complex the class is, the harder it is to maintain and the more error prone your project becomes. So good practice is to use as many local variables as possible to keep the state of the class as simple as possible.
Short story: if and only if a variable needs to be accessed by more than one method (or from outside the class), create it as an instance variable. If you need it only locally, in a single method, it has to be a local variable.
Instance variables are more costly than local variables.
Keep in mind: instance variables are initialized to default values while local variables are not.
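A tiny sketch of that difference (the class and field names are made up for illustration): fields get a default value when the object is created, while the compiler refuses to let you read a local variable before it has been assigned.

```java
// Hypothetical class used only to show default initialization.
class Defaults {
    int count;      // instance variable: defaults to 0
    boolean ready;  // instance variable: defaults to false
    String label;   // instance variable: defaults to null

    int localMustBeAssigned() {
        int total;  // local variable: has no default value
        total = 5;  // without this assignment, the next line would not compile
        return total + count;
    }
}
```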
Declare variables to be scoped as narrowly as possible. Declare local variables first. If this isn't sufficient, use instance variables. If this isn't sufficient, use class (static) variables.
If you need to return more than one value, return a composite structure, like an array or an object.
Try to think about your problem in terms of objects. Each class represents a different type of object. Instance variables are the pieces of data that a class needs to remember in order to work, either with itself or with other objects. Local variables should just be used for intermediate calculations, data that you don't need to save once you leave the method.
Try not to return more than one value from your methods in the first place. If you can't, and in some cases you really can't, then I would recommend encapsulating the values in a class. Only as a last resort would I recommend changing another variable inside your class (an instance variable). The problem with the instance variable approach is that it increases side effects: for example, you call method A in your program and it modifies one or more instance variables. Over time, that leads to increased complexity in your code, and maintenance becomes harder and harder.
When I have to use instance variables, I try to make them final and initialize them in the class constructors, so side effects are minimized. This programming style (minimizing the state changes in your application) should lead to better code that is easier to maintain.
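Here is a minimal sketch of that style, assuming a made-up GridPoint class: every field is final and set in the constructor, and an operation that would otherwise change state returns a new object instead.

```java
// Hypothetical immutable class; the name and fields are illustrative only.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    // No side effects: the receiver is left untouched and a new point is returned.
    GridPoint translate(int dx, int dy) {
        return new GridPoint(x + dx, y + dy);
    }
}
```

Because nothing can modify a GridPoint after construction, the order in which its methods are called cannot change what they return.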
Generally variables should have minimal scope.
Unfortunately, in order to build classes with minimized variable scope, one often needs to do a lot of method parameter passing.
But if you follow that advice all the time, perfectly minimizing variable scope, you may end up with a lot of redundancy and method inflexibility with all the required objects passed in and out of methods.
Picture a code base with thousands of methods like this:
private ClassThatHoldsReturnInfo foo(OneReallyBigClassThatHoldsCertainThings big,
AnotherClassThatDoesLittle little) {
LocalClassObjectJustUsedHere here;
...
}
private ClassThatHoldsReturnInfo bar(OneMediumSizedClassThatHoldsCertainThings medium,
AnotherClassThatDoesLittle little) {
...
}
And, on the other hand, imagine a code base with lots of instance variables like this:
private OneReallyBigClassThatHoldsCertainThings big;
private OneMediumSizedClassThatHoldsCertainThings medium;
private AnotherClassThatDoesLittle little;
private ClassThatHoldsReturnInfo ret;
private void foo() {
LocalClassObjectJustUsedHere here;
....
}
private void bar() {
....
}
As the code grows, the first way may minimize variable scope best, but it can easily lead to a lot of method parameters being passed around. The code will usually be more verbose, and this can add complexity as one refactors all these methods.
Using more instance variables can reduce the complexity of lots of method parameters being passed around and can give a flexibility to methods when you are frequently reorganizing methods for clarity. But it creates more object state that you have to maintain. Generally the advice is to do the former and refrain from the latter.
However, very often, and it may depend on the person, one can more easily manage state complexity compared with the thousands of extra object references of the first case. One may notice this when business logic within methods increases and organization needs to change to keep order and clarity.
Not only that. When you reorganize your methods to keep clarity and make lots of method parameter changes in the process, you end up with lots of version control diffs which is not so good for stable production quality code. There is a balance. One way causes one kind of complexity. The other way causes another kind of complexity.
Use the way that works best for you. You will find that balance over time.
I think this young programmer has some insightful first impressions for low maintenance code.
Use instance variables when:
Two functions in the class need the same value
or
The state is not expected to change. For example: immutable object, DTO, LinkedList, those with final variables
or
It is the underlying data on which actions are performed. For example: final in arr[] in the PriorityQueue.java source code file
or
It is used only once and its state is expected to change, but the function that uses it should have an empty parameter list. For example: HTTPCookie.java Line: 860 hashcode() function uses 'path variable'.
Similarly, use a local variable when none of these conditions match, specifically if the role of the variable ends once the stack frame is popped. For example: Comparator.compare(o1, o2);