OO Design Issue with IDs - java

Say as an example I have a Player class that has a race via a Race class. These races are fixed in number and are loaded into an array which can be accessed statically.
My question is whether the Player class should have an index ID number which would then need to call the static function getRaceByID(int) to retrieve the Race class to do some internal calculations. Now I could get around having to do this if I was to have the race reference directly in the Player class, but then saving the player to a file becomes problematic. I only want a reference to the Race be stored along with the Player data. Like an ID.
I want to avoid storing a copy of the Race data and instead just reference it. Is there anything I should be doing differently? Are there any patterns to address something like this? Databases deal with IDs, but it doesn't seem to work very well in OO development. Any help is appreciated, thanks.
class Player
{
    Race race;
}
In this case I would need to compare this race to the races in my static array so that I can properly write out the index ID. Another solution is to store the ID in the Race class itself so that I can reference it directly from the Race class like so:
race.getID();
Or would it be better to go with something like this to enforce this relationship:
class Player
{
    int raceID;
}
Race r = MyFile.getRaceByID(raceID);
// can now use race

What you have in memory does not have to be what you store in a database.
The details will depend upon the language you're using and the Object-to-Database technology.
If you have
Player {
Race myRace;
// etc
}
This does not necessarily imply that you have a copy of the Race; in some languages this would imply a "reference" or "pointer" to a Race.
When you come to store in the database it would be quite normal for just the Id of the race to be stored.
In other words, you don't need to compromise the OO design to achieve the effect you want.
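To make this concrete, here is a minimal Java sketch (Java being the question's tag) of keeping a Race reference in memory while persisting only its ID. The names Races, byId, toRecord and fromRecord, the sample races, and the "name;raceId" record format are all assumptions made up for illustration, not part of the question.

```java
// Hypothetical race type; the ID is stored on the race itself.
final class Race {
    private final int id;
    private final String name;

    Race(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

// Hypothetical registry standing in for the question's static array.
final class Races {
    // Assumption: each race's array position matches its ID.
    private static final Race[] ALL = {
        new Race(0, "Human"),
        new Race(1, "Elf")
    };

    static Race byId(int id) { return ALL[id]; }
}

final class Player {
    private final String name;
    private final Race race; // a reference in memory, not a copy

    Player(String name, Race race) {
        this.name = name;
        this.race = race;
    }

    Race getRace() { return race; }

    // Only the race ID is written out, never the Race data itself.
    String toRecord() { return name + ";" + race.getId(); }

    // Loading resolves the stored ID back to the shared Race instance.
    static Player fromRecord(String record) {
        String[] parts = record.split(";");
        return new Player(parts[0], Races.byId(Integer.parseInt(parts[1])));
    }
}
```

With this shape, the in-memory design stays object-oriented and the ID only appears at the save/load boundary.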

Of some relevance to using IDs in an OO design: Modern C++ Design by Alexandrescu has an excellent chapter on object factories. If you have a big switch statement on your ID, then you could probably benefit from reading this chapter, as it will show you the OO way to handle that sort of thing. As the book says:
The Shape-Drawing example [of polymorphism] is often encountered in C++ books, including Bjarne Stroustrup's classic (Stroustrup 1997). However, most introductory C++ books stop when it comes to loading graphics from a file, exactly because the nice model of having separate drawing objects breaks.... A straightforward implementation is to require each Shape-derived object to save an integral identifier at the very beginning. Each object should have its own unique ID. Then reading a file would look like this: [code with big switch] .... The only problem [with this type of code] is that it breaks the most important rules of object orientation: [E.G.,] It collects in a single source file knowledge about all Shape-derived classes in the program....

There's no reason that your ID based concept won't work. It doesn't violate any OO principles, and has several benefits. I see no reason NOT to go that route, especially if you've already determined that it would work well for you.
As an aside, if you want to avoid having static_races[player.race_id] scattered throughout your code, a simple wrapper function would suffice in maintaining a more "OO feel" (pseudocode, since you haven't stated a language):
function Race Player::GetRace() {
return static_races[this.race_id];
}
Simple, but effective. No need to overcomplicate things.
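Since the question is tagged java, the same wrapper might look roughly like this in Java. This is only a sketch: the STATIC_RACES table and the sample races are hypothetical stand-ins for the question's static array.

```java
final class Race {
    final String name;

    Race(String name) { this.name = name; }
}

final class Player {
    // Hypothetical static table standing in for the question's race array.
    static final Race[] STATIC_RACES = { new Race("Human"), new Race("Elf") };

    private final int raceId; // this is what would be saved to the file

    Player(int raceId) { this.raceId = raceId; }

    int getRaceId() { return raceId; }

    // The lookup lives in one place instead of being repeated by callers.
    Race getRace() { return STATIC_RACES[raceId]; }
}
```

Callers then write player.getRace() and never touch the table or the ID directly.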

What is the most appropriate type of variable to use when coding a game's location? Boolean/String/Some other? (java)

First post so I hope this is an appropriate type question for this site. If not I'd appreciate it if someone could direct me to a more appropriate place. I'm extremely new at programming. I did a bit in high school and have recently decided to relearn starting with making a text-based survival game in Java7 using Eclipse.
Right now I'm coding the location superclass. The particular function I need help with is this: it needs to be able to keep track of which of 9 regions the user currently "is in" (which is then used in a large number of other classes for many various purposes. The location class also includes functionality for accepting user input to move to a new region, among various other things.) The way I started this was by making a boolean variable for each region and whenever a transition should occur that variable is set to true. But now I'm wondering if this is the most efficient way to do this. I have to take String inputs, run a method to standardize various acceptable answers into one, and then run it through a switch statement that makes the corresponding boolean variable true?
Would it be simpler to keep track of the location with a single String variable that gets set to whatever region the player is in? Or would that be more likely to cause errors or complications when coding? Would an array better suit this need? edit: (I just want to thank you guys for being such an open and helpful community. It's really appreciated.)
BIG EDIT: I wanted to further elaborate on what the regions will eventually do. In each region there will eventually be a handful of places the user can go to that are generic, with a small number of places unique to each location. Other major superclasses would be altered depending on what region the user is in (example: my "encounters" superclass would have variables that dictate how likely certain encounters are to happen (e.g. the chance of a hostile attack) and these variables would be altered depending on the region) but also by other instances (the "Time" superclass would keep track of the day and time of day, which would also affect the variables in "encounters"). The current plan was to make a class for each generic place (e.g. Walmart, technology store, grocery, public park, etc.). They would contain different properties depending on the region and would also affect classes like "encounters". I was going to have their properties defined by if/else & switch statements depending on what region the user was in. But now I'm realizing it would make more sense to define their properties when I create the object.
While a lot of people are steering me to enums, some are also suggesting I make classes for each region (and I am also hearing about interfaces). If I were to go with the 2nd route I have 3 questions:
(a) If the region classes were all subclasses of "Location", then wouldn't I have a problem creating objects for all the generic places inside the region classes (e.g. Walmarts) because the Walmart class can only belong to one superclass? (If not, what is the difference between an object being created in a class and the actual relationship between a superclass and its subclasses?)
(b) If I initialized each region as an object instead of simply recording it with a variable, how would I achieve the original task of remembering which region the user is in (for functions ranging from simply printing the region out to making alterations to variables in classes like "encounters")? Wouldn't I still need to have some sort of variable to identify the region? And if so, then what practical purpose does creating classes for the regions accomplish? (I can see this might still let me make the code cleaner by housing the variables that interact with "encounters" instead of having to use if/else/switch statements inside the "encounters" class (also, in this case, how could I make the variables in the region classes interact with the variables in "encounters", since neither belongs to the other?) but anything else?)
(c) Would it make more sense to create classes for every region, or a single region class that gets defined differently when initialized, and WHY?
Finally, I know I may have asked too many questions but could someone please explain to me the different utilities found in enums and interfaces (I'm especially interested in hearing about enums) and now that you know a little bit more, should I be using enums, interfaces, or some sort of classes for the regions? Thank you guys so much!
Enum is very recommended, as stated by Vasily Liaskovsky.
Using int is a great way as well. For example:
int currentRegion;
static final int region1 = 0;
static final int region2 = 1;
static final int region3 = 2;
etc...
Make sure region1 etc. are declared final so their IDs cannot be changed afterwards. Declaring them static could save memory if you're using multiple location superclass objects, and also makes them easier to access from outside the class.
This way, to check whether you're in a certain region, just use an if statement:
if(currentRegion == region1) {}
To set it:
currentRegion = region1;
Simple as that
I disagree with the usage of an enum here. An enum is great, but not extendable. What if you want to add another region?
So just create classes, and pass them around. They might hold some form of string as identifier (but you should load the proper name from a file that can be localized, anyways).
With a proper class, you can easily add new transitions between regions (make your region class a graph) and much more.
Region current = ...;
List<Transition> neighbours = current.getNeighbours();
for (Transition t : neighbours)
    System.out.println("To the " + t.getDirection() + " is the " + t.getTargetName());
// prints e.g. "To the north is the shadowy jungle"
There are a lot of ways to do this, and in an OOP language you should really try to get into the mindset of using objects instead of setting integer flags or the like.
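As a rough sketch of what such a region graph could look like in Java: only getNeighbours, getDirection and getTargetName come from the snippet above, while connect, describeExits, getTarget and the region names are hypothetical additions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// An edge of the graph: a direction leading to a target region.
final class Transition {
    private final String direction;
    private final Region target;

    Transition(String direction, Region target) {
        this.direction = direction;
        this.target = target;
    }

    String getDirection() { return direction; }
    Region getTarget() { return target; }
    String getTargetName() { return target.getName(); }
}

// A node of the graph: a region that knows its outgoing transitions.
final class Region {
    private final String name;
    private final List<Transition> neighbours = new ArrayList<>();

    Region(String name) { this.name = name; }

    String getName() { return name; }

    // Hypothetical helper for wiring regions together.
    void connect(String direction, Region target) {
        neighbours.add(new Transition(direction, target));
    }

    List<Transition> getNeighbours() {
        return Collections.unmodifiableList(neighbours);
    }

    // Builds one line of text per exit.
    String describeExits() {
        StringBuilder sb = new StringBuilder();
        for (Transition t : neighbours) {
            sb.append("To the ").append(t.getDirection())
              .append(" is the ").append(t.getTargetName()).append('\n');
        }
        return sb.toString();
    }
}
```

The player's current location is then just a field of type Region, and adding a tenth region means creating one more object rather than touching every switch statement.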
Take a look at enum.
If the list of 9 regions should not grow as the game develops, you can describe each of them in a hardcoded fashion while still utilizing the power of objects. Enums can have custom properties and methods, weaving them into your architecture, and enums also provide some extra benefits such as == comparison and use in switch blocks.
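A small sketch of an enum carrying its own properties, assuming made-up region names and encounter chances (none of these values come from the question):

```java
// Hypothetical regions and encounter chances, for illustration only.
enum Region {
    DOWNTOWN("Downtown", 0.30),
    SUBURBS("Suburbs", 0.10),
    HARBOR("Harbor", 0.20);

    private final String displayName;
    private final double hostileEncounterChance;

    Region(String displayName, double hostileEncounterChance) {
        this.displayName = displayName;
        this.hostileEncounterChance = hostileEncounterChance;
    }

    String getDisplayName() { return displayName; }
    double getHostileEncounterChance() { return hostileEncounterChance; }
}

final class Location {
    // A single field replaces nine separate boolean flags.
    private Region current = Region.DOWNTOWN;

    Region getCurrent() { return current; }

    void moveTo(Region next) { current = next; }

    // Enums can be used directly in switch blocks.
    String describe() {
        switch (current) {
            case DOWNTOWN:
                return "Tall buildings surround you.";
            case HARBOR:
                return "You smell salt water.";
            default:
                return "Quiet streets stretch out ahead.";
        }
    }
}
```

Region.valueOf(String) can turn a standardized user input such as "HARBOR" into the matching constant, and it throws IllegalArgumentException for unknown names.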
EDIT
I don't understand why this future addition might make enums a less desirable route
The only way to add an option to an enum is to rewrite its class source code. That is, enum options are defined statically, and in larger projects where developers have to deal with product versions, compatibility, delivery to end users, etc., this can be a pain. In fact, any change to the source code of a published project is undesirable, since it requires recompilation and a full rebuild of at least one (in the best case) application module.
The way to deal with this is to move the modifiable data into some resource (a file, database table, plugin, or anything else easily modifiable without a full rebuild) and have your application initialize itself from it at startup. Since from this point on your program no longer knows that data in advance, statically, there is no way you could define an enum describing it. In this scenario custom classes (Polygnome's answer) will do the job. Your program reads the resource, creates and initializes objects at runtime that describe your configuration, and uses dynamic data.
IMHO, there is almost always a tradeoff between flexibility and complexity. You gain the flexibility and freedom to modify the region list, but you have to deal with the complexity of a dynamic solution. Or you decide to use much simpler enums, understanding their limited extensibility.
Btw, in order of growing flexibility (and complexity):
raw primitives (int/String) | enums | custom classes
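For the custom-class end of that scale, the data-driven approach described above might be sketched like this. The "id=Display Name" line format, the RegionRegistry class and its method names are assumptions chosen for the example, not an established convention.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Region {
    private final String id;
    private final String displayName;

    Region(String id, String displayName) {
        this.id = id;
        this.displayName = displayName;
    }

    String getId() { return id; }
    String getDisplayName() { return displayName; }
}

// Hypothetical registry populated at startup instead of at compile time.
final class RegionRegistry {
    private final Map<String, Region> byId = new LinkedHashMap<>();

    // Assumption: each non-blank line looks like "id=Display Name".
    static RegionRegistry fromLines(List<String> lines) {
        RegionRegistry registry = new RegionRegistry();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Bad region line: " + line);
            }
            String id = trimmed.substring(0, eq).trim();
            String name = trimmed.substring(eq + 1).trim();
            registry.byId.put(id, new Region(id, name));
        }
        return registry;
    }

    // Returns null when no region with that ID was loaded.
    Region get(String id) { return byId.get(id); }

    int size() { return byId.size(); }
}
```

The lines could come from a text file (for instance via java.nio.file.Files.readAllLines), so adding a region means editing the file rather than recompiling.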

OOP: Which class should own a method? [closed]

I’m having trouble understanding how classes relate to their methods. Is a method something that the object does, or something that’s done to it? Or is this a different concept entirely?
Specifically, in a library’s software system, should the borrow() method belong to the class representing the library patron, or the class representing the item that the patron is borrowing? My intuition is that it should read like patron.borrow(copy), like English sentence structure, subject.verb(object); but my instructor says that’s Wrong, and I don’t understand why he would have borrow() belong to the Copy class (and he doesn’t really explain things too well). I’m not looking for justification, but can someone just explain the proper relationship?
Edit: This question was closed as “off topic”. I don’t understand. Are software design questions not appropriate for this site?
subjective :) but honestly, I'd go with the Information Expert Pattern and say something like
library.lend(item, patron)
The library contains the information about the items it has (perhaps in its catalog).
The library lends the item to the patron (which it knows because it registers them)
Not sure how your instructor sees this, but this is the level of 'abstraction' (software objects mimicking real world entities) that would make sense for your scenario.
You should not confuse the idea of OOP with one specific incarnation like Java or C++.
This limit "methods are a property of the object" is not part of the OOP idea, but just of some implementations and as you discovered it doesn't scale well.
How many methods should an "integer number" object have? What is more logical... myfile.write(myint) or myint.write(myfile)? There is really no good general answer to this. The idea of a method being part of a single object is a special case and sometimes the bending needed to fit the problem to this solution can become noticeable or even close to a showstopper. The answer is really totally acceptable only when a method has no parameters except the object being processed: single dispatch is a perfect answer only when there is a single type involved.
In other languages you have a separation between objects and methods, so for example you have the file object, the integer object and a method write(myfile, myint) that describes what to do when the operation is needed... and this method is neither part of the file nor of the integer.
Some generic words first.
Software construction is not something which should be governed by English language rules or "beauty" or whatever, it's engineering discipline. Think of whether your design solves the problem, whether it will be maintainable, whether it will be testable, whether it will be possible to parallelize development and so on. If you want something more formalized take a look at the "On the Criteria To Be Used in Decomposing Systems into Modules" by D. L. Parnas.
As for your library example: imagine you have a Copy outside of a library; should it have a borrow method then? How is the borrowing registered? Are you OK with either the Copy or the Patron class being responsible for data storage? It looks more appropriate to put borrow into a Library class. Responsibilities will be clearly divided: you wouldn't need to know much about borrowing to implement Copy and Patron, and you wouldn't need many details about them to implement Library.
Public methods exposed from a class are the tasks that can be performed on the entity.
That way the class would only encapsulate its behavior.
For example:
If I say
Computer.TurnOn()
the method will only work on the computer system.
If I instead say
SomeOne.TurnonComputer()
then that someone now has the responsibility of turning on the computer (setting the related properties of the computer), which means we are breaking encapsulation and scattering the class's properties all over the place.
As @Ryan Fernandes said, the lend/borrow operation cannot live with either the patron or the book. It has to live with some class that knows about the status of all the books and patrons of the library. For example: are there pending reservations against a book? How many copies are available? Has this patron paid all the fees? Is he eligible for this book? So typically this should be in a Library or a LibraryService class.
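A minimal Java sketch of that arrangement, where the Library owns the loan records and the eligibility check. The hasUnpaidFees flag, giveBack and borrowerOf are hypothetical details invented for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class Patron {
    final String name;
    boolean hasUnpaidFees; // hypothetical eligibility flag

    Patron(String name) { this.name = name; }
}

final class Copy {
    final String title;

    Copy(String title) { this.title = title; }
}

// The Library is the only class that knows who holds what.
final class Library {
    private final Set<Copy> available = new HashSet<>();
    private final Map<Copy, Patron> loans = new HashMap<>();

    void addCopy(Copy copy) { available.add(copy); }

    // Returns true if the loan was recorded, false if it was refused.
    boolean lend(Copy copy, Patron patron) {
        if (patron.hasUnpaidFees || !available.contains(copy)) {
            return false;
        }
        available.remove(copy);
        loans.put(copy, patron);
        return true;
    }

    // Makes a lent copy available again; ignores copies that are not on loan.
    void giveBack(Copy copy) {
        if (loans.remove(copy) != null) {
            available.add(copy);
        }
    }

    // Returns null when the copy is not currently on loan.
    Patron borrowerOf(Copy copy) { return loans.get(copy); }
}
```

Neither Copy nor Patron needs to know anything about lending rules here, which is the division of responsibilities the answers above argue for.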
The point of OOP is to create polymorphic functions that, in each implementation, deal with a defined set of data which obey specific invariants.
It follows that a method which alters an object should be defined in the class of that object. It matters less where code that is purely functional lives, but it should probably live on the type of its input (if it takes a single input) or on its output.
In your example, if borrow alters data in copy, then it should live there. If, however, you model the loan status of a book by it being held in a particular collection (either in a patron, or in a collection for the library), it would make more sense to put borrow on the holder classes. That latter design, however, runs the risk that a copy could be in more than one collection, so you would want to put some information (and a corresponding method) on the copy as well.
I'm not sure of the exact justification, but you can think of it this way: if multiple patients visit a doctor, only the doctor knows when to call in the next patient, so the next method would be part of the Doctor's responsibility, though it's tempting to think that next should be part of the Patient's responsibility, since he is the one who goes next. In the same way, when a library book is to be issued, it should be the responsibility of the book rather than the patron, as the book (the RESOURCE) knows when it will be free.
Is a method something that the object does, or something that’s done to it? Or is this a different concept entirely?
Let me clear up something about classes and objects first. Classes are generally used to denote a particular category, like
Cars, not Ferrari or Porsche
Fruits, not Banana or Apple
So it's a Ferrari that is driven, and a banana that is eaten, not their class.
It's always an object that has properties and behavior.
Now to your case specifically.
The borrow() method is an action/behavior performed by a person object on a book object, whose records are kept by another object, the library system itself.
A good way to represent this in an OO way, for me, would be
library.borrow(new Book("book title"), new Person("starx"));
Just for fun, what do you think about this?
Person starx = new Person("starx");
Book title1 = new Book("title1");
Library libraryName = new Library("libraryname");
libraryName.addBook(title1);
if (starx.request(title1, libraryName)) {
    starx.take(libraryName.lend(title1, starx));
}
I guess it can go either way. There is no hard and fast rule for it. The idea is to group functions in a way that makes logical sense. To me, Patron#borrow(BookCopy) makes as much sense as BookCopy#borrow(Patron). Or you may have a class LibManager.borrow(BookCopy, Patron).
Your instructor's right. Well, actually, he's wrong. I don't know.
My point is, for questions such as this, there are often no firm general answers one way or another. It largely comes down to what works best in your particular case. Go with whatever's easiest to code - it'll be the easiest to maintain. And, by "easiest to code", I suggest also taking into account the intended users of the classes (beyond just your Library, Copy and Person classes).
I was thinking about precisely that today. I came to this conclusion:
Whichever makes more sense in the appropriate context.

Implicit vs Explicit data structures

Lately I've been struggling with some recurrent design problem which I don't know how to solve elegantly.
Say I am making a game with a couple of players and for each player some connected pieces. Together these pieces form a semi-complex collection or structure. Now I could implement this structure in 2 ways: Either store the structure implicitly through pointers in the pieces themselves i.e:
class BigPiece extends Piece {
Piece opposingPiece, nextPiece, previousPiece, index;
}
Or I could implement this structure in a collection class and keep the information centralized:
class SomeCollection {
    SomeOtherCollection<Collection<Piece>> collection
        = new SomeOtherCollection<Collection<Piece>>();
    public SomeCollection() {
        collection.add(new PieceCollection<Piece>());
        collection.add(new PieceCollection<Piece>());
        collection.add(new PieceCollection<Piece>());
    }
    public Piece getPiece(int playerIndex, int pieceIndex) {
        return collection.get(playerIndex).get(pieceIndex);
    }
    public Piece getOpposingPiece(int playerIndex, int pieceIndex) {
        int nextPlayerIndex = collection.listIterator(playerIndex).nextIndex();
        return this.collection.get(nextPlayerIndex).get(pieceIndex);
    }
}
Now I usually favor the second one, but that's just based on my guts and I don't have that much experience in class design, especially not with big applications. I can see pros and cons on both sides.
The problem I usually have with the first solution is that you still have to create the associations in some builder or factory which actually links the objects together. This doesn't seem very robust to me. Who can reassure me all the pointers are actually correct throughout the application's lifetime?
The second solution centralizes the data more. This really dumbs down the higher classes though (such as individual Pieces). The problem I usually have with this is that whenever I want to traverse this collection, I have to do it on some lower level. You can't ask a piece 'Hey, what's your opposing piece?'. No, you'd have to get a game object to get a pointer to your collection which you then ask what the opposing piece is. This makes more 'managery' classes which collect data from all around your application (method chaining =( ) to finally implement your algorithm. This seems to violate the Law of Demeter.
Sure I could add a pointer to the corresponding collection from each individual piece as well, but I don't know if that's such a good idea since this only seems to be duplicate information.
My personal recommendation is moreso the second option as opposed to the first. As you pointed out, a piece shouldn't (at least in this context) know what its opposing/next/previous piece is.
A manager class would make more logical sense to better facilitate communication between the classes instead of pieces having references to other pieces. I admit I don't fully know about the Law of Demeter but Wikipedia leads me to believe it is all about encapsulation which the manager classes would actually help as well!
I don't think Pieces (again, in this context) should be able to, say, move another piece. However a manager class would logically want to.
That is my suggestion, I hope it helps!
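For what it's worth, here is a small runnable sketch of the manager-style option using plain java.util lists. The Board name, the piece labels, and the rule that the "opposing" piece is the same slot of the next player (wrapping around) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class Piece {
    final String label;

    Piece(String label) { this.label = label; }
}

// Hypothetical manager class that owns the whole structure.
final class Board {
    // Indexed as pieces.get(playerIndex).get(pieceIndex).
    private final List<List<Piece>> pieces = new ArrayList<>();

    Board(int players, int piecesPerPlayer) {
        for (int p = 0; p < players; p++) {
            List<Piece> row = new ArrayList<>();
            for (int i = 0; i < piecesPerPlayer; i++) {
                row.add(new Piece("P" + p + "-" + i));
            }
            pieces.add(row);
        }
    }

    Piece getPiece(int playerIndex, int pieceIndex) {
        return pieces.get(playerIndex).get(pieceIndex);
    }

    // Assumption: the opposing piece sits in the same slot of the next player.
    Piece getOpposingPiece(int playerIndex, int pieceIndex) {
        int nextPlayerIndex = (playerIndex + 1) % pieces.size();
        return pieces.get(nextPlayerIndex).get(pieceIndex);
    }
}
```

Because the relationships are computed from indices, there are no per-piece pointers that could drift out of sync over the application's lifetime.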

Empirical data on the effects of immutability?

In class today, my professor was discussing how to structure a class. The course primarily uses Java and I have more Java experience than the teacher (he comes from a C++ background), so I mentioned that in Java one should favor immutability. My professor asked me to justify my answer, and I gave the reasons that I've heard from the Java community:
Safety (especially with threading)
Reduced object count
Allows certain optimizations (especially for garbage collector)
The professor challenged my statement by saying that he'd like to see some statistical measurement of these benefits. I cited a wealth of anecdotal evidence, but even as I did so, I realized he was right: as far as I know, there hasn't been an empirical study of whether immutability actually provides the benefits it promises in real-world code. I know it does from experience, but others' experiences may differ.
So, my question is, have there been any statistical studies done on the effects of immutability in real-world code?
I would point to Item 15 in Effective Java. The value of immutability is in the design (and it isn't always appropriate - it is just a good first approximation) and design preferences are rarely argued from a statistical point of view, but we have seen mutable objects (Calendar, Date) that have gone really bad, and serious replacements (JodaTime, JSR-310) have opted for immutability.
The biggest advantage of immutability in Java, in my opinion, is simplicity. It becomes much simpler to reason about the state of an object, if that state cannot change. This is of course even more important in a multi-threaded environment, but even in simple, linear single-threaded programs it can make things far easier to understand.
See this page for more examples.
So, my question is, have there been any statistical studies done on the effects of immutability in real-world code?
I'd argue that your professor is just being obtuse (not necessarily intentionally, and not even necessarily a bad thing). It's just that the question is too vague. Two real problems with the question:
"Statistical studies on the effect of [x]" doesn't really mean anything if you don't specify what kind of measurements you're looking for.
"Real-world code" doesn't really mean anything unless you state a specific domain. Real-world code includes scientific computing, game development, blog engines, automated proof generators, stored procedures, operating system kernels, etc.
For what it's worth, the ability of the compiler to optimize immutable objects is well-documented. Off the top of my head:
The Haskell compiler performs deforestation (also called short-cut fusion), where Haskell will transform the expression map f . map g to map (f . g). Since Haskell functions are pure, these expressions are guaranteed to produce equivalent output, but the second version is faster since we don't need to create an intermediate list.
Common subexpression elimination where we could convert x = foo(12); y = foo(12) to temp = foo(12); x = temp; y = temp; is only possible if the compiler can guarantee foo is a pure function. To my knowledge, the D compiler can perform substitutions like this using the pure and immutable keywords. If I remember correctly, some C and C++ compilers will aggressively optimize calls to these functions marked "pure" (or whatever the equivalent keyword is).
So long as we don't have mutable state, a sufficiently smart compiler can execute linear blocks of code on multiple threads with a guarantee that we won't corrupt the state of variables in another thread.
Regarding concurrency, the pitfalls of concurrency using mutable state are well-documented and don't need to be restated.
Sure, this is all anecdotal evidence, but that's pretty much the best you'll get. The immutable vs mutable debate is largely a pissing match, and you are not going to find a paper making a sweeping generalization like "functional programming is superior to imperative programming".
At most, you'll probably find that you can summarize the benefits of immutable vs mutable in a set of best practices rather than as codified studies and statistics. For example, mutable state is the enemy of multithreaded programming; on the other hand, mutable queues and arrays are often easier to write and more efficient in practice than their immutable variants.
It takes practice, but eventually you learn to use the right tool for the job, rather than shoehorning your favorite pet paradigm into a project.
I think your professor's being overly stubborn (probably deliberately, to push you to a fuller understanding). Really, the benefits of immutability are not so much what the compiler can do with optimisations, but that immutable code is much easier for us humans to read and understand. A variable that is guaranteed to be set when the object is created, and guaranteed not to change afterwards, is much easier to grok and reason about than one which has this value now but might be set to some other value later.
This is especially true with threading, in that you don't need to worry about processor caches and monitors and all that boilerplate that comes with avoiding concurrent modifications, when the language guarantees that no such modification can possibly occur.
And once you express the benefits of immutability as "the code is easier to follow", it feels a bit sillier to ask for empirical measurements of productivity increases vis-a-vis "easier-to-followness".
On the other hand, the compiler and Hotspot can probably perform certain optimisations based on knowing that a value can never change. Like you, I have a feeling that this takes place and is a good thing, but I'm not sure of the details. It's a lot more likely that there will be empirical data for the types of optimisation that can occur, and how much faster the resulting code is.
Don't argue with the prof. You have nothing to gain.
These are open questions, like dynamic vs static typing. We sometimes think functional techniques involving immutable data are better for various reasons, but it's mostly a matter of style so far.
What would you objectively measure? GC and object count could be measured with mutable/immutable versions of the same program (although how typical that would be would be subjective, so this is a pretty weak argument). I can't imagine how you could measure the removal of threading bugs, except maybe anecdotally by comparison with a real world example of a production application plagued by intermittent issues fixed by adding immutability.
Immutability is a good thing for value objects. But how about other things? Imagine an object that creates a statistic:
Stats s = new Stats ();
... some loop ...
s.count ();
s.end ();
s.print ();
which should print "Processed 536.21 rows/s". How do you plan to implement count() with an immutable? Even if you use an immutable value object for the counter itself, s can't be immutable since it would have to replace the counter object inside of itself. The only way out would be:
s = s.count ();
which means to copy the state of s for every round in the loop. While this can be done, it surely isn't as efficient as incrementing the internal counter.
Moreover, most people would fail to use this API right because they would expect count() to modify the state of the object instead of returning a new one. So in this case, it would create more bugs.
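The two API shapes being compared can be sketched side by side. The class names MutableStats and ImmutableStats are made up for the comparison, and only the counting part of the Stats example is modelled.

```java
// Mutable version: count() updates the object in place.
final class MutableStats {
    private long rows;

    void count() { rows++; }

    long getRows() { return rows; }
}

// Immutable version: count() returns a new object and leaves this one unchanged.
final class ImmutableStats {
    private final long rows;

    ImmutableStats() { this(0); }

    private ImmutableStats(long rows) { this.rows = rows; }

    ImmutableStats count() { return new ImmutableStats(rows + 1); }

    long getRows() { return rows; }
}
```

With the immutable version, calling s.count() and discarding the result silently does nothing, which is exactly the misuse described above; the caller has to write s = s.count().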
As other comments have claimed, it would be very, very hard to collect statistics on the merits of immutable objects, because it would be virtually impossible to find control cases - pairs of software applications which are alike in every way, except that one uses immutable objects and the other does not. (In nearly every case, I would claim that one version of that software was written some time after the other, and learned numerous lessons from the first, and so improvements in performance will have many causes.) Any experienced programmer who thinks about this for a moment ought to realize this. I think your professor is trying to deflect your suggestion.
Meanwhile, it is very easy to make cogent arguments in favor of immutability, at least in Java, and probably in C# and other OO languages. As Yishai states, Effective Java makes this argument well. So does the copy of Java Concurrency in Practice sitting on my bookshelf.
Immutable objects allow code which to share an object's value by sharing a reference. Mutable objects, however, have the identity that code which wants to share an object's identity to do so by sharing a reference. Both kinds of sharing are essential in most applications. If one doesn't have immutable objects available, it's possible to share values by copying them into either new objects or objects supplied by the intended recipient of those values. Getting my without mutable objects is much harder. One could somewhat "fake" mutable objects by saying stateOfUniverse = stateOfUniverse.withSomeChange(...), but would requires that nothing else modify stateOfUniverse while its withSomeChange method is running [precluding any sort of multi-threading]. Further, if one were e.g. trying to track a fleet of trucks, and part of the code was interested in one particular truck, it would be necessary for that code to always look up that truck in a table of trucks any time it might have changed.
A better approach is to subdivide the universe into entities and values. Entities would have changeable characteristics, but an immutable identity, so a storage location of e.g. type Truck could continue to identify the same truck even as the truck itself changes position, loads and unloads cargo, etc. Values would not have generally have a particular identity, but would have immutable characteristics. A Truck might store its location as type WorldCoordinate. A WorldCoordinate that represents 45.6789012N 98.7654321W would continue to so as long as any reference to it exists; if a truck that was at that location moved north slightly, it would create a new WorldCoordinate to represent 45.6789013N 98.7654321W, abandon the old one, and store a reference to that new one.
It is generally easiest to reason about code when everything encapsulates either an immutable value or an immutable identity, and when the things which are supposed to have an immutable identity are mutable. If one didn't want to use any mutable objects outside a variable stateOfUniverse, updating a truck's position would require something like:
ImmutableMapping<Integer,Truck> trucks = stateOfUniverse.getTrucks();
Truck myTruck = trucks.get(myTruckId);
myTruck = myTruck.withLocation(newLocation);
trucks = trucks.withItem(myTruckId,myTruck);
stateOfUniverse = stateOfUniverse.withTrucks(trucks);
but reasoning about that code would be more difficult than would be:
myTruck.setLocation(newLocation);

Downsides to immutable objects in Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
The advantages of immutable objects in Java seem clear:
consistent state
automatic thread safety
simplicity
You can favour immutability by using private final fields and constructor injection.
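As a minimal sketch of what that means in practice (the Money class and its fields are made up for illustration):

```java
// Immutable by construction: final class, private final fields, no setters.
final class Money {
    private final String currency;
    private final long cents;

    // All state is injected through the constructor.
    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String getCurrency() { return currency; }
    long getCents() { return cents; }

    // A "change" returns a new instance instead of modifying this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(currency, cents + other.cents);
    }
}
```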
But, what are the downsides to favouring immutable objects in Java?
i.e.
incompatibility with ORM or web presentation tools?
Inflexible design?
Implementation complexities?
Is it possible to design a large-scale system (deep object graph) that predominately uses immutable objects?
But, what are the downsides to favouring immutable objects in Java?
incompatibility with ORM or web presentation tools?
Reflection-based frameworks are complicated by immutable objects since they require constructor injection:
there are no default arguments in Java, which forces us to ALWAYS provide all of the necessary dependencies
constructor overriding can be messy
constructor argument names are not usually available through reflection, which forces us to depend on argument order for dependency resolution
Implementation complexities?
Creating immutable objects is still a boring task; the compiler should take care of the implementation details, as in Groovy.
Is it possible to design a large-scale system (deep object graph) that predominately uses immutable objects?
Definitely yes; immutable objects make great building blocks for other objects (they favor composition) since it's much easier to maintain the invariant of a complex object when you can rely on its immutable components. The only true downside to me is about creating many temporary objects (e.g. String concat was a problem in the past).
With immutability, any time you need to modify data, you need to create a new object. This can be expensive.
Imagine needing to modify one bit in an object that consumes several megabytes of memory: you would need to instantiate a whole new object, allocate memory, etc. If you need to do this many times, mutability becomes very attractive.
If you go for mutability then you will find that whenever you need to call a method that you don't want changing the object, or you need to return an object that is part of the internal state, you need to make a defensive copy.
If you really look at programs that make use of mutable objects you will find that they are prone to "attack" by modifying:
objects passed to constructors
objects passed to methods
objects returned from methods.
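A rough sketch of the defensive copies this implies, using a hypothetical Roster class that holds a mutable list:

```java
import java.util.ArrayList;
import java.util.List;

final class Roster {
    private final List<String> names;

    Roster(List<String> names) {
        // Copy on the way in: later changes to the caller's list don't leak in.
        this.names = new ArrayList<>(names);
    }

    List<String> getNames() {
        // Copy on the way out: callers cannot reach the internal list.
        return new ArrayList<>(names);
    }
}
```

If the elements themselves were mutable they would need copying too; String is immutable, so copying the list is enough here.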
The issue doesn't show up very often because most programs don't change the data (they are in reality immutable by virtue of them never changing).
I personally make everything I possibly can final. I probably have 90%-95% of all variables (parameters, local, instance, static, exceptions, etc...) marked as final. There are some cases where it has to be mutable, but in the vast majority of cases it does not.
I think it might depend on your focus. If you are writing libraries for 3rd parties to use you think about this much more than if you are writing an application that only you (or your team) will maintain.
I find that you can write large scale applications using immutable objects for the majority of the system without too much pain.
Fundamentally, in the real world, the state associated with many particular identities will change. If I ask what is "the present position of Joe's Buick", today it might be a location in Seattle, and tomorrow it might be a location in Los Alamos. It would be possible to define and create a GeographicLocation object whose value will always represent the location where Joe's Buick was at some particular moment in time and would never change--if today it represents a spot in Seattle, then it will always do so. Such an object, however, would have no continuing identity as "the present location of Joe's Buick".
It may also be possible to define things so that there is a VehicleLocation object which is connected to Joe's Buick such that the object always represents "the present location of Joe's Buick". Such an object could retain its identity as "the present location of Joe's Buick", even as the car moves around, but would not represent a constant geographical location. Defining "identity" may be tricky if one considers the scenario where Joe sells his Buick to Bob and buys a Ford--should the object track "the present location of Joe's Ford" or "the present location of Bob's Buick"--but in many cases such issues may be avoided by using a data model that guarantees that some aspects of object identity will never change.
It isn't possible for everything about an object to be immutable. If an object is immutable, then it cannot have an immutable identity that encapsulates anything beyond its current state. If an object is mutable, however, it can have an immutable identity whose meaning transcends its present state. In many situations, having an immutable identity is more useful than having an immutable state, and in such situations mutable objects are nearly essential. While it is possible in some cases to "simulate" mutable objects by having an immutable object which would search through the most recent version of an immutable object to find information that may "change" between one version and the next, such approaches are often extremely inefficient. Even if one could magically receive once per minute a bound book that gave the location of every vehicle everywhere, looking up "Joe's Buick" in the book would take a lot longer than merely asking a "present location of Joe's Buick" object which would always know where the car was.
You pretty much answered your own question. The JavaBean specification, I don't believe, mentions anything about immutability, yet JavaBeans are the bread and butter of many Java frameworks.
The concept of immutable types is somewhat uncommon for people used to imperative programming styles. However, for many situations immutability has serious advantages; you named the most important ones already.
There are good ways to implement immutable balanced trees, queues, stacks, deques and other data structures. In fact, many modern programming languages and frameworks only support immutable strings, and sometimes other immutable objects as well, because of their advantages.
With an immutable object, if the value needs to be changed, then it must be replaced with a new instance. Depending on the lifecycle of the object, replacing it with a different instance can potentially increase the tenured (long) garbage collection time. This becomes more critical if the object is kept around in memory long enough to be placed in the tenured generation.
The problem in Java is that one has to live with all those objects where the class looks like:
class Mutable {
State1 f1;
MoreState f2;
void doSomething() { /* mutates the state, but doesn't document it */ }
void doSomethingElse() { /* mutates the state heavily, not mentioned in the docs */ }
}
(Note the missing Cloneable interface).
The problem with the garbage collector is not such a big one nowadays. The VMs are happy with short-lived objects.
Advances in Compiler/JIT technology will make it possible, sooner or later, to optimize intermediate temporary object creation away. For example:
BigInteger three = ..., two = ..., i1 = ...;
BigInteger i2 = i1.multiply(three).divide(two);
The JIT could notice that the intermediate object i1.multiply(three) can be used for the end result and call a variant of the divide method that works on a mutable accumulator.
See Functional Java to attain a comprehensive answer to your question.
Immutability, like every other design pattern, should only be used when you need it. You give the example of thread safety: in a highly threaded application, you could favor immutability over the added expense of making it thread safe yourself.
However, if your design requires objects to be mutable, don't go out of your way to make them immutable, just because "it's a design pattern".
As for your graph, you could choose to make your nodes immutable and let another class take care of the connections between them, or you could make a mutable node that takes care of its own children and has an immutable value class.
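The first option might be sketched like this, with an immutable Node and a separate, mutable Graph that owns the connections. Both class names and the adjacency-map layout are assumptions for the example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Immutable node: it knows its own value, nothing about its neighbours.
final class Node {
    private final String label;

    Node(String label) { this.label = label; }

    String getLabel() { return label; }
}

// Mutable holder of the connections between nodes.
final class Graph {
    // Node does not override equals/hashCode, so nodes are keyed by identity.
    private final Map<Node, List<Node>> edges = new HashMap<>();

    void connect(Node from, Node to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    List<Node> neighbours(Node node) {
        // Return a copy so callers cannot alter the graph's internal lists.
        return new ArrayList<>(edges.getOrDefault(node, new ArrayList<>()));
    }
}
```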
Probably the biggest cost of using immutable objects in Java is that future developers won't be expecting it or be used to that style. Expect to either document heavily or watch a lot of your objects spawn mutable peers over time.
That being said, the only real technical reason I can think of to avoid immutable objects is GC churn. For most applications, I don't think this is a compelling reason to avoid them.
The biggest thing I've ever done with ~90% immutable objects was a toy Scheme-esque interpreter, so it's certainly possible to do complex Java projects.
With immutable data you don't set things twice. See Haskell and Scala vals (and Clojure, of course).
For example, take a data structure like a tree: when you perform a write operation on the tree, you are in fact adding elements outside of the immutable tree. Once you are done, the tree and the new branch are recombined into a new tree. This way you can perform concurrent reads and writes very safely.
In the traditional model, you must lock a value because it could be reset at any time, so you end up with a very hot zone for threads, since they act sequentially there anyway.
With immutable data, you don't set things more than once. It's a whole new way of programming. You may end up using a little more memory, but parallelizing is natural and painless.
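To make the tree example concrete, here is a small sketch of an immutable (unbalanced) binary search tree where an insert copies only the path to the new element and shares everything else with the old tree. The PersistentTree name is hypothetical, and real persistent collections are considerably more refined than this:

```java
// Immutable binary search tree; null represents the empty tree.
final class PersistentTree {
    final int value;
    final PersistentTree left;
    final PersistentTree right;

    PersistentTree(int value, PersistentTree left, PersistentTree right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }

    // Returns a new tree containing v. Only nodes on the path from the root
    // to the insertion point are copied; untouched subtrees are shared.
    static PersistentTree insert(PersistentTree tree, int v) {
        if (tree == null) {
            return new PersistentTree(v, null, null);
        }
        if (v < tree.value) {
            return new PersistentTree(tree.value, insert(tree.left, v), tree.right);
        }
        if (v > tree.value) {
            return new PersistentTree(tree.value, tree.left, insert(tree.right, v));
        }
        return tree; // already present: reuse the existing tree as-is
    }

    static boolean contains(PersistentTree tree, int v) {
        while (tree != null) {
            if (v == tree.value) {
                return true;
            }
            tree = v < tree.value ? tree.left : tree.right;
        }
        return false;
    }
}
```

A reader holding the old root keeps seeing the old tree, so no lock is needed around reads while a writer builds the new version.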
As with any tool, you have to know when to use it and when not to.
As Tehblanx points out, if you want to change the state of a variable that holds an immutable object, you have to create a new object, which can be expensive, especially if the object is big and complex. Absolutely true, but that simply means that you have to intelligently decide which objects should be mutable and which should be immutable. If someone is saying that ALL objects should be immutable, well, that's just crazy talk.
I'd tend to say that objects that represent a single logical "fact" should be immutable, while objects that represent multiple facts should be mutable. Like, an Integer or a String should be immutable. A "Customer" object that contains name, address, current amount, date of last purchase, etc should be mutable. Of course I can immediately think of a hundred exceptions to such a general rule. An exception I make all the time is when I have a class that just exists as a wrapper to hold a primitive in some case where a primitive is not legal, like in a collection, but I need to update it constantly.
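That last exception might look something like the following, where a small mutable wrapper stands in for an int inside a map. Counter and WordCount are names invented for the sketch:

```java
import java.util.HashMap;
import java.util.Map;

// Mutable wrapper around a primitive, for use where a bare int is not allowed.
final class Counter {
    private int count;

    void increment() { count++; }

    int get() { return count; }
}

final class WordCount {
    static Map<String, Counter> count(String[] words) {
        Map<String, Counter> counts = new HashMap<>();
        for (String word : words) {
            // Update in place instead of replacing an immutable Integer each time.
            counts.computeIfAbsent(word, k -> new Counter()).increment();
        }
        return counts;
    }
}
```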
In Java, a method can't return multiple objects, like return a, b, c. Returning an array of objects makes the code look ugly. In this situation, I have to pass mutable objects to the method and let it change the states of these objects. However, I don't know whether returning multiple objects is a code smell or not.
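One possible alternative, offered only as a sketch, is a small immutable result class that bundles the values together. MinMax is a made-up example; the same shape works for any fixed set of return values:

```java
// Immutable carrier for two results of a single computation.
final class MinMax {
    private final int min;
    private final int max;

    private MinMax(int min, int max) {
        this.min = min;
        this.max = max;
    }

    int getMin() { return min; }
    int getMax() { return max; }

    // Computes both values in one pass and returns them together.
    static MinMax of(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        int min = values[0];
        int max = values[0];
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new MinMax(min, max);
    }
}
```

The caller then reads MinMax.of(data).getMin() and getMax() rather than passing in objects for the method to mutate.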
The answer is none. There are not any good reasons to be mutable.
You do run into problems with lots of frameworks (or framework versions) that require mutable objects in order to work with them (Spring, I am glaring in your direction). As you work with them and fish through the code you will shake your fist in anger that you need to introduce dirty mutability into an otherwise glorious block of code when it could have been easily avoided.
I'm sure there are limited corner cases (probably more hypothetical than anything) where the overhead of object creation and collection is unacceptable. But I urge the people who would make this argument to look at languages like Scala, where the included collections are immutable by default, and then look at the bevy of performance-critical apps built on top of that concept.
This is of course hyperbole. In reality, you should go with immutability first, see if it causes you any measurable problems, if it does then introduce mutability, but make sure you can prove it solves your problem. Otherwise you've just created liability for no benefit. In doing this I think you'll find objective cases for "Implementation Complexity" and "Inflexibility" very hard to make.
Some implementations of immutable objects have transactional means to update an immutable object, similar to how databases provide safe commits and rollbacks. But, in apparent contrast with many of the answers here, immutable objects are never changed. A typical operation would be:
B = append(A,C)
B is a new object, just like A and C. No modification was made to A or C. Internally, a red-black tree implementation makes such semantics fast enough to be usable.
The downside is that it is not as fast as making the operations in place. But that only compares a single part of the system. When evaluating possible downsides we need to look at the system as a whole. And I personally don't have a clear picture of the entire impact. Although I suspect immutability wins out at the end.
I know some experts contend there is contention at the top level of the red-black tree, and that this has a negative effect on throughput.
My biggest worry with immutable data structures is how to save/reconstitute them. That is, if a class has final fields, I can't instantiate it and then set its fields.
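One way around this, sketched under the assumption of a simple text format, is to read all the saved values first and only then build the object through its constructor. The Version class and its "major.minor" format are invented for the example:

```java
// Immutable class that can be saved to and rebuilt from a line of text.
final class Version {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    int getMajor() { return major; }
    int getMinor() { return minor; }

    // Save: flatten the state into a "major.minor" string.
    String toText() {
        return major + "." + minor;
    }

    // Reconstitute: parse every field first, then construct the object in one
    // step, so the final fields never need to be assigned after construction.
    static Version parse(String text) {
        int dot = text.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("expected major.minor: " + text);
        }
        int major = Integer.parseInt(text.substring(0, dot));
        int minor = Integer.parseInt(text.substring(dot + 1));
        return new Version(major, minor);
    }
}
```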
