Is it good practice to use domain objects as keys? - java

Is it good practice to use domain objects as keys for maps (or "get" methods), or is it better to just use the id of the domain object?
It's simpler to explain with an example. Let's say I have a Person class, a Club class, and a Membership class (that connects the other two). I.e.,
public class Person {
private int id; // primary key
private String name;
}
public class Club {
private String name; // primary key
}
public class Membership {
private Person person;
private Club club;
private Date expires;
}
Or something like that. Now, I want to add a method getMembership to Club. The question is, should this method take a Person object:
public Membership getMembership(Person person);
or, the id of a person:
public Membership getMembership(int personId);
Which is most idiomatic, which is most convenient, which is most suitable?
Edit: Many very good answers. I went with not exposing the id, because the "Person" (as you might have realized, my real domain does not have anything to do with people and clubs...) instances are easily available, but for now it is internally stored in a HashMap hashed on the id - but at least I am exposing it correctly in the interface.
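For illustration, here is a minimal sketch of what that edit describes: the public method takes a Person, while the Club keeps a HashMap keyed on the id internally. The constructors, getters and the addMembership method are assumptions added to make the example self-contained; they are not part of the original classes.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Trimmed-down versions of the classes from the question.
// Constructors and getters are assumptions added for this sketch.
class Person {
    private final int id; // primary key
    private final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

class Membership {
    private final Person person;
    private final Date expires;

    Membership(Person person, Date expires) {
        this.person = person;
        this.expires = expires;
    }

    Person getPerson() { return person; }
    Date getExpires() { return expires; }
}

class Club {
    private final String name; // primary key
    // Implementation detail: memberships are keyed on the person's id.
    private final Map<Integer, Membership> membershipsByPersonId = new HashMap<>();

    Club(String name) { this.name = name; }

    String getName() { return name; }

    // Hypothetical helper, not in the original question.
    void addMembership(Membership membership) {
        membershipsByPersonId.put(membership.getPerson().getId(), membership);
    }

    // Callers pass a Person; the id never appears in the public signature.
    Membership getMembership(Person person) {
        return membershipsByPersonId.get(person.getId());
    }
}
```

If the internal map is later keyed on something else, only the two method bodies in Club change; callers of getMembership(Person) are unaffected.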

Don't use the IDs, man; this is just a bad idea for all the reasons mentioned. You'll lock yourself into a design. Let me give an example.
Right now you define your Membership as a mapping between Clubs and People. Rightfully, your Membership should be a map of Clubs to "Members", but you are assuming that all Members are People, and since all of the people IDs are unique, you think you can just use the ID.
But what if in the future you want to extend your membership concept to "family memberships", for which you create a Family table and a Family class? In good OO fashion you extract an interface of Family and Person called Member. As long as both classes implement the equals and hashCode methods properly, no other code will have to be touched. Personally, I would have defined the Member interface right up front.
public interface Member {
}
public class Person implements Member {
private int id; // primary key
private String name;
}
public class Family implements Member {
private int id;
private String name;
}
public class Club {
private String name; // primary key
}
public class Membership {
private Member member;
private Club club;
private Date expires;
}
If you had used IDs in your interface, you would either need to enforce cross-table uniqueness of key values, or maintain two separate Maps and forgo the nice polymorphic interface stuff.
Believe me, unless you are writing one-off, disposable applications, you want to avoid using IDs in your interface.
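To make the point about equals and hashCode concrete, here is a rough sketch of a Club keyed on Member. It assumes that equality is defined by type plus id, so a Person with id 1 and a Family with id 1 remain distinct keys even though the two tables reuse the same key values. The constructors, getters and addMembership are assumptions added for the example.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Marker interface extracted from Person and Family.
interface Member { }

final class Person implements Member {
    private final int id; // primary key in the Person table
    private final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }

    // Equality is by type and id, so a Person never equals a Family.
    @Override
    public boolean equals(Object other) {
        return other instanceof Person && ((Person) other).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

final class Family implements Member {
    private final int id; // primary key in the Family table
    private final String name;

    Family(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Family && ((Family) other).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class Membership {
    private final Member member;
    private final Date expires;

    Membership(Member member, Date expires) {
        this.member = member;
        this.expires = expires;
    }

    Member getMember() { return member; }
    Date getExpires() { return expires; }
}

class Club {
    // One map serves both kinds of member; no cross-table unique ids needed.
    private final Map<Member, Membership> memberships = new HashMap<>();

    // Hypothetical helper, not in the original answer.
    void addMembership(Membership membership) {
        memberships.put(membership.getMember(), membership);
    }

    Membership getMembership(Member member) {
        return memberships.get(member);
    }
}
```

Both keys hash to the same bucket here, but because equals also checks the type, the map keeps the two memberships apart.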

Assuming this is a database ID or something used just for indexing (rather than something like an SSN), then in an ideal system, the presence of an ID is an implementation detail.
As an implementation detail, I would prefer to hide it in the interface of other domain objects. Thus, membership involves, fundamentally, individuals rather than numbers.
Of course, I'd make sure I implemented hashCode() and equals() and documented well what they meant.
In that case, I would explicitly document that the equality of two Person objects is determined solely based on ID. This is a somewhat risky proposition, but it makes code more readable if you can ensure it. I feel more comfortable doing it when my objects are immutable, so I would not actually end up with two Person objects with the same ID but different names in the lifetime of the program.

I think the first case would be considered "purer" in the sense that the getMembership method might require more specific data from the person itself other than its id (Let's assume you do not know the internals of the getMembership method, even though this makes little sense since it's most likely in the same domain).
If it turns out that it actually requires data from the Person entity then it will not require a DAO or factory for the person in question.
This can be easily called if your language and/or ORM allows you to use proxy objects (and if you have a convenient way to create these proxies).
But let's be honest. If you're inquiring about some membership of a person, you most likely already have this Person instance in memory at hand when you call this method.
Further down the road in the "infrastructure land" there's also this notion about implementation details which Uri already mentioned while I was writing this answer (damn, that was fast bro'!). To be specific, what if you decided that this 'Person' concept suddenly has a composite primary key/identifier in the underlying database... Would you now use an identifier class? Perhaps use that proxy we were talking about?
TL;DR version
Using IDs really is easier in the short run, but if you're already using a solid ORM, I see no reason not to use proxies or some other means to express the object-oriented identity of an Entity without leaking implementation details.

If you are really practicing object oriented design, then you want to invoke the idea of information hiding. As soon as you start hanging internal field types of the person object in the public interface of the membership object's methods, you start forcing external developers (users) of your objects to start learning all kinds of information about what a person object is, and how it is stored, and what kind of ID it has.
Better yet, since a person can have memberships, why don't you just hang the "getMemberships" method onto the person class. It seems much more logical to ask a person which memberships they have, than to ask a "membership" which clubs a given person may belong to...
Edit - since the OP has updated to indicate that it is the membership itself that he is interested in, and not just used as a relation between Person and Club, I'm updating my answer.
Long story short, you are now asking the "Club" class that you are defining to behave as a "club roster". A club has a roster; it isn't a roster. A roster could have several features, including ways to look up persons belonging to the club. In addition to looking up a person by their club ID, you might want to look them up by SSN, name, join date, etc. To me, this says there is a method on class "Club" called getRoster(), which returns a data structure that can look up all the persons in the club. Call it a collection. The question then becomes: can you use the methods on pre-existing collection classes to fulfill the needs you have defined so far, or do you need to create a custom collection subclass to provide the appropriate method to find the membership record?
Since your class hierarchy is most likely backed by a database, and you are probably talking about loading info out of the database, and don't necessarily want to get the entire collection just to get one membership, you may want to create a new class. This class could be called, as I said, "Roster". You would get the instance of it from the getRoster() call on class "Club". You would add "searching" methods to the class based on any search criteria you wanted that were "publicly available" information about the person: name, clubID, personID, SSN, etc.
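A rough sketch of such a Roster follows. It is an in-memory stand-in only: a database-backed version would presumably run a query per lookup instead of scanning a list. The method names findByPersonId and findByName, and the add method, are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class Person {
    private final int id;
    private final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

class Membership {
    private final Person person;

    Membership(Person person) { this.person = person; }

    Person getPerson() { return person; }
}

// Hypothetical in-memory Roster; search methods use only publicly
// available information about the person.
class Roster {
    private final List<Membership> memberships = new ArrayList<>();

    void add(Membership membership) {
        memberships.add(membership);
    }

    // At most one membership per person id is expected.
    Optional<Membership> findByPersonId(int personId) {
        return memberships.stream()
                .filter(m -> m.getPerson().getId() == personId)
                .findFirst();
    }

    // Names are not unique, so this returns every match.
    List<Membership> findByName(String name) {
        List<Membership> matches = new ArrayList<>();
        for (Membership m : memberships) {
            if (m.getPerson().getName().equals(name)) {
                matches.add(m);
            }
        }
        return matches;
    }
}

class Club {
    private final Roster roster = new Roster();

    // A club has a roster; it is not itself the roster.
    Roster getRoster() { return roster; }
}
```

A caller would then write something like club.getRoster().findByPersonId(7), which keeps the lookup logic out of Club itself.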
My original answer only applies if the "membership" is purely a relation to indicate which clubs which persons belong to.

IMO, I think it very much depends on the flow of the application - do you have the Person available when you want to get the Membership details? If yes, go with:
public Membership getMembership(Person person);
Also, I don't see any reason why the Club cannot keep track of memberships based on the Person's ID and not the actual object - I think that would mean you don't need to implement the hashCode() and equals() methods. (Although that is always a good best-practice).
As Uri said, you should document the declaration that two Person objects are equal if their IDs are equal.

Whoa. Back up a sec here. The getMembership() method doesn't belong in Club. It belongs to the set of all memberships, which you haven't implemented.

I would probably use IDs. Why? By taking IDs, I'm making safer assumptions about the caller.
If I have an ID, how much work is it to get the Person? Might be 'easy', but it does require hitting a datastore, which is slow...
If I have Person object, how much work is it to get the ID? Simple member access. Fast and available.

As described by others: use the object.
I work on a system where we had some old code that used int to represent transaction ids. Guess what? We started running out of ids because we used int.
Changing to long or BigNumber proved tricky because people had become very inventive with naming. Some used
int tranNum
some used
int transactionNumber
some used
int trannNum
(complete with spelling mistakes).
Some people got really inventive...
It was a mess and sorting it out was a nightmare. I ended up going through all of the code manually and converting to a TransactionNumber object.
Hide the details wherever possible.
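As a sketch of what such a wrapper might look like: the class name TransactionNumber comes from the story above, but the factory method, the next() method and the non-negative check are assumptions made for illustration, not the code from that system.

```java
// Hypothetical value class wrapping the raw transaction id.
final class TransactionNumber {
    // The only place that knows the underlying type (int before, long now).
    private final long value;

    private TransactionNumber(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("transaction number must be non-negative");
        }
        this.value = value;
    }

    static TransactionNumber of(long value) {
        return new TransactionNumber(value);
    }

    // Math.addExact throws ArithmeticException instead of silently wrapping around.
    TransactionNumber next() {
        return new TransactionNumber(Math.addExact(value, 1L));
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof TransactionNumber
                && ((TransactionNumber) other).value == value;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }

    @Override
    public String toString() {
        return Long.toString(value);
    }
}
```

With this in place, widening the id again later means changing one field and one factory signature rather than hunting for every int tranNum in the codebase.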

I would typically stick with less is more. The less information required to invoke your method the better. If you know the ID, only require the ID.
If you want, provide extra overloads which accept extra parameters, such as the entire class.

If you already have the object, there's no reason to pull out the ID to get a hash key.
As long as the IDs are always unique, implement hashCode() to return the ID, and implement equals() as well.
Odds are every time you'll need the Membership, you'll already have the Person, so it saves code and confusion later.

First of all I'd put any getters of such nature inside a DAO (and not on the model). Then I'd use the entity itself as a parameter, and what happens inside the method is an implementation detail.

Unless there's a significant benefit derived elsewhere, it can be said that keys in a map should be single-valued things, if at all possible. That said, by paying attention to equals() and hashCode() you can make any object work as a key, but equals() and hashCode() aren't very pleasing things to have to pay attention to. You'll be happier sticking to IDs as keys.

Actually, what I would do is call it by id, but refactoring a bit the original design:
public class Person {
private int id; // primary key
private String name;
}
public class Club {
private String name; // primary key
private Collection<Membership> memberships;
public Membership getMembershipByPersonId(int id);
}
public class Membership {
private Date expires;
private Person person;
}
or
public class Person {
private int id; // primary key
private String name;
private Membership membership;
public Membership getMembership();
}
public class Club {
private String name; // primary key
private Collection<Person> persons;
public Person getPersonById(int id);
}
public class Membership {
private Date expires;
}

Related

DDD - Value Object flavor of an Entity

I've seen some DDD projects with value object representations of entities.
They usually appear like EmployeeDetail, EmployeeDescriptor, EmployeeRecord, etc. Sometimes it holds the entity ID, sometimes not.
Is that a pattern? If yes, does it have a name?
What are the use cases?
Are they value objects, parameter objects, or anything else?
Are they referenced in the domain model (as property) or are they "floating" just as parameters and returns of methods?
Going beyond...
I wonder if I can define any aggregate as an ID + BODY (detail, descriptor, etc) + METHODS (behavior).
public class Employee {
private EmployeeID id;
private EmployeeDetail detail; //the "body"
}
Could I design my aggregates like this to avoid code duplication when using this kind of object?
The immediate advantage of doing this is to avoid those methods with too many parameters in the aggregate factory method.
public class Employee {
...
public static Employee from(EmployeeID id, EmployeeDetail detail){...};
}
instead of
public class Employee {
...
public static Employee from(EmployeeID id, + 10 Value Objects here){...};
}
What do you think?
What you're proposing is the idiomatic (via case classes) approach to modeling an aggregate in Scala: you have an ID essentially pointing to a mutable container of an immutable object graph representing the state (and likely some static functions for defining the state transitions). You are moving away from the more traditional OOP conceptions of domain-driven design to the more FP conceptions (come to the dark side... ;) ).
If doing this, you'll typically want to partition the state so that operations on the aggregate will [as] rarely [as possible] change multiple branches of the state, which enables reuse of as much of the previous object graph as possible.
Could I design my aggregates like this to avoid code duplication when using this kind of object?
What you are proposing is representing the entire entity except its id as a 'bulky' value object. A concept or object's place in your domain (finding that involves defining your bounded contexts and their ubiquitous languages) dictates whether it is treated as a value object or an entity, not coding convenience.
However, if you go with your scheme as a general principle, you risk tangling unrelated data into a single value object. That leads to many conceptual and technical difficulties. Take updating an entity, for example. Entities are designed to evolve over their lifecycle in response to operations performed on them. Each operation updates only the relevant properties of an entity. With your solution, for any operation, you have to construct a new value object (as value objects are defined to be immutable) as a replacement, potentially copying a lot of irrelevant data.
The examples you are citing are most likely entities with only one value object attribute.
OK - great question...
DDD Question Answered
The difference between an entity object and a value object comes down to perspective - and needs for the given situation.
Let's take a simple example...
An airplane flight to your favourite destination has...
Seats 1A, 10B, 21C available for you to book (entities)
3 of 22 seats available (value object).
The first reflects individually identifiable seat entities that could be filled.
The second reflects that there are 3 seats available (value object).
With value object you are not concerned with which individual entities (seats) are available - just the total number.
It's not difficult to understand that it depends on who's asking and how much it matters.
Some flights you book a seat and others you book a (any) seat on a plane.
General
Ask yourself a question! Do I care about the individual element or the totality?
NB. An entity (plane) can treat its seats as entities and/or value objects, depending on the use case. Also worth noting: it depends on several things. Cockpit seats are more likely to be entity seats, and passenger seats value objects.
I'm pretty sure I want the pilot seat to have a qualified pilot, and the co-pilot seat a qualified co-pilot; but I don't really care that much where the passengers sit. Well, except that I want to make sure the emergency exit seats are filled by passengers suitable to help exit the plane in an emergency.
No simple answer, but a complex set of pieces to think about, and to consider for each situation and domain complexity.
Hope that explains some bits, happy to answer follow-up questions...

How to avoid loading duplicate objects into main memory?

Suppose I am using SQL and I have two tables. One is Company, the other is Employee. Naturally, the employee table has a foreign key referencing the company he or she works for.
When I am using this data set in my code, I'd like to know what company each employee works for. The best solution I've thought of is to add an instance variable to my Employee class called Company (of type Company). This variable may be lazy-loaded, or populated manually.
The problem is that many employees work for the same company, and so each employee would end up storing a completely identical copy of the Company object, unnecessarily. This could be a big issue if something about the Company needs to be updated. Also, the Company object would naturally store a list of its employees, therefore I could also run into the problem of having an infinite circular reference.
What should I be doing differently? It seems object oriented design doesn't work very well with relational data.
This is more of a design/principles sort of question, I do not have any specific code, I am just looking for a step in the right direction!
Let me know if you have any questions.
Do not try to design your business objects to mirror the database schema.
Design objects to serve your business requirements.
For example, when you need to display a list of employees without company information, you can create a function which retrieves only the required information from the database into the object
public class EmployeeBasicInfo
{
public int Id;
public string Name;
}
For the next requirement, you need a list of employees with full information; then you will have a function which retrieves the full data from the database
public class Employee
{
public int Id;
public string Name;
public int Age;
public CompanyBasicInfo Company;
}
Here the Company class will not have a collection of employees, but will have only the information required for the Employee class.
public class CompanyBasicInfo
{
public int Id;
public string Name;
}
Of course, in the last case you end up with a bunch of different Company objects which hold the same data. But that should be OK.
If you are afraid that having copies of the same data in different objects will cause a performance problem: it will not, until you load millions of employees, which would be a good sign that something has gone wrong in your application design.
Of course, in a situation where you actually need to load millions of employees, you can use an approach where the class which loads employees first loads all companies into a Map<int, Company>, and then, when loading employees, refers to the same Company instance for every employee of that company.
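A minimal Java sketch of that last approach, under some assumptions: EmployeeRow is a hypothetical stand-in for a row read from the employee table, the Company here corresponds to the small company object described above, and the two input lists stand in for whatever the data-access code returns. In Java the map key has to be the boxed type Integer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Company {
    final int id;
    final String name;

    Company(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class Employee {
    final int id;
    final String name;
    final Company company; // shared instance, not a per-employee copy

    Employee(int id, String name, Company company) {
        this.id = id;
        this.name = name;
        this.company = company;
    }
}

// Hypothetical row type: what a query on the employee table might return.
class EmployeeRow {
    final int id;
    final String name;
    final int companyId; // foreign key

    EmployeeRow(int id, String name, int companyId) {
        this.id = id;
        this.name = name;
        this.companyId = companyId;
    }
}

class EmployeeLoader {
    static List<Employee> load(List<Company> companies, List<EmployeeRow> rows) {
        // Step 1: index every company by its primary key.
        Map<Integer, Company> companiesById = new HashMap<>();
        for (Company company : companies) {
            companiesById.put(company.id, company);
        }
        // Step 2: resolve each foreign key to the already loaded instance.
        List<Employee> employees = new ArrayList<>();
        for (EmployeeRow row : rows) {
            employees.add(new Employee(row.id, row.name, companiesById.get(row.companyId)));
        }
        return employees;
    }
}
```

Every employee of the same company then points at one Company object, so an update to that object is visible through all of them.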
Am I really the only person who is running into this issue? There must be some way to do this without relying on lazy-loading every property.
This problem has been solved many times before already. Avoid re-inventing the wheel by using any of the widely available ORM frameworks.
In a database table, the primary key identifies a record; in a running application, the reference tracks an object; and, at an even lower abstraction, a memory address points to the bytes that represent that object.
When you initialise an object and assign it to a variable, the variable is sufficient to track the object in memory so that you can subsequently access it. However, in the database layer, a primary key is needed to locate the record in a database table. Therefore, to bridge the gap between the relational model and the object model, the artificial identifier property is required in your object.

"Encapsulation helps make sure clients have no dependence on the choice of representation"

I am familiar with the concept of encapsulation. However, recently I have found the following statement regarding encapsulation (which is according to the author correct):
Encapsulation helps make sure clients have no dependence on the choice of representation
Could you please give me a hint what is meant by clients and the choice of representation. Many thanks in advance.
What the author is trying to explain is that encapsulation allows you to modify the inner representation of the data in some class without having any side effect on the clients. The clients could be any other classes that are using yours, and the choice of representation is the way you decide to store the data in your class.
As an example, imagine that you have a class where you store the Employees of some Company. It could be something like this:
public class Company {
private List<Employee> employees;
public List<Employee> getEmployees() {
return this.employees;
}
public Employee getEmployee(String employeeId) {
//search for employee
}
}
You store the employees of the company in a List, and provide two methods, one that retrieves all the employees, and another that searches for a given one. Some day you realize that maybe a Set or a Map would be a better structure to store the Employees, and you decide to refactor the code. As long as you have provided methods to retrieve the information of the employees, instead of giving direct access to the "employees" structure, you could change the implementation of these functions to fit with the new definition of your class and your clients wouldn't notice those changes.
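As a sketch of that refactoring, here is what the class might look like after switching to a Map. The two public methods keep the same signatures as before. The Employee class with its getId() method and the addEmployee method are assumptions added so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical Employee; only the id matters for this sketch.
class Employee {
    private final String id;
    private final String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }
}

class Company {
    // Was: private List<Employee> employees;
    // A LinkedHashMap keeps insertion order, so getEmployees() still
    // returns employees in the order they were added.
    private final Map<String, Employee> employeesById = new LinkedHashMap<>();

    // Hypothetical helper, not in the original example.
    public void addEmployee(Employee employee) {
        employeesById.put(employee.getId(), employee);
    }

    // Same signature as before; clients still receive a List.
    public List<Employee> getEmployees() {
        return new ArrayList<>(employeesById.values());
    }

    // Same signature as before; the linear search became a map lookup.
    public Employee getEmployee(String employeeId) {
        return employeesById.get(employeeId);
    }
}
```

Client code that calls getEmployees() or getEmployee("e1") compiles and behaves the same against either version, which is the independence from representation that the quote is about.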

Opinions on using a domain object for a single string

I am looking for some opinions when it comes to a small design decision.
I looked for other similar questions but I couldn't find any.
So here's the situation, the database contains a table with names of cities. The application retrieves these names and uses them somewhere.
Normally when it comes to database objects I would (and I believe you should) create a domain object in which you store all the related variables.
Ex:
public class City {
private String name;
public City(String _name){
this.name = _name;
}
public String getName() {
return name;
}
}
But in this case, I think that this is unnecessarily complex since there is only a single string for each database object. So I saved the city name in a String. The main reason for this is because I think it uses less memory (although the difference is small (I think, I'm not an expert)), it also removes one class and a small number of lines from the codebase. And it saves me from typing said lines ;)
I would love to hear your thoughts on this, what you would personally do and what are potential advantages or disadvantages of both methods.
In my opinion, it is better to use a domain object because it will give you more flexibility if you decide to add some more attributes. Let's say for example you pass this value to some methods in your code, next time you want to add some attribute to your city object it will be easier because all of the methods are already getting an object and not a single string.

Does it make sense to use JPA inheritance as a way to get different method implementations?

So, I have been working on familiarizing myself with JPA's inheritance features and have really liked them so far. One thing that occurred to me recently is that they could actually be used for something other than just retrieving data. Given that it can get subclasses based on a discriminator value, inheritance is actually a convenient way to transform configuration fields into implementations. Being in that stage where my knowledge-to-experience ratio is in the 'just enough to be dangerous/not enough to always realize it zone', I thought it might be best to ask if this was a good idea.
Take this example with a PRODUCT and BILLTYPE table.
Product:
int Id
int billtypeid
Billtype:
int id
varchar[15] description
Billtype is simply a billing strategy for the product (We'll say some orders may be billed by weight, while others could just be billed by case). Each bill type will require the use of different methods during the invoicing process. The Billtype table will likely only have a handful of entries, and shouldn't grow to be very large.
Would it make sense to use inheritance to subclass an abstract Billtype entity that also defines an interface for the different methods the invoice code will need? Something like this:
@Entity
@DiscriminatorColumn(name = "description")
public abstract class BillType {
// Getters, setters
// Abstract methods that could be used elsewhere - ex:
// BigDecimal calculateInvVal(...)
}
@Entity
@DiscriminatorValue("by case")
public class CaseBillType extends BillType {
// Implementation of calculateInvVal - now when invoicing code needs this method,
// the right one is always associated with the current product!
}
This provides a convenient way to associate behaviors with fields in the database that represent configuration data, but mixes business code with entities (which, by most accounts, is very very naughty). There could be a design pattern to fix this issue that I am missing from my repertoire, but I'd really like to avoid having to write lots of, "if bill type is this, get this subclass, if bill type is this, etc" code.
What I am looking for from an answer is an explanation of potential drawbacks to this technique I may not be seeing that would justify looking for another solution to this problem.
It's useful to link a product with a BillType entity if it's possible to add, remove and modify bill types at runtime without any need to rebuild and redeploy a new version of the application. This is not the case with your example.
So if what you have is a static set of bill types, each defining a static behavior encapsulated by the BillType subclass, you could simply have a BillType enum instead. Each instance of this enum defining its own behavior. You don't need an entity hierarchy and an additional table for this.
The code to calculate the InVal in the Product entity would be exactly the same:
BigDecimal computeInVal() {
return billType.calculateInVal(this);
}
The code to get all the bill types would be
return BillType.values();
And instead of the following code to associate a bill type to a product:
product.setBillType(em.find(BillType.class, ID_OF_CASE_BILL_TYPE));
you would simply have
product.setBillType(BillType.BY_CASE);
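To show what "each instance of this enum defining its own behavior" can look like, here is a small sketch using constant-specific method bodies. The Product fields (cases, weight, unitPrice) and the two billing formulas are invented for the example; only the names BillType, BY_CASE, calculateInVal and computeInVal come from the answer.

```java
import java.math.BigDecimal;

enum BillType {
    BY_CASE {
        @Override
        BigDecimal calculateInVal(Product product) {
            // Assumed rule: unit price times number of cases.
            return product.getUnitPrice().multiply(BigDecimal.valueOf(product.getCases()));
        }
    },
    BY_WEIGHT {
        @Override
        BigDecimal calculateInVal(Product product) {
            // Assumed rule: unit price times weight.
            return product.getUnitPrice().multiply(product.getWeight());
        }
    };

    // Each constant must supply its own implementation.
    abstract BigDecimal calculateInVal(Product product);
}

// Hypothetical Product; the fields are assumptions for this sketch.
class Product {
    private final BillType billType;
    private final int cases;
    private final BigDecimal weight;
    private final BigDecimal unitPrice;

    Product(BillType billType, int cases, BigDecimal weight, BigDecimal unitPrice) {
        this.billType = billType;
        this.cases = cases;
        this.weight = weight;
        this.unitPrice = unitPrice;
    }

    int getCases() { return cases; }
    BigDecimal getWeight() { return weight; }
    BigDecimal getUnitPrice() { return unitPrice; }

    // No if/else on the bill type: the enum constant carries the behavior.
    BigDecimal computeInVal() {
        return billType.calculateInVal(this);
    }
}
```

In a JPA entity, such an enum field would typically be mapped with @Enumerated(EnumType.STRING), so the column stores the constant's name and no BILLTYPE table or entity hierarchy is needed.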
