Should entity hold reference to repository?


Suppose we have a class Home and we want it to hold a collection of all the Cats inside that home, but we also want a general repository of Cats that contains every cat in the world. Should Home hold a reference to a specific repository (or maybe a collection) of Cats, or should I just do another lookup in the general repository?

From a domain-driven design perspective you shouldn't have one aggregate root (AR) instance contained in another AR instance and typically one also would not have a reference to a repository in any entity.
So if Home and Cat are both ARs then Home should contain only a list of Cat Ids or a list of value objects (VO) that each represent a cat, e.g. HomeCat that contains the Id and, perhaps, the Name. This also facilitates the persistence around the Home AR since the HomeRepository will be responsible for persistence of both Home and HomeCat.
I must admit that this is where an Entity such as Cat becomes somewhat of a weird thing when it is contained in more than one AR. However, you would still not have the cat repository in the home object but rather have the HomeRepository make use of the CatRepository when retrieving the relevant home instance.
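A minimal sketch of the "list of value objects" option described above, assuming UUID identifiers. The HomeCat name comes from the answer; the moveIn and moveOut methods are hypothetical names chosen only for illustration. Note that Home holds neither Cat instances nor a repository.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.UUID;

// Value object: Home's local view of a cat (its id and, perhaps, its name).
final class HomeCat {
    private final UUID catId;
    private final String name;

    HomeCat(UUID catId, String name) {
        this.catId = Objects.requireNonNull(catId);
        this.name = Objects.requireNonNull(name);
    }

    UUID catId() { return catId; }
    String name() { return name; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof HomeCat)) return false;
        HomeCat other = (HomeCat) o;
        return catId.equals(other.catId) && name.equals(other.name);
    }

    @Override
    public int hashCode() { return Objects.hash(catId, name); }
}

// Home aggregate root: knows which cats live here by identity only.
final class Home {
    private final UUID id;
    private final List<HomeCat> cats = new ArrayList<>();

    Home(UUID id) { this.id = Objects.requireNonNull(id); }

    UUID id() { return id; }

    // Hypothetical method name; adding the same cat id twice is a no-op.
    void moveIn(UUID catId, String name) {
        boolean alreadyHere = cats.stream().anyMatch(c -> c.catId().equals(catId));
        if (!alreadyHere) {
            cats.add(new HomeCat(catId, name));
        }
    }

    // Hypothetical method name.
    void moveOut(UUID catId) { cats.removeIf(c -> c.catId().equals(catId)); }

    List<HomeCat> cats() { return Collections.unmodifiableList(cats); }
}
```

A HomeRepository would then persist Home together with its HomeCat entries, and anything that needs the full Cat would look it up by id elsewhere.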

As Java is based on references, you could keep two collections without any serious harm.
What you need to ensure is that every cat in your home is also in the "world". The problem is that if the cats in your home change a lot, you will need to make several lookups, so you should choose a data structure that makes them as fast as you need (I am thinking of HashMaps).
Using two collections lets you find the cats in your home as fast as possible. However, syncing the collections might be a problem. In that case you can consider the observer pattern: the home observes the world, and if a cat dies, the home checks whether that cat was inside it and removes it.
There are many ways to do what you asked; all you need to do is think about which operations are most frequent and what you need in general. If the collections are small, there is no problem with having a single collection and doing lookups to find each cat's home.
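The observer idea above could look roughly like this. It is only a sketch: World, ObservingHome, and CatRemovedListener are hypothetical names, and cats are reduced to string ids and names to keep it short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical listener for cats leaving the world.
interface CatRemovedListener {
    void catRemoved(String catId);
}

// The general collection of all cats, keyed by id for fast lookup.
final class World {
    private final Map<String, String> catNamesById = new HashMap<>();
    private final List<CatRemovedListener> listeners = new ArrayList<>();

    void addListener(CatRemovedListener listener) { listeners.add(listener); }

    void addCat(String catId, String name) { catNamesById.put(catId, name); }

    boolean contains(String catId) { return catNamesById.containsKey(catId); }

    // Removes the cat and tells every observer about it.
    void removeCat(String catId) {
        if (catNamesById.remove(catId) != null) {
            for (CatRemovedListener listener : listeners) {
                listener.catRemoved(catId);
            }
        }
    }
}

// A home keeps its own small set of cat ids and observes the world to stay in sync.
final class ObservingHome implements CatRemovedListener {
    private final World world;
    private final Set<String> catIds = new HashSet<>();

    ObservingHome(World world) { this.world = world; }

    // Only cats that exist in the world may live in the home.
    void adopt(String catId) {
        if (!world.contains(catId)) {
            throw new IllegalArgumentException("Unknown cat: " + catId);
        }
        catIds.add(catId);
    }

    boolean hasCat(String catId) { return catIds.contains(catId); }

    @Override
    public void catRemoved(String catId) { catIds.remove(catId); }
}
```

The home has to be registered explicitly with world.addListener(home); doing that outside the constructor avoids handing out a half-constructed object.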

If Cat needs to be an aggregate root of its own aggregate accessible by a CatRepository, then it should not be included in the Home aggregate. Your Home entity should reference the associated Cat entities by identity and not by reference.
This is a question of aggregate design and requires a closer look at your domain and how you need to use your entities. One question you could ask yourself is "if I delete Home, should all Cat entities be deleted as well?" Do not put too much emphasis on this question. There are other important factors that need to be considered.
Vaughn Vernon covers this topic in his three-part PDF series Effective Aggregate Design.

It's hard to answer DDD questions with fictional domains (or at least with very little information about them), since DDD is all about modeling the domain just the way it is and maintaining the integrity of its invariants.
As far as I can tell, you do not have any invariant that applies to the relationship between Home and Cat, therefore you do not need a collection of Cat objects within Home's consistency boundary. You could simply have two aggregate roots (Home and Cat) and Cat would reference its Home by identity.
You can use the CatRepository to query all cats from a specific home or all cats independently of their home.
It's important not to put artificial constraints in the model. For instance, if you hold a collection of Cat identities inside Home for no other reason than maintaining the relationship, then you are limiting the scalability of your system: two simultaneous transactions trying to associate a new Cat to the same Home will fail with a concurrency exception for no reason (assuming optimistic concurrency).
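As a rough illustration of this layout, the sketch below lets Cat carry its Home's id and leaves the "cats of a home" question to the repository. InMemoryCatRepository, findByHomeId, and moveTo are hypothetical names, and string ids are an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Cat aggregate root: refers to its Home by identity, never by object reference.
final class Cat {
    private final String id;
    private String homeId; // null means the cat currently has no home

    Cat(String id, String homeId) {
        this.id = id;
        this.homeId = homeId;
    }

    String id() { return id; }
    String homeId() { return homeId; }

    // Changing homes touches only this Cat, so two cats moving into the
    // same Home never compete for the Home aggregate.
    void moveTo(String newHomeId) { this.homeId = newHomeId; }
}

// In-memory stand-in for a real CatRepository.
final class InMemoryCatRepository {
    private final Map<String, Cat> catsById = new LinkedHashMap<>();

    void save(Cat cat) { catsById.put(cat.id(), cat); }

    // All cats, independently of their home.
    List<Cat> findAll() { return new ArrayList<>(catsById.values()); }

    // All cats that live in one specific home.
    List<Cat> findByHomeId(String homeId) {
        return catsById.values().stream()
                .filter(cat -> homeId.equals(cat.homeId()))
                .collect(Collectors.toList());
    }
}
```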

Related

Aggregate root reference in another aggregate root

I currently have two aggregate roots - Customer and AddressBook. Both have some invariants that need to be protected. Customer has reference to AddressBook and I am not sure whether that is the correct way to model my domain because one cannot live without the other and since domain objects should be created using factories I feel like I should not allow creation of Customer without AddressBook and vice versa but obviously one needs to be created before the other. Hope it makes sense.
How should I address my problem?
Another question would be: can we create multiple aggregate roots in a single transaction? I've read that it should not be done in the case of an update.

I currently have two aggregate roots - Customer and AddressBook. Both have some invariants that need to be protected. Customer has reference to AddressBook and I am not sure whether that is the correct way to model my domain because one cannot live without the other
If they really don't make sense without the other, you may want to review the design to see if they are really part of the same consistency boundary.
Can we create multiple aggregate roots in a single transaction?
Technically, yes. It may not be a good idea.
When all of the logically distinct aggregates are stored together, then creating them in a single transaction is straightforward.
But that also introduces a constraint: that those aggregates need to be stored "together". If all of your aggregates are in the same relational database, an all or nothing transaction is not going to be a problem. On the other hand, if each aggregate is persisted into a document store, then you need a store that allows you to insert multiple documents in the same write.
And if your aggregates should happen to be stored in different document stores, then coordinating the writes becomes even more difficult.
I would like to create closely associated AddressBook with him.... Maybe a domain event would be a more suitable option?
Perhaps; using a domain event to signal a handler to invoke another transaction is a common pattern for automating work. See Evolving Business Processes a la Lokad for a good introduction to process managers.
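A sketch of that event-driven option, under several assumptions: CustomerCreated, CreateAddressBookHandler, and releaseEvents are hypothetical names, event dispatch is done by hand, and a HashMap stands in for an AddressBook repository.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain event raised when a Customer is created.
final class CustomerCreated {
    private final String customerId;

    CustomerCreated(String customerId) { this.customerId = customerId; }

    String customerId() { return customerId; }
}

final class AddressBook {
    private final String ownerId;

    AddressBook(String ownerId) { this.ownerId = ownerId; }

    String ownerId() { return ownerId; }
}

final class Customer {
    private final String id;
    private final List<CustomerCreated> pendingEvents = new ArrayList<>();

    private Customer(String id) { this.id = id; }

    // Factory: creating the Customer records an event instead of building the AddressBook inline.
    static Customer create(String id) {
        Customer customer = new Customer(id);
        customer.pendingEvents.add(new CustomerCreated(id));
        return customer;
    }

    String id() { return id; }

    // Hands the recorded events to the caller once (for publishing after the first transaction commits).
    List<CustomerCreated> releaseEvents() {
        List<CustomerCreated> released = new ArrayList<>(pendingEvents);
        pendingEvents.clear();
        return released;
    }
}

// Handler that runs afterwards, conceptually in its own transaction.
final class CreateAddressBookHandler {
    // Stand-in for an AddressBook repository.
    private final Map<String, AddressBook> addressBooksByOwner = new HashMap<>();

    // putIfAbsent keeps the handler idempotent if the same event is delivered twice.
    void on(CustomerCreated event) {
        addressBooksByOwner.putIfAbsent(event.customerId(), new AddressBook(event.customerId()));
    }

    AddressBook findByOwner(String customerId) { return addressBooksByOwner.get(customerId); }
}
```

With this shape there is a window in which a Customer exists without an AddressBook, so the rest of the model has to tolerate that.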

Create aggregate root in the context of another aggregate root

I'm currently struggling with the creation of instances in a DDD context.
I have read and searched a lot, and sometimes thought I had found the answer, only to realize that it doesn't feel right while programming it.
This is my situation:
I have two aggregate roots, Scenario and Step. I made those ARs because they encapsulate related elements of the domain and each AR should be in a consistent state.
Multiple Steps can exist in the context of a Scenario. They cannot exist on their own.
The "name/natural id" of each Step in the context of its Scenario has to be unique. Changes in Scenario do not automatically influence its Steps and vice versa (e.g. Step doesn't care if Scenario changes some descriptions or images).
Different Steps of a Scenario can be used, edited, etc. at the same time.
At the moment, each Step holds a reference to its Scenario by the corresponding natural identifier. The Scenario class doesn't know anything about its Steps, so it does not hold a collection with Step references.
How would I create a new Step for a given Scenario?
Should I load the Scenario and call something like createNewStep(...) on it? That would not enforce the uniqueness constraint (which is in fact a business constraint and not a technical one), because Scenario doesn't know about its Steps. I would probably have to go with some kind of "disconnected domain model" then, or pass a repository or service to the method to perform the checks.
Should I use a domain service that enforces the constraint, queries the repository, and finally creates and returns the Step?
Should Scenario simply know about its Steps? I think I would like to avoid this one, since that would create an ugly-to-maintain bidirectional relationship.
One could imagine other use cases, such as a Step being classified by options that are provided by the specific Scenario. In this case, and if there were no constraints regarding the "collection" of Steps, I would probably go with the first "solution". Then again: if the classification is changed afterwards, access to the Scenario would be necessary to check for the allowed classifications. That brings me to a possible 4th solution:
Using some kind of "combination" of the possible solutions. Would it be a good idea to create the domain service (accessing everything needed) and pass it as an argument to the method that needs it? The method would then call the service where needed and the "domain logic" stays in the entity/model.
Thank you in advance!
I'll just edit instead of copy paste answering ;)
Thank you all for your responses! :)
Pushing the steps back into the scenario would lead to some pretty big objects, which I'm trying to avoid (the currently running application really suffers from this). It seems to be pretty much like the Scrum example in Vaughn's "Effective Aggregate Design", where he uses domain services to get smaller aggregates (I really don't know why I'm so uncertain about using domain services). It looks like I'll have to use domain services or split the aggregates up into "StepName" and "StepDetails" as suggested.

For background, you should read what Greg Young has to say about set validation (via WaybackMachine). In particular, you really need to evaluate, in the context of your solution, what is the business impact of having a failure?
Accept the failure and escalate is by far your easiest option. In what follows, I assume that the business impact of the failure is large, so we need to prevent it from happening.
The "name/natural id" of each Step in the context of its Scenario has to be unique
That's a classic set validation concern.
The first thing to do is challenge the assumptions in your model
Is your model the book of record for "name"? If your model isn't the authority, you have to be very cautious about introducing constraints. Understanding the boundaries of your model's authority is really important.
Is there an invariant that couples the name of a step to any other part of its state? Aggregate design discipline says that two pieces of state coupled by an invariant need to be in the same aggregate, but it's silent about properties that don't participate in an invariant.
Is it reasonable to reject a name change while accepting other changes to a step? This is really a variation of the previous -- can tasks be split into two different commands (one involving name, one not) that can succeed or fail independently?
In short, the invariant may be telling you that "step name", as a piece of state, belongs in the scenario aggregate rather than in the step aggregate.
If you think about the problem from the perspective of a relational model, we're looking at a tuple (scenarioId, name, stepId), and the constraint says that (scenarioId, name) form a unique key. That's a hint that step name belongs to the scenario. In code, that signature looks like a scenario data structure that includes a Map<StepName, StepId>.
That won't necessarily solve all of your problems of course, but it is a step toward aligning the model with your actual business.
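To make the "step name belongs to the scenario" idea concrete, here is a minimal sketch in which the Scenario aggregate owns a name-to-id map and therefore can enforce uniqueness itself. Plain strings stand in for the name and id value types, and registerStep, renameStep, and stepIdFor are hypothetical method names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Scenario owns the step names; the rest of each Step's state lives in its own aggregate.
final class Scenario {
    private final String id;
    private final Map<String, String> stepIdsByName = new HashMap<>();

    Scenario(String id) { this.id = id; }

    String id() { return id; }

    // Claims a name for a step; rejects names already used within this scenario.
    void registerStep(String name, String stepId) {
        if (stepIdsByName.containsKey(name)) {
            throw new IllegalStateException(
                    "Step name already used in scenario " + id + ": " + name);
        }
        stepIdsByName.put(name, stepId);
    }

    // Renaming is a Scenario command, so it can fail independently of other Step edits.
    void renameStep(String oldName, String newName) {
        if (oldName.equals(newName)) {
            return;
        }
        if (!stepIdsByName.containsKey(oldName)) {
            throw new IllegalArgumentException("No step named " + oldName);
        }
        if (stepIdsByName.containsKey(newName)) {
            throw new IllegalStateException("Step name already used: " + newName);
        }
        stepIdsByName.put(newName, stepIdsByName.remove(oldName));
    }

    Optional<String> stepIdFor(String name) {
        return Optional.ofNullable(stepIdsByName.get(name));
    }
}
```

Concurrent edits to the details of different Steps stay independent; only commands that touch names go through the Scenario.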
When that doesn't work...
The "real" answer is to move the step entity back into the scenario aggregate. One way to think about it is this -- all of the entities taken together form "the model" that we are keeping consistent. The aggregates aren't part of the business, per se; they are artificial, independent subdivisions within the model -- we identify and isolate aggregates as a performance optimization; we can perform concurrent edits, and evaluate the validity of a command while loading a much smaller data set.
If the failures make the performance optimization too expensive, you take it out. So you can see that we have an estimate, of sorts, for what it means that the business impact is "large"; it needs to be bigger than the savings we get from using aggregates on the happy path.
Another possibility is to shift where you enforce the invariant. Relational databases are really really good at set validation. So maybe the right answer is to split the enforcement concern: put the invariant into your schema as a constraint, and ignore that constraint in code.
This isn't ideal for a number of reasons -- you've effectively "hidden" the constraint, you've introduced a constraint on the kind of data store that you use for your aggregates, you've introduced a constraint that requires that you store your step aggregates in the same database as the scenario they belong to, and so on. If you squint, you'll see that this is really just the "make the step entities part of the scenario" solution, but in disguise.
But keep in mind: part of the point of domain-driven-design is that we can push back on the business when the code is telling us that the business model itself is wrong. Where's the cost benefit analysis?
Here's the thing about uniqueness constraints: the model enforces uniqueness, not correctness. Imagine a data race, two different commands that each claim the same "name" for a different step in the scenario -- perhaps caused by a data entry error. The model, presumably, can't tell which command is "right", so it's going to make some arbitrary guess (most likely, first command wins). If the model guesses wrong, it has effectively blocked the client that provided correct data!
In cases where the model is the authority, uniqueness constraints can make sense -- the SeatMap aggregate can enforce the constraint that only one ticket can be assigned to a seat at any given time, because it is the authority for assignment.

DDD structure example

I am trying to structure an application using DDD and onion/hexagonal/clean architecture (using Java and Spring). I find it easier to find guidance on the concepts themselves than actually how to implement them. DDD in particular seems rather tricky to find examples that are instructive because each problem is unique. I have seen numerous examples on SO that have been helpful but I still have questions. I wonder whether going through my example would help me and anyone else.
I hope you can forgive me asking more than one question here. The example seems too big for it to make sense me repeating it in multiple questions.
Context:
We have an application that should display information about soccer stats and has the following concepts (for simplicity I have not included all attributes):
Team, which has many Players.
Player.
Fixture, which has 2 Teams and 2 Halves.
Half, which has 2 FormationsPlayed and many Combinations.
FormationPlayed, which has many PositionsPlayed.
PositionPlayed, which has 1 Player and a position value object.
Combination, which can be of 2 types, and has many Moves.
Move, which can be of 2 types, has 1 Player and an event value object.
As you can imagine, trying to work out which things are aggregate roots here is tricky.
Team can exist independently so is an AR.
Player can exist independently so is an AR.
Fixture, when deleted, must also delete its Halves, so is an AR.
Half must be an entity in Fixture.
FormationPlayed must be deleted when a half is deleted, so perhaps this should be an entity in Half.
PositionPlayed must be deleted when a Formation is deleted, so believe this should be an entity in FormationPlayed.
Combination in a sense can exist independently, though is tied to a particular game half. Perhaps this could be an AR tied by eventual consistency.
Move must be deleted when a Combination is deleted, so believe this should be an entity in Combination.
Questions:
Do you see any errors in the above design? If so what would you change?
The Fixture - Half - FormationPlayed - PositionPlayed aggregate seems too large, so I wonder whether you would agree that this could be split into Fixture - Half and FormationPlayed - PositionPlayed using eventual consistency. The thing I can't find an example of is how this is implemented in Java? If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture). My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Initially the aim is only to allow the client to get the data and display it. Ultimately I want clients to be able to perform CRUD themselves, and I want all invariants to be held together by the domain model when this happens. Would it simplify things (and can you show me or point me to example explaining how) to have two domain models, one simple for data retrieval and one rich for the operations to be performed later? Two BCs, as it were. The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database, but I also don't want to create trouble for myself down the line if it is better to create one rich domain model now in view of the usecases envisioned later. I wonder, if I were to create a simpler model for data retrieval only, which concepts in DDD could be ignored (would I still need to break up large aggregates, for example?)
I hope this all makes sense. Obviously happy to explain further if needed. Realise I'm asking a lot here and I may have confused some ideas. Any answers and wisdom you can give to this would be greatly appreciated !

Do you see any errors in the above design? If so what would you change?
There might be a big one: is your system the book of record? or is it just keeping track of events that happen in the "real world". In a sense, the point of aggregates is to ensure that the book of record is internally consistent, but if you aren't the book of record....
For an example of what I mean
http://www.soccerstats.com/ -- the book of record is the real world.
https://www.easports.com/fifa -- the games are played in the computer
If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
Udi Dahan wrote: Don't Delete, Just Don't. If an entity has a lifecycle, and that lifecycle has an end, then you mark it, but you don't remove the entity.
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture)
Great! Be warned, a lot of the examples that you will find online don't get this part right -- for historical reasons, many demonstrations of model are tightly coupled to the side effects that they have on persistence.
My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Ah -- OK, this one is fun. Don't confuse surrogate keys used in the persistence layer with identifiers in the domain model. For instance, when I look at my purchasing history on Amazon, each of my orders (presumably an aggregate) has an ORDER # associated with it. That would imply that the domain level knows about OrderNumber as a value type. The persistence solution in the back end might introduce surrogate keys when storing that data, but those keys are not used by the model.
Note that I've chosen an example where the aggregate is clearly the authority -- the order only really exists within the model. When the real world is the book of record, you often don't have a unique identifier available (what is Lionel Messi's PlayerId?)
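The difference between a domain identifier and a persistence surrogate key can be sketched as follows. OrderNumber comes from the Amazon example above; the Order and Shipment classes and the sample number format are assumptions made only for illustration.

```java
import java.util.Objects;

// Domain-level identifier modeled as a value type; it says nothing about database keys.
final class OrderNumber {
    private final String value;

    OrderNumber(String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Order number must not be blank");
        }
        this.value = value;
    }

    String value() { return value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof OrderNumber && ((OrderNumber) o).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public String toString() { return value; }
}

// The aggregate is identified by its OrderNumber. Any surrogate key the
// persistence layer adds when storing it is not part of this class.
final class Order {
    private final OrderNumber number;

    Order(OrderNumber number) { this.number = Objects.requireNonNull(number); }

    OrderNumber number() { return number; }
}

// Hypothetical second aggregate that references the Order by identity only.
final class Shipment {
    private final OrderNumber orderNumber;

    Shipment(OrderNumber orderNumber) { this.orderNumber = Objects.requireNonNull(orderNumber); }

    OrderNumber orderNumber() { return orderNumber; }
}
```

PositionPlayed could reference Player the same way, provided the domain can agree on what identifies a player.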
The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database
A couple of thoughts on this -- DDD is usually saved for more complicated use cases (Greg Young: "is this where you get a competitive advantage?"). Most of the power of aggregates comes from the fact that they ensure the consistency of changes of state. When your real problem is data entry and reporting, it tends to be overkill.
Detection and remediation of inconsistencies is often easier/cheaper than trying to get prevention right; and may be satisfactory to the business, given the costs. Something to keep in mind.
The application is keeping track of events in the real world. At the moment, they are recorded manually in a database. Can you be explicit why you believe the distinction is important?
Very roughly -- events indicate things that have already happened. It's too late for the domain to veto them; the real world is outside of the domain's control.
Furthermore, we have to keep in mind that, since the real world is the book of record, things may have happened in the real world that our domain model doesn't know about yet (the reporting of events may be delayed, lost, reordered, and so on).
Aggregates are supposed to be a source of truth. Which means that they can only govern entities in the digital world.
One kind of information resource that you could create is a report of Messi's goals in a season. So every time a goal is reported, you run a command to update the report aggregate. That's not anemic -- not exactly -- but it's not very interesting. It's really just a view (in CQRS terms, it's a read model) that you can recreate from the history of events. It doesn't have any intelligence in it.
The interesting aggregates are those that make decisions for themselves, based on the information that they are given.
A contrived example of an aggregate would be one that, if a player scores more than 10 goals in a season, orders that player's jersey for you. Notice that while "goals" are something already present in your event stream, the business rule isn't. That's purely a domain model thing.
So the way that this would work is that each time a goal event appeared, you would load the JerseyPurchasing aggregate, and tell it about the goal. And that aggregate would make sure that this was a new goal (not one that had previously been reported), determine if the number of goals called for ordering a shirt, and check to see if the order for the shirt had already been placed.
Key idea here -- the goals are something that the aggregate is told about. The decision to purchase a jersey is made by the aggregate, and shared with the world.
Later, you realize that sometimes a player gets traded, and then scores a 10th goal. And you have to determine as a business whether that means you get one shirt (which?) or one shirt for each jersey, or maybe you only order jerseys if he scored 10 goals for a specific team in a season. All of this logic goes into the aggregate.
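A sketch of what that aggregate might look like, using the "more than 10 goals" rule from the contrived example. The recordGoal method, the goal ids, and the boolean return value (standing in for "the decision shared with the world") are all assumptions; the trade-related rules are left out.

```java
import java.util.HashSet;
import java.util.Set;

// The aggregate is told about goals; it alone decides when a jersey gets ordered.
final class JerseyPurchasing {
    private static final int GOALS_TO_EXCEED = 10; // "more than 10 goals in a season"

    private final String playerId;
    private final Set<String> reportedGoalIds = new HashSet<>();
    private boolean jerseyOrdered;

    JerseyPurchasing(String playerId) { this.playerId = playerId; }

    String playerId() { return playerId; }

    boolean jerseyOrdered() { return jerseyOrdered; }

    // Returns true only when this particular report triggers the jersey order.
    boolean recordGoal(String goalId) {
        // Ignore goals that were already reported (events can be redelivered).
        if (!reportedGoalIds.add(goalId)) {
            return false;
        }
        // Place the order once, when the goal count first exceeds the threshold.
        if (!jerseyOrdered && reportedGoalIds.size() > GOALS_TO_EXCEED) {
            jerseyOrdered = true;
            return true;
        }
        return false;
    }
}
```

A caller would load the aggregate for the player, call recordGoal for each incoming goal event, and place the actual order when it returns true.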
a domain model as per onion architecture that, can you point me to any good examples?
Best place to look, as weird as it sounds, is among the functional programming types. Mark Seemann's blog includes a lot of important ideas that will help here.
The main idea to keep in mind is that the model sits at the bottom. The app passes state to the model, and gets state back (in CQS terminology, you query the model). The app is responsible for sharing the results obtained from the model with the persistence component.
do you believe the accepted view would be that an anaemic model should be adopted for a domain this size
In the case where you are just re-organizing information from the real world for easier consumption? Yeah - load document, update document, store document makes a lot more sense to me than going overboard with a bunch of aggregate modeling. But don't read too much into that -- I don't know more about your model than what you have written here. If there's real business complexity in how you evaluate the information from the real world, then the answer would be different.

ORMLite - force read objects to have same identity

I'm reading a hierarchy of objects with ORMLite. It is shaped like a tree: parents have a @ForeignCollection of 0+ children, and every child refers to its parent with @DatabaseField(foreign=true). I'm reading and saving the whole hierarchy at once.
As I'm new to ORM in general, and to ORMLite as well, I didn't know that when objects with the same ID in the database are read, they won't be created as the very same object with the same identity, but as several duplicates having the same ID. Meaning, I'm now facing the problem that (let's say "->" stands for "refers to") A -> B -> C != C -> B -> A.
I was thinking of solving the problem by manually reading them through the provided DAOs and putting them together by their ID, ensuring that objects with the same ID have the same identity.
Is there an ORMLite-native way of solving this? If yes, what is it? If not, what are common ways of solving this problem? Is this a general problem of ORM? Does it have a name (I'd like to learn more about it)?
Edit:
My hierarchy is so that one building contains several floors, where each floor knows its building, and each floor contains several zones, where every zone knows its floor.

Is this a general problem of ORM? Does it have a name (I'd like to learn more about it)?
It is a general pattern for ORMs and is called “Identity Map”: within a session, no matter where in your code you got a mapped object from the ORM, there will be only one object representing a specific row in the db (i.e. having its primary key).
I love this pattern: you can retrieve something from the db in one part of your code, even make modifications to it, store that object in an instance variable, etc. And in another part of the code, if you get hold of an object for the same “db row” (by whatever means: you got it passed as an argument, you made a bulk query to the db, you created a “new” mapped object with the primary key set to the same value and added it to the session), you will end up with the same object. Even the modifications from before (including unflushed ones) will be there.
(Adding a mapped object to the session may fail because of this, and depending on the ORM and programming language this adding may give you another object back as “the same”.)
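Stripped of everything an ORM adds around it, the core of an identity map is a per-session lookup table keyed by primary key. The sketch below is not ORMLite or Hibernate code; IdentityMap, its get method, and the Building class are hypothetical, and the loader function stands in for a database read.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simple mapped object used for the demonstration.
final class Building {
    private final int id;
    private String name;

    Building(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int id() { return id; }
    String name() { return name; }
    void rename(String newName) { this.name = newName; }
}

// One instance per primary key for the lifetime of this map (the "session").
final class IdentityMap<K, V> {
    private final Map<K, V> loadedById = new HashMap<>();

    // Returns the already-loaded object for this id, or loads and remembers it.
    V get(K id, Function<K, V> loader) {
        return loadedById.computeIfAbsent(id, loader);
    }
}
```

Because both lookups for the same id return the same instance, a change made through one reference is visible through the other, which is exactly the behavior described above.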

Unfortunately there is no ORMLite-native way of solving this problem. More complex ORM systems (such as Hibernate) have caching layers which are there specifically for this reason. ORMLite does not have a cache layer, so it doesn't know that it just returned an object with the same id "recently". Here's the documentation of Hibernate caching:
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html
However, ORMLite is designed to be Lite and cache layers violate that designation IMO. About the only [unfortunate] solution that I see to your issue in ORMLite is to do what you are doing -- rebuilding the object tree based on the ids. If you give more details about your hierarchy we may be able to help more specifically.
So after thinking about your case a bit more @riwi, it occurred to me that if you have a Building that contains a collection of Floors, there is no reason why the Building object on each of the Floors in the collection cannot be set with the parent Building object. Duh. ORMLite has all of the information it needs to make this happen. I implemented this behavior and it was released in version 4.24.
Edit:
As of ORMLite version 4.26 we added an initial take on an object-cache that can support the requested features asked for. Here are the docs:
http://ormlite.com/docs/object-cache

Hiding deleted objects

I have the following use case: There's a class called Template and with that class I can create instances of the ActualObject class (ActualObject copies its initial data from the Template). The Template class has a list of Product:s.
Now here comes the tricky part, the user should be able to delete Products from the database but these deletions may not affect the content of a Template. In other words, even if a Product is deleted, the Template should still have access to it. This could be solved by adding a flag "deleted" to the Product. If a Product is deleted, then it may not be searched explicitly from the database, but it can be fetched implicitly (for example via the reference in the Template class).
The idea behind this is that when an ActualObject is created from a template, the user is notified in the user interface that "The Template X had a Product Z with the parameters A, B and C, but this product has been deleted and cannot be added as such in ActualObject Z".
My problem is how I should mark these deleted objects as deleted. Before someone suggests that just update the delete flag instead of doing an actual delete query, my problem is not that simple. The delete flag and its behaviour should exist in all POJOs, not just in Product. This means I'll be getting cascade problems. For example, if I delete a Template, then the Products should also be deleted and each Product has a reference to a Price-object which also should be deleted and each Price may have a reference to a VAT-object and so forth. All these cascaded objects should be marked as deleted.
My question is how can I accomplish this in a sensible manner. Going through every object (which are being deleted) checking each field for references which should be deleted, going through their references etc is quite laborious and bugs are easy to slip in.
I'm using Hibernate, and I was wondering whether Hibernate has any such built-in feature. Another idea I came up with was to use Hibernate interceptors to turn an actual SQL delete query into an update query (I'm not even 100% sure this is possible). My only concern is whether Hibernate relies on cascades in foreign keys, in other words, whether the cascaded deletes are done by the database and not by Hibernate.

My problem is how I should mark these deleted objects as deleted.
I think you have chosen a very complex way to solve the task. It would be easier to introduce a ProductTemplate. Put into this object all the properties you need, along with a reference to a Product instance. Then, instead of marking a Product, you can just delete it (and delete all its other entities, such as prices). And, of course, you should clear the reference in ProductTemplate. When you create an instance of ActualObject, you will be able to notify the user with an appropriate message.

I think you're trying to make things much more complicated than they should be... anyway, what you're trying to do is handling Hibernate events, take a look at Chapter 12 of Hibernate Reference, you can choose to use interceptors or the event system.
In any case... well good luck :)

public interface Deletable {
    public void delete();
}
Have all your deletable objects implement this interface. In their implementations, update the deleted flag and have them call their children's delete() method also - which implies that the children must be Deletable too.
Of course, upon implementation you'll have to manually figure which children are Deletable. But this should be straightforward, at least.
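A minimal sketch of how that cascade could look for the Template, Product, and Price chain from the question. The interface is repeated so the example compiles on its own; the fields, constructors, and isDeleted accessors are assumptions, and persistence is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

interface Deletable {
    void delete();
}

// Leaf object: only flips its own flag.
final class Price implements Deletable {
    private boolean deleted;

    @Override
    public void delete() { deleted = true; }

    boolean isDeleted() { return deleted; }
}

// Marks itself deleted, then cascades to the child it owns.
final class Product implements Deletable {
    private final Price price;
    private boolean deleted;

    Product(Price price) { this.price = price; }

    @Override
    public void delete() {
        deleted = true;
        price.delete();
    }

    boolean isDeleted() { return deleted; }
}

// Root of the cascade: deleting a Template soft-deletes everything below it.
final class Template implements Deletable {
    private final List<Product> products;
    private boolean deleted;

    Template(List<Product> products) { this.products = new ArrayList<>(products); }

    @Override
    public void delete() {
        deleted = true;
        for (Product product : products) {
            product.delete();
        }
    }

    boolean isDeleted() { return deleted; }
}
```

Each class decides by hand which children to cascade to, which is the manual step the answer mentions.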

If I understand what you are asking for: if you add a @OneToMany relationship between the template and the product and select your cascade rules, you will be able to delete all associated products for a given template. In your product class, you can add the "deleted" flag as you suggested. This deleted flag would be used by your service/DAO layer, e.g. a getProducts(boolean includeDeleted) type of method could determine whether the "deleted" records are included in the result. In this fashion you can control what end users see, but still expose full functionality to internal business users.
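The getProducts(boolean includeDeleted) idea could be sketched like this. An in-memory list stands in for the Hibernate-backed table, and FlaggedProduct, ProductDao, and markDeleted are hypothetical names; a real DAO would push the filter into the query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Product with the soft-delete flag suggested in the question.
final class FlaggedProduct {
    private final String name;
    private boolean deleted;

    FlaggedProduct(String name) { this.name = name; }

    String name() { return name; }
    boolean isDeleted() { return deleted; }
    void markDeleted() { deleted = true; }
}

// DAO-style access point that decides whether soft-deleted rows are visible.
final class ProductDao {
    // Stand-in for the product table.
    private final List<FlaggedProduct> rows = new ArrayList<>();

    void save(FlaggedProduct product) { rows.add(product); }

    // End-user code passes false; internal code that must still see
    // deleted products (e.g. via a Template) passes true.
    List<FlaggedProduct> getProducts(boolean includeDeleted) {
        return rows.stream()
                .filter(product -> includeDeleted || !product.isDeleted())
                .collect(Collectors.toList());
    }
}
```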

The flag to delete should be a part of the Template Class itself. That way all the Objects that you create have a way to be flagged as alive or deleted. The marking of the Object to be deleted, should go higher up to the base class.
