Composition in UML

In UML diagrams, should composition be used in the logical sense or in the implementation sense? An example of each:
Implementation - An airport will contain a reference to the country. In other words, a country is part of the Airport.
Logical - A country can have zero or many airports. In other words, an airport is part of the country.
From the diagram above, which case shows the correct usage of composition?
NOTE: if neither of these cases is good, please suggest other ways to show the relationship between country and airport.

I think that this is not a composition in the strong "UML sense" of that word.
From wikipedia:
The relationship between the composite and the component is a strong “has a” relationship, as the composite object takes ownership of the component. This means the composite is responsible for the creation and destruction of the component parts.
An Airport does not create countries (and in the IT sense, a "country" object is also not responsible for providing or creating "airport" objects).
In that sense, you are looking at an association here, and I think the first one fits better (speaking in general). But the core aspect is: your model has to express the specific requirements of your domain. In other words, both solutions are valid; which one to choose depends very much on the context. So pick the one that helps you solve your problem!

This would depend on the ultimate model you're trying to build.
If your model includes countries as first-class objects or entities then clearly a country contains airports. If Country is an attribute of an airport (more properly an attribute of the airport's location), then model it as such.
Unless you have a good reason to model countries as entities I'd go with the attribute, since borders can shift and airports can change countries.
In other words, this question doesn't have a definite answer; either can work depending on your ultimate goals.
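A minimal Java sketch of the two options described in this answer may make the trade-off concrete. All class, field and method names here are illustrative assumptions, not taken from the question; the country is represented by a plain code string for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Option A: the country is only an attribute of the airport.
final class AirportWithAttribute {
    private final String name;
    private String countryCode; // mutable, since borders can shift

    AirportWithAttribute(String name, String countryCode) {
        this.name = name;
        this.countryCode = countryCode;
    }

    String getName() { return name; }
    String getCountryCode() { return countryCode; }

    // An airport can end up in another country without any Country object changing.
    void reassignCountry(String newCountryCode) { this.countryCode = newCountryCode; }
}

// Option B: the country is a first-class entity that contains airports.
final class Airport {
    private final String name;

    Airport(String name) { this.name = name; }

    String getName() { return name; }
}

final class Country {
    private final String code;
    private final List<Airport> airports = new ArrayList<>();

    Country(String code) { this.code = code; }

    String getCode() { return code; }

    void addAirport(Airport airport) { airports.add(airport); }

    // Read-only view, so callers cannot bypass addAirport.
    List<Airport> getAirports() { return Collections.unmodifiableList(airports); }
}
```

Option A keeps the model small; option B only pays off if Country has behaviour or data of its own.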

If I may suggest, see Section 9.5.3 of the UML specification (March 2015):
Composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects (see the definition of parts in 11.2.3).
and
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
For me, the question of implementation versus logical is more a question about the database model. But maybe I am wrong.
If I had to make a choice, I would design the implementation case only.
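To illustrate what the quoted definition of composite aggregation implies in code, here is a small Java sketch under stated assumptions: Building and Room are hypothetical names unrelated to the airport example, and because Java has no destructors, "deleting" the parts is approximated by dropping the composite's references to them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical composite: a Building owns its Rooms.
final class Building {

    // The part type. Its constructor is private, so only the enclosing
    // Building can create a Room (the composite creates its parts).
    static final class Room {
        private final String label;

        private Room(String label) {
            this.label = label;
        }

        String getLabel() {
            return label;
        }
    }

    private final List<Room> rooms = new ArrayList<>();

    // Each Room is created here and stored in exactly one Building.
    Room addRoom(String label) {
        Room room = new Room(label);
        rooms.add(room);
        return room;
    }

    int roomCount() {
        return rooms.size();
    }

    // Stand-in for "if the composite is deleted, its parts are deleted with it".
    void demolish() {
        rooms.clear();
    }
}
```

Neither Airport nor Country creates or destroys the other in this way, which is why several answers here prefer a plain association.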

You can say that an airport is part of a country, or in other words that the country is responsible for it. So from a business perspective the second approach is correct. To indicate that the airport knows about its country, you can add an open arrow pointing toward the country just before the diamond.
Regardless of what exactly you want to model, the first diagram is incorrect. With that diagram each airport would be responsible for one country, so each airport would be in a different country.

Create aggregate root in the context of another aggregate root

I'm currently struggling with the creation of instances in the DDD context.
I have read and searched a lot, and sometimes thought I had found the answer, only to realize that it doesn't feel right while programming it.
This is my situation:
I have two aggregate roots, Scenario and Step. I made those ARs because they encapsulate related elements of the domain and each AR should be in a consistent state.
Multiple Steps can exist in the context of a Scenario. They cannot exist on their own.
The "name/natural id" of each Step in the context of its Scenario has to be unique. Changes in Scenario do not automatically influence its Steps and vice versa (e.g. Step doesn't care if Scenario changes some descriptions or images).
Different Steps of a Scenario can be used, edited, etc. at the same time.
At the moment, each Step holds a reference to its Scenario by the corresponding natural identifier. The Scenario class doesn't know anything about its Steps, so it does not hold a collection with Step references.
How would I create a new Step for a given Scenario?
Should I load the Scenario and call something like createNewStep(...) on it? That would not enforce the uniqueness constraint (which is in fact a business constraint and not a technical one), because Scenario doesn't know about its Steps. I would probably have to go with some kind of "disconnected domain model" then, or pass a repository or service to the method to perform the checks.
Should I use a domain service that enforces the constraint, queries the repository, and finally creates and returns the Step?
Should Scenario simply know about its Steps? I think I would like to avoid this one, since that would create an ugly-to-maintain bidirectional relationship.
One could imagine other use cases, such as a Step being classified by options that the specific Scenario provides. In this case, and if there were no constraints regarding the "collection" of Steps, I would probably go with the first "solution". Then again, if the classification is changed afterwards, access to the Scenario would be necessary to check the allowed classifications. That brings me to a possible fourth solution:
Using some kind of "combination" of the possible solutions. Would it be a good idea to create the domain service (accessing everything needed) and pass it as an argument to the method that needs it? The method would then call the service where needed, and the "domain logic" stays in the entity/model.
Thank you in advance!
I'll just edit instead of copy paste answering ;)
Thank you all for your responses! :)
Pushing the steps back into the scenario would lead to some pretty big objects, which I'm trying to avoid (the currently running application really suffers from this). It seems to be much like the Scrum example in Vaughn Vernon's "Effective Aggregate Design", where he uses domain services to get smaller aggregates (I really don't know why I'm so uncertain about using domain services). It looks like I'll have to use domain services or split the aggregates up into "StepName" and "StepDetails" as suggested.

For background, you should read what Greg Young has to say about set validation (via WaybackMachine). In particular, you really need to evaluate, in the context of your solution, what is the business impact of having a failure?
Accept the failure and escalate is by far your easiest option. In what follows, I assume that the business impact of the failure is large, so we need to prevent it from happening.
The "name/natural id" of each Step in the context of its Scenario has to be unique
That's a classic set validation concern.
The first thing to do is challenge the assumptions in your model
Is your model the book of record for "name"? If your model isn't the authority, you have to be very cautious about introducing constraints. Understanding the boundaries of your model's authority is really important.
Is there an invariant that couples the name of a step to any other part of its state? Aggregate design discipline says that two pieces of state coupled by an invariant need to be in the same aggregate, but it's silent about properties that don't participate in an invariant.
Is it reasonable to reject a name change while accepting other changes to a step? This is really a variation of the previous -- can tasks be split into two different commands (one involving name, one not) that can succeed or fail independently?
In short, the invariant may be telling you that "step name", as a piece of state, belongs in the scenario aggregate rather than in the step aggregate.
If you think about the problem from the perspective of a relational model, we're looking at a tuple (scenarioId, name, stepId), and the constraint says that (scenarioId, name) form a unique key. That's a hint that step name belongs to the scenario. In code, that signature looks like a scenario data structure that includes a Map<StepName, StepId>.
That won't necessarily solve all of your problems of course, but it is a step toward aligning the model with your actual business.
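As a rough sketch of that idea, assuming step names are plain strings and step ids are random UUIDs (both are simplifications chosen for illustration, as is the registerStep method name), the scenario aggregate could own the name-to-id map and reject duplicates itself:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical Scenario aggregate that owns the set of step names.
final class Scenario {
    private final String scenarioId;

    // Step name -> step id, scoped to this scenario. The map key is what
    // enforces the (scenarioId, name) uniqueness rule.
    private final Map<String, UUID> stepIdsByName = new HashMap<>();

    Scenario(String scenarioId) {
        this.scenarioId = scenarioId;
    }

    // Claims a name for a new step and returns the id the Step aggregate
    // should be created with. Throws if the name is already taken.
    UUID registerStep(String stepName) {
        if (stepIdsByName.containsKey(stepName)) {
            throw new IllegalArgumentException(
                "Step name '" + stepName + "' is already used in scenario " + scenarioId);
        }
        UUID stepId = UUID.randomUUID();
        stepIdsByName.put(stepName, stepId);
        return stepId;
    }

    boolean hasStepNamed(String stepName) {
        return stepIdsByName.containsKey(stepName);
    }
}
```

The rest of a step's state can stay in its own aggregate and be edited concurrently; only name changes have to go through the scenario.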
When that doesn't work...
The "real" answer is to move the step entity back into the scenario aggregate. One way to think about it is this -- all of the entities taken together form "the model" that we are keeping consistent. The aggregates aren't part of the business, per se; they are artificial, independent subdivisions within the model -- we identify and isolate aggregates as a performance optimization; we can perform concurrent edits, and evaluate the validity of a command while loading a much smaller data set.
If the failures make the performance optimization too expensive, you take it out. So you can see that we have an estimate, of sorts, for what it means that the business impact is "large"; it needs to be bigger than the savings we get from using aggregates on the happy path.
Another possibility is to shift where you enforce the invariant. Relational databases are really really good at set validation. So maybe the right answer is to split the enforcement concern: put the invariant into your schema as a constraint, and ignore that constraint in code.
This isn't ideal for a number of reasons -- you've effectively "hidden" the constraint, you've introduced a constraint on the kind of data store that you use for your aggregates, you've introduced a constraint that requires that you store your step aggregates in the same database as the scenario they belong to, and so on. If you squint, you'll see that this is really just the "make the step entities part of the scenario" solution, but in disguise.
But keep in mind: part of the point of domain-driven-design is that we can push back on the business when the code is telling us that the business model itself is wrong. Where's the cost benefit analysis?
Here's the thing about uniqueness constraints: the model enforces uniqueness, not correctness. Imagine a data race, two different commands that each claim the same "name" for a different step in the scenario -- perhaps caused by a data entry error. The model, presumably, can't tell which command is "right", so it's going to make some arbitrary guess (most likely, first command wins). If the model guesses wrong, it has effectively blocked the client that provided correct data!
In cases where the model is the authority, uniqueness constraints can make sense -- the SeatMap aggregate can enforce the constraint that only one ticket can be assigned to a seat at any given time, because it is the authority for assignment.

DDD structure example

I am trying to structure an application using DDD and onion/hexagonal/clean architecture (using Java and Spring). I find it easier to find guidance on the concepts themselves than actually how to implement them. DDD in particular seems rather tricky to find examples that are instructive because each problem is unique. I have seen numerous examples on SO that have been helpful but I still have questions. I wonder whether going through my example would help me and anyone else.
I hope you can forgive me for asking more than one question here. The example seems too big for it to make sense for me to repeat it in multiple questions.
Context:
We have an application that should display information about soccer stats and has the following concepts (for simplicity I have not included all attributes):
Team, which has many Players.
Player.
Fixture, which has 2 Teams and 2 Halves.
Half, which has 2 FormationsPlayed and many Combinations.
FormationPlayed, which has many PositionsPlayed.
PositionPlayed, which has 1 Player and a position value object.
Combination, which can be of 2 types, and has many Moves.
Move, which can be of 2 types, has 1 Player and an event value object.
As you can imagine, trying to work out which things are aggregate roots here is tricky.
Team can exist independently so is an AR.
Player can exist independently so is an AR.
Fixture, when deleted, must also delete its Halves, so is an AR.
Half must be an entity in Fixture.
FormationPlayed must be deleted when a half is deleted, so perhaps this should be an entity in Half.
PositionPlayed must be deleted when a Formation is deleted, so believe this should be an entity in FormationPlayed.
Combination in a sense can exist independently, though is tied to a particular game half. Perhaps this could be an AR tied by eventual consistency.
Move must be deleted when a Combination is deleted, so believe this should be an entity in Combination.
Questions:
Do you see any errors in the above design? If so what would you change?
The Fixture - Half - FormationPlayed - PositionPlayed aggregate seems too large, so I wonder whether you would agree that this could be split into Fixture - Half and FormationPlayed - PositionPlayed using eventual consistency. The thing I can't find an example of is how this is implemented in Java? If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture). My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Initially the aim is only to allow the client to get the data and display it. Ultimately I want clients to be able to perform CRUD themselves, and I want all invariants to be held together by the domain model when this happens. Would it simplify things (and can you show me or point me to example explaining how) to have two domain models, one simple for data retrieval and one rich for the operations to be performed later? Two BCs, as it were. The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database, but I also don't want to create trouble for myself down the line if it is better to create one rich domain model now in view of the usecases envisioned later. I wonder, if I were to create a simpler model for data retrieval only, which concepts in DDD could be ignored (would I still need to break up large aggregates, for example?)
I hope this all makes sense. Obviously happy to explain further if needed. Realise I'm asking a lot here and I may have confused some ideas. Any answers and wisdom you can give to this would be greatly appreciated !

Do you see any errors in the above design? If so what would you change?
There might be a big one: is your system the book of record, or is it just keeping track of events that happen in the "real world"? In a sense, the point of aggregates is to ensure that the book of record is internally consistent, but if you aren't the book of record....
For an example of what I mean
http://www.soccerstats.com/ -- the book of record is the real world.
https://www.easports.com/fifa -- the games are played in the computer
If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
Udi Dahan wrote: Don't Delete, Just Don't. If an entity has a lifecycle, and that lifecycle has an end, then you mark it, but you don't remove the entity.
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture)
Great! Be warned, a lot of the examples that you will find online don't get this part right -- for historical reasons, many demonstrations of model are tightly coupled to the side effects that they have on persistence.
My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Ah -- OK, this one is fun. Don't confuse surrogate keys used in the persistence layer with identifiers in the domain model. For instance, when I look at my purchasing history on Amazon, each of my orders (presumably an aggregate) has an ORDER # associated with it. That would imply that the domain level knows about OrderNumber as a value type. The persistence solution in the back end might introduce surrogate keys when storing that data, but those keys are not used by the model.
Note that I've chosen an example where the aggregate is clearly the authority -- the order only really exists within the model. When the real world is the book of record, you often don't have a unique identifier available (what is Lionel Messi's PlayerId?)
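A minimal sketch of a domain-level identifier, assuming a string-backed PlayerId value type (the class shape and the string representation are illustrative assumptions; the question does not say how players are identified). PositionPlayed refers to the Player aggregate through this id only, and no persistence key appears anywhere:

```java
import java.util.Objects;

// Domain identifier as a value type: equality is by value, not by reference.
final class PlayerId {
    private final String value;

    PlayerId(String value) {
        this.value = Objects.requireNonNull(value, "value");
    }

    String getValue() {
        return value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof PlayerId && ((PlayerId) other).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return "PlayerId(" + value + ")";
    }
}

// Entity inside another aggregate: it holds the Player's id, not the Player.
final class PositionPlayed {
    private final PlayerId playerId;
    private final String position; // stand-in for the position value object

    PositionPlayed(PlayerId playerId, String position) {
        this.playerId = Objects.requireNonNull(playerId, "playerId");
        this.position = Objects.requireNonNull(position, "position");
    }

    PlayerId getPlayerId() {
        return playerId;
    }

    String getPosition() {
        return position;
    }
}
```

If the persistence layer later adds surrogate keys, they stay in the mapping code and never show up in these classes.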
The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database
A couple of thoughts on this -- DDD is usually saved for more complicated use cases (Greg Young: "is this where you get a competitive advantage?"). Most of the power of aggregates comes from the fact that they ensure the consistency of changes of state. When your real problem is data entry and reporting, it tends to be overkill.
Detection and remediation of inconsistencies is often easier/cheaper than trying to get prevention right; and may be satisfactory to the business, given the costs. Something to keep in mind.
The application is keeping track of events in the real world. At the moment, they are recorded manually in a database. Can you be explicit why you believe the distinction is important?
Very roughly -- events indicate things that have already happened. It's too late for the domain to veto them; the real world is outside of the domain's control.
Furthermore, we have to keep in mind that, since the real world is the book of record, things may have happened in the real world that our domain model doesn't know about yet (the reporting of events may be delayed, lost, reordered, and so on).
Aggregates are supposed to be a source of truth. Which means that they can only govern entities in the digital world.
One kind of information resource that you could create is a report of Messi's goals in a season. So every time a goal is reported, you run a command to update the report aggregate. That's not anemic -- not exactly -- but it's not very interesting. It's really just a view (in CQRS terms, it's a read model) that you can recreate from the history of events. It doesn't have any intelligence in it.
The interesting aggregates are those that make decisions for themselves, based on the information that they are given.
A contrived example of an aggregate would be one that, if a player scores more than 10 goals in a season, orders that player's jersey for you. Notice that while "goals" are something already present in your event stream, the business rule isn't. That's purely a domain model thing.
So the way that this would work is that each time a goal event appeared, you would load the JerseyPurchasing aggregate and tell it about the goal. The aggregate would make sure that this was a new goal (not one that had previously been reported), determine whether the number of goals called for ordering a shirt, and check whether the order for the shirt had already been placed.
Key idea here -- the goals are something that the aggregate is told about. The decision to purchase a jersey is made by the aggregate, and shared with the world.
Later, you realize that sometimes a player gets traded, and then scores a 10th goal. And you have to determine as a business whether that means you get one shirt (which?) or one shirt for each jersey, or maybe you only order jerseys if he scored 10 goals for a specific team in a season. All of this logic goes into the aggregate.
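A sketch of the jersey-purchasing aggregate described above, under some loud assumptions: goals are identified by a string id supplied by the reporter, the threshold is passed in rather than fixed (the text mentions both "more than 10 goals" and "a 10th goal"), and the trade rules from the last paragraph are left out entirely.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical aggregate: it is told about goals and decides on its own
// whether a jersey order should be placed.
final class JerseyPurchasing {
    private final int goalThreshold;                 // assumed to be configurable
    private final Set<String> seenGoalIds = new HashSet<>();
    private boolean orderPlaced;

    JerseyPurchasing(int goalThreshold) {
        this.goalThreshold = goalThreshold;
    }

    // Returns true only when this report causes a new order to be placed.
    boolean recordGoal(String goalId) {
        // Goal reports may be repeated; a goal seen before changes nothing.
        if (!seenGoalIds.add(goalId)) {
            return false;
        }
        // The order is placed at most once.
        if (orderPlaced) {
            return false;
        }
        if (seenGoalIds.size() >= goalThreshold) {
            orderPlaced = true;
            return true;
        }
        return false;
    }

    boolean isOrderPlaced() {
        return orderPlaced;
    }
}
```

The goal events come from outside and cannot be vetoed; the only thing this class decides is whether to order.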
Can you point me to any good examples of a domain model as per onion architecture?
Best place to look, as weird as it sounds, is among the functional programming types. Mark Seemann's blog includes a lot of important ideas that will help here.
The main idea to keep in mind is that the model sits at the bottom. The app passes state to the model, and gets state back (in CQS terminology, you query the model). The app is responsible for sharing the results obtained from the model with the persistence component.
do you believe the accepted view would be that an anaemic model should be adopted for a domain this size
In the case where you are just re-organizing information from the real world for easier consumption? Yeah - load document, update document, store document makes a lot more sense to me than going overboard with a bunch of aggregate modeling. But don't read too much into that -- I don't know more about your model than what you have written here. If there's real business complexity in how you evaluate the information from the real world, then the answer would be different.

Should entity hold reference to repository?

Suppose we have class Home and we want to have collection of all Cats inside this home, but also we want to have general repository of Cats that has all the cats available in the world. Should Home hold the reference to specific repository (or maybe collection) of Cats, or should I just make another lookup in general repository?

From a domain-driven design perspective you shouldn't have one aggregate root (AR) instance contained in another AR instance and typically one also would not have a reference to a repository in any entity.
So if Home and Cat are both ARs then Home should contain only a list of Cat Ids or a list of value objects (VO) that each represent a cat, e.g. HomeCat that contains the Id and, perhaps, the Name. This also facilitates the persistence around the Home AR since the HomeRepository will be responsible for persistence of both Home and HomeCat.
I must admit that this is where an Entity such as Cat becomes somewhat of a weird thing when it is contained in more than one AR. However, you would still not have the cat repository in the home object but rather have the HomeRepository make use of the CatRepository when retrieving the relevant home instance.
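A small Java sketch of the HomeCat approach from this answer. The field names and the use of string ids are assumptions made for illustration; the point is only that Home holds lightweight value objects rather than Cat aggregate instances or a repository.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Value object inside the Home aggregate: just enough data to represent a cat.
final class HomeCat {
    private final String catId;
    private final String name;

    HomeCat(String catId, String name) {
        this.catId = catId;
        this.name = name;
    }

    String getCatId() { return catId; }
    String getName() { return name; }
}

// Home aggregate root: no Cat instances and no CatRepository in here.
final class Home {
    private final String homeId;
    private final List<HomeCat> cats = new ArrayList<>();

    Home(String homeId) {
        this.homeId = homeId;
    }

    String getHomeId() { return homeId; }

    void addCat(HomeCat cat) { cats.add(cat); }

    // Read-only view so the list can only change through the aggregate root.
    List<HomeCat> getCats() { return Collections.unmodifiableList(cats); }
}
```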

As Java is based on references, you could keep two collections without any serious harm.
What you need to ensure is that every cat in your home is also in the "world". The problem is that if the cats in your home change a lot, you will need to make several lookups, so you should choose a data structure that makes this as fast as you need (I am thinking of hash maps).
Using two collections will let you find the cats in your home as fast as possible. However, syncing the collections might be a problem. In that case you can consider the observer pattern: the home observes the world, and if a cat dies, it checks whether that cat was inside the home and removes it.
There are many ways to do what you asked. All you need to do is think about which operations are most frequent and about your needs in general. If the collections are small, there is no problem with having one collection and using lookups to find a home's cats.

If Cat needs to be an aggregate root of its own aggregate accessible by a CatRepository, then it should not be included in the Home aggregate. Your Home entity should reference the associated Cat entities by identity and not by reference.
This is a question of aggregate design and requires a closer look at your domain and how you need to use your entities. One question you could ask yourself is "if I delete Home, should all Cat entities be deleted as well?" Do not put too much emphasis on this question. There are other important factors that need to be considered.
Vaughn Vernon covers this topic in his three-part PDF series Effective Aggregate Design.

It's hard to answer DDD questions about fictional domains (or at least with very little information about them), since DDD is all about modeling the domain just the way it is and maintaining the integrity of its invariants.
As far as I can tell, you do not have any invariant that applies to the relationship between Home and Cat, therefore you do not need a collection of Cat objects within Home's consistency boundary. You could simply have two aggregate roots (Home and Cat) and Cat would reference its Home by identity.
You can use the CatRepository to query all cats from a specific home or all cats independently of their home.
It's important not to put artificial constraints in the model. For instance, if you hold a collection of Cat identities inside Home for no other reason than maintaining the relationship, then you are limiting the scalability of your system: two simultaneous transactions trying to associate a new Cat to the same Home will fail with a concurrency exception for no reason (assuming optimistic concurrency).
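The alternative described in this answer can be sketched as follows. The repository interface and its in-memory implementation are hypothetical, written only to show the shape of the queries; ids are plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Cat aggregate root: it references its Home by identity only.
final class Cat {
    private final String catId;
    private final String homeId;

    Cat(String catId, String homeId) {
        this.catId = catId;
        this.homeId = homeId;
    }

    String getCatId() { return catId; }
    String getHomeId() { return homeId; }
}

// Hypothetical repository contract covering both queries from the answer.
interface CatRepository {
    void save(Cat cat);
    List<Cat> findAll();
    List<Cat> findByHomeId(String homeId);
}

// Simple in-memory implementation, for illustration only.
final class InMemoryCatRepository implements CatRepository {
    private final List<Cat> cats = new ArrayList<>();

    @Override
    public void save(Cat cat) {
        cats.add(cat);
    }

    @Override
    public List<Cat> findAll() {
        return new ArrayList<>(cats);
    }

    @Override
    public List<Cat> findByHomeId(String homeId) {
        List<Cat> result = new ArrayList<>();
        for (Cat cat : cats) {
            if (homeId.equals(cat.getHomeId())) {
                result.add(cat);
            }
        }
        return result;
    }
}
```

Because Home holds no collection, associating a new Cat with a Home touches only the Cat, so two such transactions do not contend on the Home.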

Why to use Hibernate Mapping Component?

I am learning Hibernate and I came across Hibernate Mapping Component.
Why should we use it if we can have the same POJO class for student and address?

You can. But that doesn't mean you want to.
Reason one: you want to model them differently
In objects, you want to model something in the best possible way. That means Students are one thing and Addresses are another. In the future you could have more than one Address per student, or none, so migrating to that model will be easier if you have two different objects.
Think of it as high cohesion and low coupling (good design principles). Each class has its meaning, its responsibility, its limited range of action. The more isolated classes are, the more localized changes will be. The more modular your code will be, too.
By contrast, in tables you make concessions in order to gain performance and more direct queries. That means you can denormalize your model (like joining students and addresses).
Reason two: legacy models
For example, if you have a legacy single table and want to use two objects, you need this mapping. Or your application is already built around two objects, but your database is reengineered and you decide one table is better.

One more point is that Address (which is treated as component here) cannot have its own primary key, it uses the primary key of the enclosing Student entity.

How to map this tricky entity/relationship model in Java?

I have a slightly confusing many-to-many relationship between three entities, and I want to know what my object model should look like. I have three entities, A, B and C. A and B have an M:N relationship, and the association table between A and B is linked to another association table, which forms a 1:n relationship with the third entity. I have never seen an association table that has a 1:n relationship with another association table. For further information, please have a look at the following diagram.
Uploaded Image link
In terms of the object model, I would say that an "INSTANCE_A" has many "INSTANCE_B" instances and vice versa, but I do not know how to express the relationship for "INSTANCE_C".
Please also let me know whether the definition of this relationship between all three entities is right; I mean, is there any problem in the relationship design?
Thanks in advance
EDIT: All arrows denote (1:n or m:1) relationship

The data model is correct, but the object model for these tables can be kind of tricky. I'd do something like this:
One class for TBL_A, with a List attribute of TBL_B
One class for TBL_B, with a List attribute of TBL_A
One class for TBL_C_TBL_A_B, with an attribute for each of TBL_B, TBL_A and TBL_C
Mapping that in an ORM framework can get funky.
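The three classes listed above might look roughly like this in plain Java. This is only a sketch of the object model: the class names are placeholders for the TBL_* tables, no ORM annotations are shown, and the link helper is an assumption added to keep both sides of the A/B relationship consistent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Class for TBL_A, holding the B side of the many-to-many relationship.
final class EntityA {
    private final List<EntityB> linkedBs = new ArrayList<>();

    // Registers the relationship on both sides in one call.
    void link(EntityB b) {
        linkedBs.add(b);
        b.addA(this);
    }

    List<EntityB> getBs() { return Collections.unmodifiableList(linkedBs); }
}

// Class for TBL_B, holding the A side of the many-to-many relationship.
final class EntityB {
    private final List<EntityA> linkedAs = new ArrayList<>();

    // Package-private: intended to be called from EntityA.link only.
    void addA(EntityA a) { linkedAs.add(a); }

    List<EntityA> getAs() { return Collections.unmodifiableList(linkedAs); }
}

// Class for TBL_C.
final class EntityC {
}

// Class for TBL_C_TBL_A_B: one instance ties a specific (A, B) pair to a C.
final class CForAB {
    private final EntityA a;
    private final EntityB b;
    private final EntityC c;

    CForAB(EntityA a, EntityB b, EntityC c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    EntityA getA() { return a; }
    EntityB getB() { return b; }
    EntityC getC() { return c; }
}
```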

This should point you in the right direction. Try to design a UML diagram, or an ER diagram should be OK too. Here is a document with a model and the corresponding Java code for it: http://www.csd.uoc.gr/~hy252/references/UML_for_Java_Programmers-Book.pdf (go to the Class diagrams chapter).
