I have two aggregates.
Person {
    private PersonID personID;
    private NodeID nodeID; // the node this person belongs to
}
Node {
    private NodeID nodeID;  // this node's id
    private NodeID parent;  // parent node, referenced by id
    public void assign(Person person);
}
Now I have domain logic for my person assigning service:
Person can be assigned to node "X" only if he belongs to a node "Y" that is the parent, grandparent, great-grandparent, ... (i.e., any ancestor) of node "X".
To find that out I would need to query the read model.
But I am in the domain layer, so I can't just use my read model for this query.
I also don't think I can simply add a read-model connection to my repository, since the repository is connected to my event store. Especially since the read model can be placed on another server and be a separate application.
What is the proper way to implement it?
The following is a constraint:
Person can be assigned to node "X" only if he belongs to a node "Y"
that is the parent, grandparent, great-grandparent, ... (i.e., any ancestor)
of node "X".
If it is a constraint that must be enforced, you can model the hierarchy in a separate aggregate on the write side (e.g., a Graph) whose sole purpose is ensuring integrity (see the sketch after the list below).
The proper way to do this is to support ancestry checks in the command model. This is where you want to enforce the invariant, so the model needs to support this.
Tree structures often lead to performance problems if you need to be able to make unbounded ancestry checks. So you probably need to implement a performance optimization that speeds up these kinds of queries.
I see the following possibilities:
Use a data store that directly supports the queries you need. This may be difficult if you want to do ES.
Use snapshotting. This may or may not be feasible depending on your tree structures.
Use caching. This is similar to snapshotting, but stores the information in a cache instead of in the event store.
Use the read model. Be sure you understand the consequences, especially the asynchronous data propagation and the increased complexity. I'd only suggest this as a last resort, but YMMV.
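A minimal sketch of what such a write-side hierarchy aggregate could look like (NodeHierarchy and the NodeID record here are assumed names, not from the question): it keeps a child-to-parent map and can answer the unbounded ancestry check without touching the read model.

import java.util.HashMap;
import java.util.Map;

// Hypothetical write-side aggregate that owns the node hierarchy
// and exists solely to enforce the ancestry constraint.
public class NodeHierarchy {

    // child -> parent; root nodes have no entry
    private final Map<NodeID, NodeID> parentOf = new HashMap<>();

    public void addNode(NodeID node, NodeID parent) {
        parentOf.put(node, parent);
    }

    // True if 'candidate' appears anywhere on the path from 'node' up to the root.
    public boolean isAncestor(NodeID candidate, NodeID node) {
        NodeID current = parentOf.get(node);
        while (current != null) {
            if (current.equals(candidate)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    // A person may be assigned to 'target' only if the node the person
    // currently belongs to is an ancestor of 'target'.
    public void checkAssignment(NodeID personsNode, NodeID target) {
        if (!isAncestor(personsNode, target)) {
            throw new IllegalStateException(
                "Person's current node is not an ancestor of the target node");
        }
    }
}

// Assumed simple value type for the example.
record NodeID(String value) {}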
Related
I've seen some DDD projects with value object representations of entities.
They usually appear like EmployeeDetail, EmployeeDescriptor, EmployeeRecord, etc. Sometimes it holds the entity ID, sometimes not.
Is that a pattern? If yes, does it have a name?
What are the use cases?
Are they value objects, parameter objects, or anything else?
Are they referenced in the domain model (as a property) or are they "floating", appearing only as parameters and return values of methods?
Going beyond...
I wonder if I can define any aggregate as an ID + BODY (detail, descriptor, etc) + METHODS (behavior).
public class Employee {
    private EmployeeID id;
    private EmployeeDetail detail; // the "body"
}
Could I design my aggregates like this to avoid code duplication when using this kind of object?
The immediate advantage of doing this is to avoid those methods with too many parameters in the aggregate factory method.
public class Employee {
    ...
    public static Employee from(EmployeeID id, EmployeeDetail detail) {...}
}
instead of
public class Employee {
    ...
    public static Employee from(EmployeeID id, /* + 10 value objects here */) {...}
}
What do you think?
What you're proposing is the idiomatic (via case classes) approach to modeling an aggregate in Scala: you have an ID essentially pointing to a mutable container of an immutable object graph representing the state (and likely some static functions for defining the state transitions). You are moving away from the more traditional OOP conceptions of domain-driven design to the more FP conceptions (come to the dark side... ;) ).
If doing this, you'll typically want to partition the state so that operations on the aggregate change multiple branches of the state as rarely as possible, which enables reuse of as much of the previous object graph as possible.
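For illustration, here is a minimal Java sketch of that shape (all type names are made up for the example): the aggregate is an id plus a mutable reference to an immutable state graph, and each operation swaps in a new state while reusing the branches it didn't touch.

// All names are hypothetical; this only shows the shape of the approach.
record EmployeeId(String value) {}
record JobInfo(String title, String department) {}
record Address(String city, String street) {}
record ContactInfo(String email, Address address) {
    ContactInfo withAddress(Address newAddress) {
        return new ContactInfo(email, newAddress);
    }
}
record EmployeeState(JobInfo job, ContactInfo contact) {}

public final class Employee {
    private final EmployeeId id;
    private EmployeeState state; // mutable reference to an immutable graph

    public Employee(EmployeeId id, EmployeeState initial) {
        this.id = id;
        this.state = initial;
    }

    public void relocate(Address newAddress) {
        // only the contact branch is rebuilt; the job branch is reused as-is
        state = new EmployeeState(state.job(), state.contact().withAddress(newAddress));
    }
}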
Could I design my aggregates like this to avoid code duplication when using this kind of object?
What you are proposing is representing the entire entity except its id as a 'bulky' value object. A concept or object's place in your domain (finding that involves defining your bounded contexts and their ubiquitous languages) dictates whether it is treated as a value object or an entity, not coding convenience.
However, if you adopt your scheme as a general principle, you risk tangling unrelated data into a single value object. That leads to many conceptual and technical difficulties. Take updating an entity, for example. Entities are designed to evolve over their lifecycle in response to operations performed on them. Each operation updates only the relevant properties of an entity. With your solution, for every operation you have to construct a new value object as a replacement (since value objects are defined to be immutable), potentially copying a lot of data that is irrelevant to the operation.
The examples you are citing are most likely entities with only one value object attribute.
OK - great question...
DDD Question Answered
The difference between an entity object and a value object comes down to perspective - and needs for the given situation.
Let's take a simple example...
An airplane flight to your favourite destination has...
Seats 1A, 10B, 21C available for you to book (entities)
3 of 22 Seats available (value object).
The first reflects individually identifiable seat entities that could be filled.
The second reflects that there are 3 seats available (value object).
With value object you are not concerned with which individual entities (seats) are available - just the total number.
It's not difficult to understand that it depends on who's asking and how much it matters.
On some flights you book a specific seat, and on others you just book a (any) seat on the plane.
General
Ask yourself a question! Do I care about the individual element or the totality?
NB. An entity (the plane) can treat seats as entities and/or value objects, depending on the use case. Also worth noting, the choice can differ within the same aggregate: cockpit seats are more likely to be entity seats, and passenger seats value objects.
I'm pretty sure I want the pilot seat to hold a qualified pilot, and the co-pilot seat a qualified co-pilot, but I don't really care that much where the passengers sit. Well, except that I want to make sure the emergency exit seats are taken by passengers suitable to help evacuate the plane in an emergency.
No simple answer, but a complex set of pieces to think about, and to consider for each situation and level of domain complexity.
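If it helps, the same seat concept expressed both ways in code (purely illustrative):

// As an entity: each seat has an identity ("10B") and its own lifecycle.
class Seat {
    private final String seatNumber; // identity
    private boolean booked;

    Seat(String seatNumber) {
        this.seatNumber = seatNumber;
    }

    void book() {
        booked = true;
    }
}

// As a value object: only the totals matter; two equal counts are interchangeable.
record SeatAvailability(int available, int total) {}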
Hope that explains some bits, happy to answer follow-up questions...
Short version
Why would we ever need factories (injected in the application layer) in DDD if no aggregate ever emerges out of thin air, and letting one do so would indicate at least an error in the modeling of the business?
Long version
There is a popular DDD example, the e-commerce application, consisting of the following aggregates and entities (oversimplified):
Modeled as
class Customer {
    private CustomerId id;
    // related business rules and processes
}

class Order {
    private OrderId id;
    private List<OrderLine> orderLines;
    // related business rules and processes
}

class OrderLine {
    private OrderLineId id;
    private int quantity;
    private ProductId product;
    // related business rules and processes
}

class Product {}

// etc...
And it's well established that the creation of the order is done through a factory, usually like:
Order order = orderFactory.createNewOrder(customer);
However I'm arguing that this model is not very clear since I assume the original (made up) requirement is
Customers can place orders.
So doesn't it make more sense to delegate the creation of the order to the Customer aggregate and make the code more expressive? i.e.:
Order order = customer.placeOrder(...);
// Pass the data needed for the creation of the object, or even the factory service if the creation is complex
In my opinion, expanding this view would mean that the actors of the system are aggregates most of the time and that they contain all the invocations of the use cases (which has the side effect of making the application layer very thin as well).
Does this second approach violate DDD? An aggregate being responsible for the creation of another aggregate doesn't feel right, but it produces better code that, in my opinion, matches the domain better.
Does this second approach violate DDD
No. The patterns described by Evans in the Domain Driven Design book should be understood as "useful ideas that recur" rather than "these patterns are mandatory".
And you will find support in the literature for the idea that, when we are modeling the creation of aggregates, we should be using the domain language, not factories. For example: Don't Create Aggregate Roots (Udi Dahan, 2009).
That said.... when Evans describes the FACTORY pattern in his book, he does so in the context of life cycle management, not modeling. In other words, factories are cousins to repositories and aggregates, not domain entities and value objects.
Shift the responsibility for creating instances of complex objects and AGGREGATES to a separate object, which may itself have no responsibility in the domain model but is still a part of the domain design.
In other words, we might still want to use Customer::placeOrder in our domain model, but to have that method delegate the object assembly to a dedicated factory.
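A rough sketch of that combination, reusing the names from the question (the exact parameters of placeOrder and createNewOrder are invented here): the ubiquitous language lives on Customer, while the object assembly lives in the factory.

import java.util.List;

// Assumed factory interface; the question only shows orderFactory.createNewOrder(customer).
interface OrderFactory {
    Order createNewOrder(CustomerId customerId, List<OrderLine> lines);
}

class Customer {
    private CustomerId id;

    // Domain language stays on the aggregate...
    public Order placeOrder(List<OrderLine> lines, OrderFactory orderFactory) {
        // ...customer-side invariants (account status, credit limit, ...) are checked here,
        // while the mechanical assembly of the Order is delegated to the factory.
        return orderFactory.createNewOrder(this.id, lines);
    }
}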
Of course, object creation is not the only place that we use the factory pattern; it can also appear in object reconstitution. A common REPOSITORY pattern is to fetch information from the durable data store, and then pass that information to a FACTORY to arrange that information into the appropriate shape - aka the graph of objects that make up the AGGREGATE.
I understand the factory pattern as an example of information hiding, the factory limits the blast radius when we decide to change how a fixed set of information is assembled into an aggregate.
I am currently working on a product that works with Hibernate (HQL) and another one that works with JPQL. As much as I like the concept of the mapping from a relational structure (database) to an object (Java class), I am not convinced of the performance.
EXAMPLE:
Java:
public class Person {
    private String name;
    private int age;
    private char sex;
    private List<Person> children;
    //...
}
I want to get the age attribute of a certain Person, a person with 10 children (he has been very busy). With Hibernate or JPQL you would retrieve the person as an object.
HQL:
SELECT p
FROM my.package.Person as p
WHERE p.name = 'Hazaart'
Not only will I be retrieving the other attributes of the person that I don't need, it will also retrieve all the children of that person and their attributes. And they might have children as well and so on... This would mean more tables would be accessed on database level than needed.
Conclusion:
I understand the advantages of object-relational mapping. However, it would seem that in a lot of cases you will not need every attribute of a certain object, especially in a complex system. It would seem like the advantages do not nearly justify the performance loss. I've always learned that performance should be a main concern.
Can anyone please share their opinion? Maybe I am looking at it the wrong way, maybe I am using it the wrong way...
I'm not familiar with JPQL, but if you set up Hibernate correctly, it will not automatically fetch the children. Instead it will return a proxy list, which fetches the missing data transparently if it is accessed.
This will also work with simple references to other persistent objects. Hibernate will create a proxy object, containing only the ID, and load the actual data only if it is accessed. ("lazy loading")
This of course has some limitations (for example with persistent class hierarchies), but overall it works pretty well.
BTW, you should declare the children as List<Person> (the interface) rather than a concrete implementation; I'm not sure Hibernate can substitute its proxy list if you specify a specific implementation class.
Update:
In the example above, Hibernate will load the attributes name, age and sex, and will create a List<Person> proxy object that initially contains no data.
Once the application calls any method of the List that requires knowledge of the data, like children.size(), or iterates over the list, the proxy will call Hibernate to read the children objects and populate the List. The children objects, being instances of Person, will also contain a proxy List<Person> of their children.
There are some optimizations Hibernate might perform in the background, like loading the children of other Person objects in the same session at the same time, since it is querying the database anyway. But whether this is done, and to what extent, is configurable per attribute.
You can also tell Hibernate never to use lazy loading for certain references or classes, if you are sure you'll need them later, or if you continue to use the persistent object once the session is closed.
Be aware that lazy loading will of course fail if the session is no longer active. If, for example, you load a Person object, don't access the children list, and close the session, a later call to children.size() will fail.
IIRC the Hibernate Session class has a method to populate all not-yet-loaded references in a persistent object, if needed.
Best read the Hibernate documentation on how to configure all this.
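For reference, a minimal mapping sketch using standard JPA annotations (javax.persistence here; newer versions use jakarta.persistence). With @OneToMany the fetch type defaults to LAZY, so loading a Person does not load its children until the collection is accessed.

import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity
public class Person {

    @Id
    @GeneratedValue
    private Long id;

    private String name;
    private int age;
    private char sex;

    // LAZY is the default for @OneToMany; spelled out here for clarity.
    // Declare the collection as the List interface so Hibernate can substitute its proxy.
    @OneToMany(fetch = FetchType.LAZY)
    private List<Person> children;
}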
One of the key benefits of NoSQL data stores like MongoDB is that they're schemaless. With dynamically typed languages this seems to be a natural fit: you can receive some arbitrary JSON input, perform business logic on the known fields, and persist the whole thing without first having to define the object.
What if your choice of language is limited to statically typed ones, say Java? How could I achieve the same level of flexibility?
A typical data flow like the following:
JSON Input
Deserialize into a Java object to perform business logic
Serialize into BSON to persist in Mongo
where the mapping-to-object step is necessary since you want to perform business logic with POJOs, not JSON strings. However, before I can map the input onto objects, I must define them first. What if the input contains additional fields undefined in the object? While they may not be used in the business logic, I may still want to be able to persist them. I have seen implementations where the undefined fields are put into a map, but I am not sure that's the best approach. For one, the undefined fields may be complex objects as well.
Schemaless data doesn't necessarily mean structureless data; the fields are typically known in advance, and some type-safe pattern can be applied on top of them to avoid the Magic Container anti-pattern. But this is not always the case. Sometimes keys are entered by the user and cannot be known in advance.
I've used the Role Object Pattern several times to give coherence to a dynamic structure. I think it is well suited here for both cases.
The Role Object Pattern defines a way to access different views of an object. The canonical example being a User that can assume several roles such as Customer, Vendor, and Seller. Each of these views has different operations it can perform and can be accessed from any of the other views. Common fields are typically available at the interface level (especially userId(), or in your case toJson()).
Here's an example of using the pattern:
public void displayPage(User user) {
    display(user.getName());
    if (user.hasView(Customer.class))
        displayShoppingCart(user.getView(Customer.class));
    if (user.hasView(Seller.class))
        displayProducts(user.getView(Seller.class));
}
In the case of data with a known structure, you can have several views bringing different sets of keys into cohesive units. These different views can read the json data on construction.
In the case of data with a dynamic structure, an authoritative RawDataView can hold the data in its dynamic form (i.e., a Magic Container like a HashMap<String, Object>). This can be used to query the dynamic data. At the same time, type-safe wrappers can be created lazily and can delegate to the RawDataView to assist program readability/maintainability:
public class Customer implements User {
    private final RawDataView data;

    public Customer(UserView source) {
        this.data = source.getView(RawDataView.class);
    }

    // All User views must specify this
    @Override
    public long id() {
        return data.getId();
    }

    @Override
    public <T extends UserView> T getView(Class<T> view) {
        // construct or look up the view
    }

    @Override
    public Json toJson() {
        return data.toJson();
    }

    //
    // Specific to Customer
    //
    public List<Item> shoppingCart() {
        return (List<Item>) data.getValue("items", List.class);
    }

    // etc....
}
I've had success with both of these approaches. Here are some extra pointers that I've discovered along the way:
Have a static structure to your data as much as possible. This makes things a lot easier to maintain. I had to break this rule and use the RawDataView approach when working on a legacy system. You may also have to break it with dynamically entered user data, as mentioned above. In that case, use a convention for non-dynamic field names, such as a leading underscore (_userId).
Have equals() and hashCode() implemented such that user.getView(A.class).equals(user.getView(B.class)) is always true for the same user.
Have a UserCore class that does all the heavy lifting of common code, such as creating views, performing common operations (like toJson()), returning common fields (like userId()), and implementing equals() and hashCode(). Have all views delegate to this core object.
Have an AbstractUserView that delegates to the UserCore and implements equals() and hashCode().
Use a type-safe heterogeneous container (like ClassToInstanceMap) for constructing/caching views.
Allow the existence of a view to be queried. This can be done with either a hasView() method or by having getView return Optional<T>
You can always have a class which provides both:
easy access to attributes you know about, with optional fallbacks to older formats (for example, it can return "name" if it exists, or fall back to "name.first" + "name.last" if it doesn't, or some similar scenario)
easy access to unknown elements by simulating the map interface
Whether you do a full validation or not, whether you allow extra undefined attributes or not depends on what you want to achieve. But I think that creating an abstraction which allows you either way of accessing the data is the best solution.
Hopefully over time, you'll get to the stage where your schema is pretty much stable and messing directly with the attributes is not needed anymore.
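A small sketch of such an abstraction (the class and method names are invented for the example): typed accessors with fallbacks for the attributes you know about, plus map-style access to everything else.

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class FlexibleDocument {

    private final Map<String, Object> raw;

    public FlexibleDocument(Map<String, Object> raw) {
        this.raw = new HashMap<>(raw);
    }

    // Known attribute with a fallback to an older format.
    public Optional<String> name() {
        Object name = raw.get("name");
        if (name instanceof String) {
            return Optional.of((String) name);
        }
        Object first = raw.get("name.first");
        Object last = raw.get("name.last");
        if (first != null && last != null) {
            return Optional.of(first + " " + last);
        }
        return Optional.empty();
    }

    // Unknown attributes stay reachable and can be persisted untouched.
    public Optional<Object> attribute(String key) {
        return Optional.ofNullable(raw.get(key));
    }

    // Hand the whole document back for persistence.
    public Map<String, Object> asMap() {
        return new HashMap<>(raw);
    }
}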
This is not well solved in Java due to the lack of dynamic types. One way this can be solved is using Maps.
Map
The object can again be a Map of objects.
This is not an elegant approach, but it works in Java. For example, the SnakeYaml library for YAML allows traversal in this way.
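For illustration, a tiny hypothetical helper for walking the kind of nested Map structure that SnakeYaml's load(), or a BSON document converted to a map, gives you:

import java.util.Map;

public class MapTraversal {

    // Follows a chain of keys through nested maps; returns null if the path is absent.
    @SuppressWarnings("unchecked")
    public static Object path(Map<String, Object> root, String... keys) {
        Object current = root;
        for (String key : keys) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("address", Map.of("city", "Oslo"));
        System.out.println(path(doc, "address", "city")); // prints "Oslo"
    }
}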
Let's say you have a Client and a Server that want to share/synchronize the same Models/Objects. The models point to each other, and you want them to keep pointing at the same objects after being sent/serialized between the client and the server. My current solution roughly looks like this:
class Person {
    static Map<Integer, Person> allPeople;
    int myDogId;

    static Person getPerson(int key) {
        return allPeople.get(key);
    }

    Dog getMyDog() {
        return Dog.getDog(myDogId);
    }
}

class Dog {
    static Map<Integer, Dog> allDogs;
    int myOwnersId;

    static Dog getDog(int key) {
        return allDogs.get(key);
    }

    Person getMyOwner() {
        return Person.getPerson(myOwnersId);
    }
}
But I'm not too satisfied with this solution, with the fields being plain integers and so on. This should also be a pretty common problem. So what I'm looking for here is a name for this problem, a pattern, a common solution, or a library/framework.
There are two issues here.
Are you replicating the data in the Client and the Server (if so, why?), or does one, the other, or a database agent hold the Model?
How does each agent access (its/the) model?
If the model is only held by one agent (Client, Server, Database), then the other agents need a way to remotely query the model (e.g., object enumerators, getters and setters for various fields) operating on abstract model entities (e.g., model element identifiers, which might be implemented as integers as you have done).
Regardless of who holds the model (one or all), each model can be implemented naturally. The normal implementation has each object simply refer to other objects using normal object references, as if you had coded this without any thought of sharing between agents, unlike what you did.
You can associate an objectid with each object, as you have, but your application code doesn't need to use it; it is only necessary when referencing a remote copy of the model. Whether this objectid is associated with each object as a special field, kept in a hash table, or computed on the fly is just an implementation detail.
One way to handle this is to compute the objectid on the fly. You can do this if there is a canonical spanning tree over the entire model. In that case, the objectid is "just" the path from the root of the spanning tree to the location of the object. If you don't have a spanning tree, or it is too expensive to compute, you can assign objectids as objects are created.
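To make the spanning-tree idea concrete, a small hypothetical sketch: the objectid is the list of child indices from the root, so either side can resolve it by walking the same tree.

import java.util.List;

// Any model object that exposes its children in a canonical order.
interface ModelNode {
    List<ModelNode> children();
}

final class PathIds {

    // Resolve a path id such as [2, 0, 5] by walking down from the root.
    static ModelNode resolve(ModelNode root, int... path) {
        ModelNode current = root;
        for (int index : path) {
            current = current.children().get(index);
        }
        return current;
    }
}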
The real problem with a duplicated, distributed model such as you have suggested is keeping it up to date while both agents update it. How do you prevent one agent from creating an object (and assigning an objectid) at the same time as the other, with the objects being created being different but given the same objectid, or being the same but given different objectids? You'll need remote locking and signalling to keep the models in sync (this is the same problem as "cache coherency" for multiple CPUs; just think of each object as acting like a cache line). The way it is generally solved is to designate who holds the master copy (perhaps of the entire model, perhaps of individual objects within the model) and then issue queries, reads, reads-with-intent-to-modify, or writes to ensure that the "unique" entire model gets updated.
The only solution I am aware of is to send the complete structure, i.e. the Dogs and Persons, over the network. Then they will end up pointing to the correct copies on the other side of the network. The implementation of this solution, however, depends on a lot of circumstances. For example, when your inclusion relation defines a tree, you can approach this problem differently than if it is a graph with cycles.
Have a look at this for more information.
I guess one can use the proxy pattern for this.
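For instance (every name besides Dog is hypothetical here), a proxy could hold only the id and resolve the real object lazily through whatever remote lookup the client has:

// Assumed remote lookup facade; not part of the original code.
interface DogService {
    Dog fetchDog(int dogId);
}

final class RemoteDogProxy {

    private final int dogId;
    private final DogService service;
    private Dog cached;

    RemoteDogProxy(int dogId, DogService service) {
        this.dogId = dogId;
        this.service = service;
    }

    // Resolve on first use, then reuse the cached instance.
    Dog get() {
        if (cached == null) {
            cached = service.fetchDog(dogId);
        }
        return cached;
    }
}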