Should my DAOs (Database Entities) Directly match my UI Objects? - java

I am trying to figure out best practice for N-tier application design. When designing the objects my UI needs and those that will be persisted in the DB, some of my colleagues are suggesting that the objects be one and the same. This doesn't feel right to me, and I am ultimately looking for some best practice documentation to help me in this decision.
EDIT:
Let me clarify this by saying that the tables (entity classes) in the DB are identical to the objects used in the UI.
I honestly do not understand why I would want to design this way, given that other applications may want to interact with my Data Access Layer... or maybe it is just ignorance or a lack of understanding on my part.
Any documentation or information you could provide would be greatly appreciated. I just want to better understand these concepts, and I am having a hard time finding good information on best practices for implementing these patterns (or it is right in front of me in what I found and I didn't understand what was being outlined).
Thanks,
S

First of all, DAOs and database entities are two very different things.
Now to the question. You're right. The database entities are mapped to a database schema, and this schema should follow database design best practices and be normalized. The UI sometimes displays exactly the information from a given entity, but often shows data that comes from multiple entities in an aggregate format, or, to the contrary, only a small part of a given entity.
For example, it would make sense for a UI to show a product name, description and price along with the name of its category, along with the number of remaining items in stock, along with the manufacturer of the product. It would make no sense to have a persistent entity containing all those fields.
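For instance, here is a hedged sketch of such a UI-facing object (all names are invented) that aggregates data from several persistent entities without being mapped to any table itself:

import java.math.BigDecimal;

// A view object assembled for the UI from several entities; it is not a
// persistent entity and has no table of its own.
public class ProductSummaryView {
    private final String productName;       // from the Product entity
    private final String description;       // from the Product entity
    private final BigDecimal price;         // from the Product entity
    private final String categoryName;      // from the Category entity
    private final int itemsInStock;         // from the Inventory entity
    private final String manufacturerName;  // from the Manufacturer entity

    public ProductSummaryView(String productName, String description, BigDecimal price,
                              String categoryName, int itemsInStock, String manufacturerName) {
        this.productName = productName;
        this.description = description;
        this.price = price;
        this.categoryName = categoryName;
        this.itemsInStock = itemsInStock;
        this.manufacturerName = manufacturerName;
    }
    // getters omitted for brevity
}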

In general and according to most "best practices" comments, yes, those two layers should be decoupled and there should be separate objects.
BUT: if your mapping is only a one-to-one mapping without any further functionality in the non-database object, why introduce an additional object? So, it depends (as usual ;-) ).
Don't use additional objects if the introduced overhead is bigger than the gain, and don't couple the two layers if re-usability is a first-class goal. That may not be the case with some legacy applications, for example.
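To make the trade-off concrete, here is a hedged sketch of the one-to-one mapping overhead being discussed; all class and field names are invented:

// A persistent entity and a UI object carrying exactly the same fields,
// plus the copy code needed to move data between them.
class CustomerEntity { Long id; String name; String email; }
class CustomerDto    { Long id; String name; String email; }

final class CustomerMapper {
    CustomerDto toDto(CustomerEntity e) {
        CustomerDto d = new CustomerDto();
        d.id = e.id; d.name = e.name; d.email = e.email;
        return d;
    }
    CustomerEntity toEntity(CustomerDto d) {
        CustomerEntity e = new CustomerEntity();
        e.id = d.id; e.name = d.name; e.email = d.email;
        return e;
    }
}

If this mapping never diverges and never gains behavior, it is pure overhead; if the UI shape and the persistent shape drift apart, it is exactly the decoupling point you want.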

Related

DDD structure example

I am trying to structure an application using DDD and onion/hexagonal/clean architecture (using Java and Spring). I find it easier to find guidance on the concepts themselves than on how to actually implement them. DDD in particular seems tricky to find instructive examples for, because each problem is unique. I have seen numerous examples on SO that have been helpful, but I still have questions. I wonder whether going through my example would help me and anyone else.
I hope you can forgive me for asking more than one question here. The example seems too big for it to make sense to repeat it across multiple questions.
Context:
We have an application that should display information about soccer stats and has the following concepts (for simplicity I have not included all attributes):
Team, which has many Players.
Player.
Fixture, which has 2 Teams and 2 Halves.
Half, which has 2 FormationsPlayed and many Combinations.
FormationPlayed, which has many PositionsPlayed.
PositionPlayed, which has 1 Player and a position value object.
Combination, which can be of 2 types, and has many Moves.
Move, which can be of 2 types, has 1 Player and an event value object.
As you can imagine, trying to work out which things are aggregate roots here is tricky.
Team can exist independently so is an AR.
Player can exist independently so is an AR.
Fixture, when deleted, must also delete its Halves, so is an AR.
Half must be an entity in Fixture.
FormationPlayed must be deleted when a half is deleted, so perhaps this should be an entity in Half.
PositionPlayed must be deleted when a Formation is deleted, so I believe this should be an entity in FormationPlayed.
Combination in a sense can exist independently, though is tied to a particular game half. Perhaps this could be an AR tied by eventual consistency.
Move must be deleted when a Combination is deleted, so I believe this should be an entity in Combination.
Questions:
Do you see any errors in the above design? If so what would you change?
The Fixture - Half - FormationPlayed - PositionPlayed aggregate seems too large, so I wonder whether you would agree that this could be split into Fixture - Half and FormationPlayed - PositionPlayed using eventual consistency. The thing I can't find an example of is how this is implemented in Java? If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture). My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Initially the aim is only to allow the client to get the data and display it. Ultimately I want clients to be able to perform CRUD themselves, and I want all invariants to be upheld by the domain model when that happens. Would it simplify things (and can you show me, or point me to, an example explaining how) to have two domain models: one simple model for data retrieval and one rich model for the operations performed later? Two BCs, as it were. The reason I ask is that a rich domain model seems rather time-consuming to come up with when initially we only want to display stats from the database, but I also don't want to create trouble for myself down the line if it is better to create one rich domain model now in view of the use cases envisioned later. I wonder, if I were to create a simpler model for data retrieval only, which concepts in DDD could be ignored (would I still need to break up large aggregates, for example?)
I hope this all makes sense. Obviously happy to explain further if needed. Realise I'm asking a lot here and I may have confused some ideas. Any answers and wisdom you can give to this would be greatly appreciated !
Do you see any errors in the above design? If so what would you change?
There might be a big one: is your system the book of record, or is it just keeping track of events that happen in the "real world"? In a sense, the point of aggregates is to ensure that the book of record is internally consistent, but if you aren't the book of record....
For an example of what I mean
http://www.soccerstats.com/ -- the book of record is the real world.
https://www.easports.com/fifa -- the games are played in the computer
If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
Udi Dahan wrote: Don't Delete, Just Don't. If an entity has a lifecycle, and that lifecycle has an end, then you mark it, but you don't remove the entity.
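A hedged sketch of what "mark it, don't remove it" might look like for the Fixture in this example (the status values are invented, and the Teams and Halves from the model above are omitted for brevity):

// Instead of deleting a Fixture, record that its lifecycle has ended.
// Downstream views simply filter on the status.
public class Fixture {

    public enum Status { SCHEDULED, PLAYED, CANCELLED }

    private final String fixtureId;
    private Status status = Status.SCHEDULED;

    public Fixture(String fixtureId) {
        this.fixtureId = fixtureId;
    }

    public void cancel() {
        // the entity stays in the store; we only mark the end of its lifecycle
        this.status = Status.CANCELLED;
    }

    public boolean isActive() {
        return status != Status.CANCELLED;
    }

    public String getFixtureId() {
        return fixtureId;
    }
}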
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture)
Great! Be warned, a lot of the examples that you will find online don't get this part right -- for historical reasons, many demonstrations of model are tightly coupled to the side effects that they have on persistence.
My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Ah -- OK, this one is fun. Don't confuse surrogate keys used in the persistence layer with identifiers in the domain model. For instance, when I look at my purchasing history on Amazon, each of my orders (presumably an aggregate) has an ORDER # associated with it. That would imply that the domain level knows about OrderNumber as a value type. The persistence solution in the back end might introduce surrogate keys when storing that data, but those keys are not used by the model.
Note that I've chosen an example where the aggregate is clearly the authority -- the order only really exists within the model. When the real world is the book of record, you often don't have a unique identifier available (what is Lionel Messi's PlayerId?)
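As an illustration of that distinction, here is a hedged sketch of an identifier modeled as a domain-level value type (the class name is invented; the persistence layer could still use its own surrogate keys underneath):

import java.util.Objects;

// A domain-level identifier modeled as a value object, part of the ubiquitous
// language (like an Amazon order number), independent of whatever surrogate
// key the persistence layer uses internally.
public final class OrderNumber {

    private final String value;

    public OrderNumber(String value) {
        this.value = Objects.requireNonNull(value, "order number is required");
    }

    public String value() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof OrderNumber && value.equals(((OrderNumber) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }

    @Override
    public String toString() {
        return value;
    }
}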
The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database
A couple of thoughts on this -- DDD is usually saved for more complicated use cases (Greg Young: "is this where you get a competitive advantage?"). Most of the power of aggregates comes from the fact that they ensure the consistency of changes of state. When your real problem is data entry and reporting, it tends to be overkill.
Detection and remediation of inconsistencies is often easier/cheaper than trying to get prevention right; and may be satisfactory to the business, given the costs. Something to keep in mind.
The application is keeping track of events in the real world. At the moment, they are recorded manually in a database. Can you be explicit why you believe the distinction is important?
Very roughly -- events indicate things that have already happened. It's too late for the domain to veto them; the real world is outside of the domain's control.
Furthermore, we have to keep in mind that, since the real world is the book of record, things may have happened in the real world that our domain model doesn't know about yet (the reporting of events may be delayed, lost, reordered, and so on).
Aggregates are supposed to be a source of truth. Which means that they can only govern entities in the digital world.
One kind of information resource that you could create is a report of Messi's goals in a season. So every time a goal is reported, you run a command to update the report aggregate. That's not anemic -- not exactly -- but it's not very interesting. It's really just a view (in CQRS terms, it's a read model) that you can recreate from the history of events. It doesn't have any intelligence in it.
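As a hedged sketch, such a read model might be nothing more than a counter that can be rebuilt by replaying the goal events (all names are invented):

import java.util.List;

// A read model in the CQRS sense: it holds no business rules and can be
// recreated at any time from the history of recorded goal events.
public class GoalsPerSeasonReport {

    public static class GoalScored {
        public final String playerId;
        public final String season;
        public GoalScored(String playerId, String season) {
            this.playerId = playerId;
            this.season = season;
        }
    }

    private final String playerId;
    private final String season;
    private int goals;

    public GoalsPerSeasonReport(String playerId, String season) {
        this.playerId = playerId;
        this.season = season;
    }

    // called for every goal event reported from the real world
    public void apply(GoalScored event) {
        if (playerId.equals(event.playerId) && season.equals(event.season)) {
            goals++;
        }
    }

    // the report can always be recreated from the event history
    public static GoalsPerSeasonReport replay(String playerId, String season, List<GoalScored> history) {
        GoalsPerSeasonReport report = new GoalsPerSeasonReport(playerId, season);
        for (GoalScored event : history) {
            report.apply(event);
        }
        return report;
    }

    public int getGoals() {
        return goals;
    }
}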
The interesting aggregates are those that make decisions for themselves, based on the information that they are given.
A contrived example of an aggregate would be one that, if a player scores more than 10 goals in a season, orders that player's jersey for you. Notice that while "goals" are something already present in your event stream, the business rule isn't. That's purely a domain model thing.
So the way this would work is that each time a goal event appeared, you would load the JerseyPurchasing aggregate and tell it about the goal. That aggregate would make sure this was a new goal (not one that had previously been reported), determine whether the number of goals called for ordering a shirt, and check whether the order for the shirt had already been placed.
Key idea here -- the goals are something that the aggregate is told about. The decision to purchase a jersey is made by the aggregate, and shared with the world.
Later, you realize that sometimes a player gets traded and then scores a 10th goal. And you have to determine as a business whether that means you get one shirt (which one?), or one jersey for each team, or maybe you only order jerseys if he scored 10 goals for a specific team in a season. All of this logic goes into the aggregate.
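Here is a hedged sketch of that kind of decision-making aggregate; the name and the "10 goals" rule come from the contrived example above, and everything else is invented (the trade/per-team refinements are deliberately left out):

import java.util.HashSet;
import java.util.Set;

// The aggregate is told about goals; the decision to order a jersey is its own.
public class JerseyPurchasing {

    private static final int GOALS_FOR_JERSEY = 10;

    private final Set<String> reportedGoalIds = new HashSet<>();
    private boolean jerseyOrdered;

    // Returns true only on the call that triggers the order.
    public boolean recordGoal(String goalId) {
        // ignore duplicate reports of the same goal
        if (!reportedGoalIds.add(goalId)) {
            return false;
        }
        if (!jerseyOrdered && reportedGoalIds.size() >= GOALS_FOR_JERSEY) {
            jerseyOrdered = true; // decision made by the aggregate, shared with the world
            return true;
        }
        return false;
    }

    public boolean isJerseyOrdered() {
        return jerseyOrdered;
    }
}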
a domain model as per onion architecture -- can you point me to any good examples?
Best place to look, as weird as it sounds, is among the functional programming types. Mark Seemann's blog includes a lot of important ideas that will help here.
The main idea to keep in mind that the model sits at the bottom. The app passes state to the model, and gets state back (in CQS terminology, you query the model). The app is responsible for sharing the results obtained from the model with the persistence component.
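A hedged sketch of that flow, with the application layer owning the side effects; the handler and repository interface are invented, and the JerseyPurchasing aggregate is the one sketched above:

// The model makes a decision from the state it is handed; the application
// layer loads state, persists the result, and triggers side effects.
public class GoalReportedHandler {

    public interface JerseyPurchasingRepository {
        JerseyPurchasing load(String playerId);
        void save(String playerId, JerseyPurchasing aggregate);
    }

    private final JerseyPurchasingRepository repository;

    public GoalReportedHandler(JerseyPurchasingRepository repository) {
        this.repository = repository;
    }

    public void onGoalReported(String playerId, String goalId) {
        JerseyPurchasing aggregate = repository.load(playerId);  // app fetches state
        boolean orderJersey = aggregate.recordGoal(goalId);      // model decides
        repository.save(playerId, aggregate);                    // app persists the result
        if (orderJersey) {
            // app carries out the side effect the model asked for
            System.out.println("Ordering jersey for player " + playerId);
        }
    }
}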
do you believe the accepted view would be that an anaemic model should be adopted for a domain this size
In the case where you are just re-organizing information from the real world for easier consumption? Yeah - load document, update document, store document makes a lot more sense to me than going overboard with a bunch of aggregate modeling. But don't read too much into that -- I don't know more about your model than what you have written here. If there's real business complexity in how you evaluate the information from the real world, then the answer would be different.

Using same Entity classes in different spring data repositories

I'm trying to put together a project in which I have to persist some entity classes using different spring data repositories (gemfire, jpa, mongodb etc). As the data is more or less the same that needs to go into these repositories, I was wondering if I can use the same entity class for all of them to save me from converting from one object to another?
I got it working for GemFire and JPA, but the entity class is already starting to look a bit weird.
@Id // spring-data-gemfire
@javax.persistence.Id // jpa
@GeneratedValue
private Long id;
So far I can see following options:
Create separate entity (domain) classes based on a common interface - trying to re-use the same class looks like a bit of premature optimization.
Externalize XML-based mapping for JPA; not sure if GemFire and MongoDB mapping can be externalized.
Use different concrete entity classes and use some copy constructor/converter for the conversion.
Been literally hitting my head against the wall to find the best approach - Any response is much appreciated. Thanks
If by weird you mean your application domain objects/entity classes are starting to accumulate many different, but separate, (mapping) annotations (some even semantically the same, e.g. Spring Data Commons' o.s.data.annotation.Id and JPA's @javax.persistence.Id) for the different data stores in which those entities will be persisted, then I suppose that is understandable.
The annotation pollution only increases as the number of representations of your entities grows. For example, think Jackson annotations for JSON mapping, or JAXB for XML, etc. Pretty soon you have more meta-data than actual data :-)
However, it is more a matter of preference, convenience, simplicity, really.
Some developers are purists and like to externalize everything. Others like to keep information (meta-data) close to the code using it. Certain patterns have even emerged to address these types of concerns... DTOs, Bounded Contexts (see Fowler's BoundedContext, which has a strong correlation to DDD and Microservices).
Personally, I use the following rules when designing and applying architectural principles/decisions in my code, especially when introducing something new:
Simplicity
Consistency
DRY
Test
Refactor
(along with a few others as well... good OOD, SoC, SOLID, Design Patterns, etc).
In that order too. If something starts getting too complex, refactor and simplify it. Be consistent in what you do by following/using patterns, conventions; familiarity is 1 key to consistency. But, don't keep repeating yourself either.
At the end of the day, it is really about maintaining the application. Will someone else who picks up where you left off be able to understand the organization and logic quickly, and be able to maintain it... simplicity is king. It does not mean it is so simple it is not viable or valuable. Even complex things can be simple if organized properly. However, breaking things apart and introducing abstractions can have hidden costs (see closing thoughts).
To more concretely answer (a few of) your questions...
I am not certain about MongoDB, but (Spring Data) GemFire does not have an external mapping. Minimally, @Region (on the entity class) and @Id are required, along with @PersistenceConstructor if your entity class has more than 1 constructor. For example:
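A hedged sketch of a minimal Spring Data GemFire entity (the region name and fields are invented; the exact package of the @Region annotation differs between SDG versions):

import org.springframework.data.annotation.Id;
import org.springframework.data.annotation.PersistenceConstructor;
import org.springframework.data.gemfire.mapping.annotation.Region; // package may differ by SDG version

@Region("Customers") // the GemFire region this entity maps to
public class Customer {

    @Id // the key of the entry in the region
    private final Long id;

    private final String name;

    @PersistenceConstructor // needed because more than one constructor exists
    public Customer(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    public Customer(String name) {
        this(null, name);
    }
}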
This sounds suspiciously like DTOs. Personally, I think Bounded Contexts are a better, more natural model of the application's data, since the domain model should not be unduly tied to any persistent store or external representation (e.g. JSON, XML, etc). The application domain model is the 1 true state of the application and it should model the concept it represents in a natural way, not superficially to satisfy some representation or persistent store (hence the mapping/conversion).
Anyway, try not to beat yourself up too much. It is all about managing complexity. Try to let yourself just do and use testing and other feedback loops to find an answer that is right for your application. You'll know.
Hope this helps.

Always 'Entity first' approach, when designing java apps from scratch?

I'm just reading the book here: http://www.amazon.com/Java-Architects-Handbook-Second-Edition/dp/0972954880/ trying to find a strategy about how to efficiently design a (generic) medium to large application (200 tables or more) - for instance a classic, multi-layered, corporate intranet. I'm trying to adapt my past experience (as a database designer, but also OOAD) in order to architect such a java application. From what I've read, if you define your entities first, there is no recommended way to infer your database directly (automatically).
The book says that you would build the entity/object model first (OOAD) and THEN it is the DB admin's/developer's job to build/infer the database (schema, normalization, etc.) based on the entity model already built. If this is the case, I'm afraid the architect/developer could lose control over important aspects - normalization, entity-attribute-value modeling, etc.
Perhaps like many older developers (back-end developers, architects etc) I feel more comfortable defining the database schema first - and spending a good amount of time on aspects like normalization etc. While this would be certainly possible nowadays, I'm asking myself if this would become (pretty soon, if not already) the 'old fashioned way' and not the norm - as a classic/recommended approach when designing applications from scratch.
I know Entity Framework (.NET) already has these approaches explicitly defined - 'entities first', 'database first', 'code first' - and they can be mixed if necessary. I know that they recommend 'entity first' for newly designed apps and 'database first' if you have an already-defined database schema (which is the case for many older applications, when migrating, etc.). I'm just asking if there is something similar for the Java world.
So, the questions are: (although I know there is no silver bullet etc.)
'Entities first' for newly built apps - is this the norm nowadays?
What tools do you use (if any) to assist with inferring the DB schema? Your experience, pros & cons with concrete UML tools, etc.
What if you have parts of an older/sub-domain database schema (which you'd mainly want to preserve)? In such a case, would you infer the entities model from the database and then refactor the model using your preferred UML tool?
From a labor-force perspective (let's say for a DB of 200-500 tables): what is the best approach? For instance, to have 2 different people involved in designing the OOAD/entities and the database respectively, working together with an architect?
As you expect - my answer is it depends.
The problem is that there are so many possible flavours and dimensions to a good design you really need to take the widest view possible first.
Ask yourself some of the big questions:
Where is the core of the system? Is the database really the core or is it actually just a persistence layer for the code. It could also perhaps be that the database is the core and the code is really just a snazzy UI on the data. There can also be a mix - where some of the tables are core along with some of the entities.
What do you see in the future? Remember that there are developments going on as we speak that are moving database technology rapidly forward. There are some databases that are all in-ram. Some are designed for a distributed architecture. Some are primarily cloud. If you build your schema first you risk locking yourself in to a certain technology.
What scale do you want to achieve? By insisting on a specific database you may be closing doors to, for example, a hand-held or mobile presence.
I generally find entity first as the best initial approach because you can always derive a schema from the entities and some meta-data. It is certainly possible to go schema first and grow the entities out of the schema but that way you generally find the database influences the design too much.
1) I've done database first in the past, but now I usually do entity first, mainly because of the tools I'm using to create the applications. Entity first has a few good advantages over trying to match your entities to your defined schema later. You're also not locking yourself too tightly to your schema. What your application is for matters a lot as well: is it just a basic CRUD application (write once, read many), or does it actually 'do' something? That will inform your choice of how to architect your application.
2) I use Hibernate a lot, which encourages creating your model first: design all your entities and then generate the schema from that. Hibernate can generate your whole schema from the models you've created (though you may need to tweak them to make sure they're not crazy). If you have 200 entities in your model then you probably want to do a significant amount of UML modelling ahead of time to ensure your model is consistent.
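As one possible illustration (not the only way to set this up), a hedged sketch of letting Hibernate derive the schema from a mapped class; the Product entity is a placeholder and connection settings (dialect, URL, credentials) are omitted:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class SchemaBootstrap {

    @Entity
    public static class Product {
        @Id
        @GeneratedValue
        Long id;
        String name;
    }

    public static SessionFactory build() {
        Configuration configuration = new Configuration()
                .addAnnotatedClass(Product.class)
                // "update"/"create" tell Hibernate to generate or evolve the DDL;
                // review the generated schema before trusting it in production
                .setProperty("hibernate.hbm2ddl.auto", "update")
                .setProperty("hibernate.show_sql", "true");
        return configuration.buildSessionFactory();
    }
}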
3) If you're working with a partially legacy database then it can sometimes be good to fall into line with the schema design so that your entities and schema are consistent. It can be a bit of a pain, but then so is trying to explain why part of your app is just different from other parts. So yes, I would probably infer my entities from the schema in that case. But again, if the schema was totally crazy then it may be better to write some very specific DAO code to hide that part of the schema from the app and pretend it's not there.
4) I can't really give you a good answer on this as I'm not sure what you're driving at. Once you have the design standards for your schema, it's just a matter of turning the handle to crank it out.
So after all that my answer is 'It depends'
While the answers already posted cover a lot of points - and ultimately, all answers probably have to all sum up to "it depends" - I'd like to expand on a point that's been touched on already.
My focus is on data - I'm a business intelligence and data warehousing developer, and I deal with issues like data quality, data governance, having a set of master data, etc. To this end, I have to pull data from other systems - data which is in varying conditions.
When considering whether the core of your system is really the database or the front end (as suggested by OldCurmudgeon), I strongly suggest thinking outside of your own area. I have seen and heard about many systems where it's clear that the database has been treated as an afterthought (sometimes created via an entity-first model, but also sometimes hand-built), despite the fact that most of the business value is in the data. More and more companies are of course realising that their data is valuable and are adopting tools to make use of it - but it's difficult to do if poor transactional databases mean that data has been lost, was never saved in the first place, has been overwritten when a history is needed, or is inconsistent.
While I don't want to do myself and others with similar roles out of a job: if the data that a system you're working on holds is or might be valuable, or if there's any reason it might be accessed by anything other than the front end you're creating, then it is worth the time and effort to create a sound data model to hold it. If the system is for an organisation or is going to be sold to organisations, there's a decent chance they'll want to report out of it, will want to run output from it into a data warehouse or other data stores, and will want to carry out analysis on the data it creates and holds.
I don't know enough about tools like Hibernate to know if it's possible to both use them to work in an entity-first manner and still create a good quality database, but I know that I have come across some problematic databases created in this manner. At the very least, as has been suggested, if you are going to work that way, make sure it is producing something sane and perhaps adjust it where necessary to maintain data integrity. If data integrity is a key requirement and you cannot get such a tool to create a suitable database that will ensure data integrity, then perhaps consider going back to doing things the "old fashioned" way.
I would also suggest that there's real value in developers working alongside any data specialists, analysts, architects, etc. they may have as colleagues to do some up-front modelling, even if the system they then produce uses entity-first and even if it veers away from the more conceptual models produced early on for technical reasons. I have seen many baked-in problems in systems which have been caused by a lack of understanding of the wider business entities and relationships, and which could have been avoided if time had been spent understanding the overall structure in this way. I've been personally responsible for building those problems when I was an application developer myself, so this shouldn't be read as criticism of front-end developers - just a vote in favour of cross-functional and collaborative analysis and modelling before development approaches and designs are decided.

Dynamic Typed Table/Model in Java EE?

Usually with Java EE when we create Model, we define the fields and types of fields through XML or annotation before compilation time. Is there a way to change those in runtime? Or better, is it possible to create a new Model based on the user's input during the runtime? Such that the number of columns and types of fields are dynamic (determined at runtime)?
Help is much appreciated. Thank you.
I felt the need to clarify myself.
Yes, I meant database modeling, when talking about Model.
As for the use cases, I want to provide a means for users to define and create their own tables. Infinite flexibility is not required. However some degree of freedom has to be there: e.g. the users can define what fields are needed to describe their product.
You sound like you want to be able to change both objects and schema according to user input at runtime. This sounds like a chaotic recipe for disaster to me. I've never seen it done.
I have seen general schemas that incorporate foreign key relationships to generic tables of name/value pairs, but these tend to become infinitely flexible abstractions that can neither be easily understood nor get out of their own way when it comes to performance.
I'm betting that your users really don't want infinite flexibility. I'd caution you against taking this direction. Better to get your real use cases straight.
Anything is possible, of course. My direct experience tells me that it's a bad idea that your users will hate if you can pull it off. Best of luck.
I worked on a system where we had such facilities. To stay efficient, we would generate/alter the table dynamically for the customer schema. We also needed to embed a meta-model (the model of the model) to process information in the entities dynamically.
Option 1: With custom tables, you have full flexibility, but it also increases the complexity significantly, notably the update/migration of existing data. Here is a list of things you will need to consider:
What if the type of a column change?
What if a column is added? Is there a default value?
What if a column is removed? Can I discard the existing information?
How to manage renaming of a column?
How to make things portable across databases?
How to make it efficient at database-level (e.g. indexes) ?
How to manage a human error (e.g. user removes a column then changes its mind)?
How to manage migration (script, deployment, etc.) when new version of the system is installed at customer site?
How to have this while using an ORM?
Option 2: A lightweight alternative is to add a few "spare" columns of different types in the business tables (e.g. "USER_DATE_1", "USER_DATE_2", etc.). I've seen that a few times. It will make your DBA scream and is not really considered good practice, but at least it facilitates a few things (migration scripts, ORM integration).
Option 3: Another option is to store everything in a table with a structure property/data. But then it's really a disaster for database performance. Anything that is not completely trivial will require many joins. And the DBA will scream even more.
Option 4: It is a mix of options 2 and 3. Core tables are fixed, but a table with property/data can be used to somehow extend them.
In summary: think twice before you go this way. It can be done, but has a significant impact on the design and maintenance of the application.
This is somehow possible using meta-modeling techniques:
tables for table / column / types at the database level
key/value structures at the Java level
But this has obvious limitations (lack of strongly typed objects) and can IMHO quickly get very complicated (I'm not even sure how to deal with relations). I wouldn't use this approach to define domain objects entirely, but only to extend existing ones (products, articles, etc).
If I remember well, this is what some e-commerce solutions (e.g. BroadVision) were doing.
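A hedged sketch of that key/value extension idea at the Java level (all names are invented), used only to extend a fixed, strongly typed core rather than replace it:

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Core fields stay strongly typed; user-defined attributes live in a
// key/value map (values kept as strings here for simplicity).
public class Product {

    private final String sku;   // fixed, strongly typed core
    private final String name;
    private final Map<String, String> customAttributes = new HashMap<>();

    public Product(String sku, String name) {
        this.sku = sku;
        this.name = name;
    }

    public void setCustomAttribute(String key, String value) {
        customAttributes.put(key, value);
    }

    public Optional<String> getCustomAttribute(String key) {
        return Optional.ofNullable(customAttributes.get(key));
    }

    public String getSku() {
        return sku;
    }

    public String getName() {
        return name;
    }
}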
I think I have found a good answer myself. Those new NoSQL databases (HBase, Cassandra) seem to be exactly what I was looking for. Thanks everyone for your answers.

hibernate workflow

I'm trying to write a program with Hibernate. My domain is now complete and I'm writing the database.
I got confused about what to do. Should I:
model my SQL tables as classes and let Hibernate create the tables, or
create the tables in the database, reverse engineer them, and let Hibernate create my classes?
I heard the first option from someone and read the second option on the NetBeans site.
Does any one know which approach is correct?
It depends on how you best conceptualize the program you are writing. When I am designing my system I usually think in terms of entities and their relationships to each other, so for me, I start with my business objects, then write my Hibernate mappings and let Hibernate create the database.
Other people are able to think better in terms of database tables, in which case that approach is best for them. So you have to decide which one works for you based on your experience.
I believe you can do either, so it's down to preference.
Personally, I write the lot by hand. While Hibernate does a reasonable job of creating a database for you, it doesn't do it as well as I can do myself. I'd assume the same goes for the Java classes it produces, although I've never used that feature.
With regard to the generated classes (if you went the class-generation route), I'm betting every field has a getter/setter whether fields should be read-only or not (did somebody say thread safety and mutability?), and that you can't add behavior because it gets overridden if you regenerate the classes.
Definitely write the java objects and then add the persistence and let hibernate generate the tables.
If you go the other way you lose the benefit of OOD and all that good stuff.
I'm in favor of writing Java first. It can be a personal preference though.
If you analyse your domain, you will probably find that there is some duplication.
For example, the audit columns (user creator and editor, time created and edited) are often common to most tables.
The id is often a common field.
Look at your domain to see your duplication.
The duplication is an opportunity to reuse.
You could use inheritance, or composition.
Advantages:
less time: you will have much less to write,
logical: the same logical field is written once (instead of many similar fields),
reuse: in the client code for your entities, you can write reusable code. For example, if all your entities have the same id field called ident because of their superclass, client code can make the generic call object.getIdent() without having to find out the exact class of the object, so it will be more reusable.
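A hedged sketch of that kind of reuse via inheritance, using JPA annotations; the ident field matches the example above, and the audit field names are invented:

import java.time.Instant;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;

// Common identifier and audit columns pulled into a superclass so each
// entity does not have to repeat them.
@MappedSuperclass
public abstract class BaseEntity {

    @Id
    @GeneratedValue
    private Long ident;

    private String createdBy;
    private Instant createdAt;
    private String editedBy;
    private Instant editedAt;

    public Long getIdent() {
        return ident;
    }
    // audit getters/setters omitted for brevity
}

Client code can then work against BaseEntity (e.g. calling getIdent()) without knowing the concrete entity class.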
