DDD structure example - java

I am trying to structure an application using DDD and onion/hexagonal/clean architecture (using Java and Spring). I find it easier to find guidance on the concepts themselves than actually how to implement them. DDD in particular seems rather tricky to find examples that are instructive because each problem is unique. I have seen numerous examples on SO that have been helpful but I still have questions. I wonder whether going through my example would help me and anyone else.
I hope you can forgive me asking more than one question here. The example seems too big for it to make sense for me to repeat it in multiple questions.
Context:
We have an application that should display information about soccer stats and has the following concepts (for simplicity I have not included all attributes):
Team, which has many Players.
Player.
Fixture, which has 2 Teams and 2 Halves.
Half, which has 2 FormationsPlayed and many Combinations.
FormationPlayed, which has many PositionsPlayed.
PositionPlayed, which has 1 Player and a position value object.
Combination, which can be of 2 types, and has many Moves.
Move, which can be of 2 types, has 1 Player and an event value object.
As you can imagine, trying to work out which things are aggregate roots here is tricky.
Team can exist independently so is an AR.
Player can exist independently so is an AR.
Fixture, when deleted, must also delete its Halves, so is an AR.
Half must be an entity in Fixture.
FormationPlayed must be deleted when a half is deleted, so perhaps this should be an entity in Half.
PositionPlayed must be deleted when a Formation is deleted, so believe this should be an entity in FormationPlayed.
Combination in a sense can exist independently, though is tied to a particular game half. Perhaps this could be an AR tied by eventual consistency.
Move must be deleted when a Combination is deleted, so believe this should be an entity in Combination.
Questions:
Do you see any errors in the above design? If so what would you change?
The Fixture - Half - FormationPlayed - PositionPlayed aggregate seems too large, so I wonder whether you would agree that this could be split into Fixture - Half and FormationPlayed - PositionPlayed using eventual consistency. The thing I can't find an example of is how this is implemented in Java? If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture). My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Initially the aim is only to allow the client to get the data and display it. Ultimately I want clients to be able to perform CRUD themselves, and I want all invariants to be held together by the domain model when this happens. Would it simplify things (and can you show me or point me to example explaining how) to have two domain models, one simple for data retrieval and one rich for the operations to be performed later? Two BCs, as it were. The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database, but I also don't want to create trouble for myself down the line if it is better to create one rich domain model now in view of the usecases envisioned later. I wonder, if I were to create a simpler model for data retrieval only, which concepts in DDD could be ignored (would I still need to break up large aggregates, for example?)
I hope this all makes sense. Obviously happy to explain further if needed. Realise I'm asking a lot here and I may have confused some ideas. Any answers and wisdom you can give to this would be greatly appreciated !

Do you see any errors in the above design? If so what would you change?
There might be a big one: is your system the book of record, or is it just keeping track of events that happen in the "real world"? In a sense, the point of aggregates is to ensure that the book of record is internally consistent, but if you aren't the book of record....
For an example of what I mean
http://www.soccerstats.com/ -- the book of record is the real world.
https://www.easports.com/fifa -- the games are played in the computer
If Fixture were deleted, would you fire a FixtureDeleted event that causes its corresponding FormationPlayed entities to also be deleted?
Udi Dahan wrote: Don't Delete, Just Don't. If an entity has a lifecycle, and that lifecycle has an end, then you mark it, but you don't remove the entity.
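A minimal sketch of "mark it, don't remove it" in Java, assuming a hypothetical Fixture aggregate that carries a lifecycle status. The names FixtureStatus, cancel and cancelledAt are illustrative and do not come from the question.

```java
import java.time.Instant;
import java.util.Optional;

// Hypothetical lifecycle states; the names are assumptions for illustration.
enum FixtureStatus { SCHEDULED, PLAYED, CANCELLED }

final class Fixture {
    private final String fixtureId;
    private FixtureStatus status = FixtureStatus.SCHEDULED;
    private Instant cancelledAt; // stays null until the fixture is cancelled

    Fixture(String fixtureId) {
        this.fixtureId = fixtureId;
    }

    // Ends the lifecycle by marking the fixture; nothing is removed.
    void cancel(Instant when) {
        if (status == FixtureStatus.CANCELLED) {
            return; // idempotent: a repeated cancel keeps the first timestamp
        }
        status = FixtureStatus.CANCELLED;
        cancelledAt = when;
    }

    boolean isActive() { return status != FixtureStatus.CANCELLED; }

    FixtureStatus status() { return status; }

    Optional<Instant> cancelledAt() { return Optional.ofNullable(cancelledAt); }

    String fixtureId() { return fixtureId; }
}
```

Halves and formations that belong to a cancelled fixture stay where they are; queries that should hide them can filter on isActive().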
I want to construct a domain model that has no understanding of the way that it will be persisted (as per onion architecture)
Great! Be warned, a lot of the examples that you will find online don't get this part right -- for historical reasons, many demonstrations of the model are tightly coupled to the side effects that they have on persistence.
My understanding is that domain entities here should not have surrogate keys because this relates to persistence. I also believe that entities should only reference entities in other aggregates by ids. How then, for example, would PositionPlayed reference Player in the domain model?
Ah -- OK, this one is fun. Don't confuse surrogate keys used in the persistence layer with identifiers in the domain model. For instance, when I look at my purchasing history on Amazon, each of my orders (presumably an aggregate) has an ORDER # associated with it. That would imply that the domain level knows about OrderNumber as a value type. The persistence solution in the back end might introduce surrogate keys when storing that data, but those keys are not used by the model.
Note that I've chosen an example where the aggregate is clearly the authority -- the order only really exists within the model. When the real world is the book of record, you often don't have a unique identifier available (what is Lionel Messi's PlayerId?).
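To make the distinction concrete, here is a rough sketch (Java 16+ records) of PositionPlayed referring to a Player in another aggregate through a domain-level identifier. PlayerId as a String wrapper, the newId helper and the Position enum are all assumptions for illustration; the question only says there is "a position value object".

```java
import java.util.Objects;
import java.util.UUID;

// Domain-level identifier: a value type owned by the model.
// It is not the surrogate key the persistence layer may add when storing a player.
record PlayerId(String value) {
    PlayerId {
        Objects.requireNonNull(value, "value");
        if (value.isBlank()) {
            throw new IllegalArgumentException("PlayerId must not be blank");
        }
    }

    // One possible way to mint identifiers inside the domain (an assumption).
    static PlayerId newId() {
        return new PlayerId(UUID.randomUUID().toString());
    }
}

// Stand-in for the "position value object" mentioned in the question.
enum Position { GOALKEEPER, DEFENDER, MIDFIELDER, FORWARD }

// Refers to the Player aggregate by identifier only, never by object reference.
record PositionPlayed(PlayerId playerId, Position position) {
    PositionPlayed {
        Objects.requireNonNull(playerId, "playerId");
        Objects.requireNonNull(position, "position");
    }
}
```

A repository implementation is free to map PlayerId to whatever key the database uses; the model never sees that key.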
The reason I ask is that a rich domain model seems rather time consuming to come up with when initially we only want to display stats in the database
A couple of thoughts on this -- DDD is usually saved for more complicated use cases (Greg Young: "is this where you get a competitive advantage?"). Most of the power of aggregates comes from the fact that they ensure the consistency of changes of state. When your real problem is data entry and reporting, it tends to be overkill.
Detection and remediation of inconsistencies is often easier/cheaper than trying to get prevention right; and may be satisfactory to the business, given the costs. Something to keep in mind.
The application is keeping track of events in the real world. At the moment, they are recorded manually in a database. Can you be explicit why you believe the distinction is important?
Very roughly -- events indicate things that have already happened. It's too late for the domain to veto them; the real world is outside of the domain's control.
Furthermore, we have to keep in mind that, since the real world is the book of record, things may have happened in the real world that our domain model doesn't know about yet (the reporting of events may be delayed, lost, reordered, and so on).
Aggregates are supposed to be a source of truth. Which means that they can only govern entities in the digital world.
One kind of information resource that you could create is a report of Messi's goals in a season. So every time a goal is reported, you run a command to update the report aggregate. That's not anemic -- not exactly -- but it's not very interesting. It's really just a view (in CQRS terms, it's a read model) that you can recreate from the history of events. It doesn't have any intelligence in it.
The interesting aggregates are those that make decisions for themselves, based on the information that they are given.
A contrived example of an aggregate would be one that, if a player scores more than 10 goals in a season, orders that player's jersey for you. Notice that while "goals" are something already present in your event stream, the business rule isn't. That's purely a domain model thing.
So the way that this would work is that each time a goal event appeared, you would load the JerseyPurchasing aggregate, and tell it about the goal. That aggregate would make sure that this was a new goal (not one that had previously been reported), determine whether the number of goals called for ordering a shirt, and check whether the order for the shirt had already been placed.
Key idea here -- the goals are something that the aggregate is told about. The decision to purchase a jersey is made by the aggregate, and shared with the world.
Later, you realize that sometimes a player gets traded, and then scores a 10th goal. And you have to determine as a business whether that means you get one shirt (which?) or one shirt for each jersey, or maybe you only order jerseys if he scored 10 goals for a specific team in a season. All of this logic goes into the aggregate.
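A sketch of the simple version of that jersey-purchasing aggregate, before any trade rules are added. It assumes each reported goal carries a unique goal id (a hypothetical field) so that repeated reports can be detected, and it uses the "more than 10 goals" rule, so the order goes out on the 11th distinct goal. JerseyOrdered is an invented name for the decision the aggregate shares with the world.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

// The decision the aggregate shares with the world (hypothetical name).
record JerseyOrdered(String playerId) {}

final class JerseyPurchasing {
    private static final int THRESHOLD = 10; // rule: "more than 10 goals in a season"

    private final String playerId;
    private final Set<String> seenGoalIds = new HashSet<>();
    private boolean orderPlaced;

    JerseyPurchasing(String playerId) {
        this.playerId = playerId;
    }

    // The aggregate is told about a goal; it alone decides whether to order.
    Optional<JerseyOrdered> goalReported(String goalId) {
        if (!seenGoalIds.add(goalId)) {
            return Optional.empty(); // this goal was already reported
        }
        if (orderPlaced || seenGoalIds.size() <= THRESHOLD) {
            return Optional.empty(); // already ordered, or not enough goals yet
        }
        orderPlaced = true;
        return Optional.of(new JerseyOrdered(playerId));
    }
}
```

The goals come from outside and cannot be vetoed; the only thing the aggregate owns is the ordering decision, which is why the duplicate check and the "already ordered" check live inside it.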
a domain model as per onion architecture -- can you point me to any good examples?
Best place to look, as weird as it sounds, is among the functional programming types. Mark Seemann's blog includes a lot of important ideas that will help here.
The main idea to keep in mind is that the model sits at the bottom. The app passes state to the model, and gets state back (in CQS terminology, you query the model). The app is responsible for sharing the results obtained from the model with the persistence component.
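One way to picture this, as a rough sketch rather than a reference implementation: the model is an immutable value with a method that returns new state, and an application-level handler does the loading and saving. SeasonReport, SeasonReportStore and RecordGoalHandler are hypothetical names, and the in-memory store only stands in for a real persistence component.

```java
import java.util.HashMap;
import java.util.Map;

// Domain model: state in, state out. No persistence types appear here.
record SeasonReport(String playerId, int goals) {
    SeasonReport withGoalRecorded() {
        return new SeasonReport(playerId, goals + 1);
    }
}

// Port through which the application talks to persistence (hypothetical).
interface SeasonReportStore {
    SeasonReport load(String playerId);
    void save(SeasonReport report);
}

// Stand-in adapter; a real one would wrap a database or document store.
final class InMemorySeasonReportStore implements SeasonReportStore {
    private final Map<String, SeasonReport> reports = new HashMap<>();

    @Override
    public SeasonReport load(String playerId) {
        return reports.getOrDefault(playerId, new SeasonReport(playerId, 0));
    }

    @Override
    public void save(SeasonReport report) {
        reports.put(report.playerId(), report);
    }
}

// Application layer: passes state to the model, stores what comes back.
final class RecordGoalHandler {
    private final SeasonReportStore store;

    RecordGoalHandler(SeasonReportStore store) {
        this.store = store;
    }

    void handle(String playerId) {
        SeasonReport current = store.load(playerId);
        SeasonReport updated = current.withGoalRecorded(); // pure model call
        store.save(updated);
    }
}
```

Because SeasonReport never touches the store, it can be tested without any persistence set up, and swapping the store implementation does not change the model.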
do you believe the accepted view would be that an anaemic model should be adopted for a domain this size
In the case where you are just re-organizing information from the real world for easier consumption? Yeah - load document, update document, store document makes a lot more sense to me than going overboard with a bunch of aggregate modeling. But don't read too much into that -- I don't know more about your model than what you have written here. If there's real business complexity in how you evaluate the information from the real world, then the answer would be different.

Related

Create aggregate root in the context of another aggregate root

I'm currently struggling with the creation of instances in the DDD context.
I have read and searched a lot and sometimes thought that I had found the answer, only to realize that it doesn't feel right while programming it.
This is my situation:
I have two aggregate roots, Scenario and Step. I made those AR
because they encapsulate related elements of the domain and each AR
should be in a consistent state.
Multiple Steps can exist in the
context of a Scenario. They can not exist on their own.
The "name/natural id" of each Step in the context of its Scenario has to be unique. Changes in Scenario do not automatically influence its Steps and
vice versa (e.g. Step doesn't care if Scenario changes some
descriptions or images).
Different Steps of a Scenario can be used, edited, etc. at the same time.
At the moment, each Step holds a reference to its Scenario by the corresponding natural identifier. The Scenario class doesn't know anything about its Steps, so it does not hold a collection with Step references.
How would I create a new Step for a given Scenario?
Should I load the Scenario and call something like
createNewStep(...) on it? That would not enforce the uniqueness
constraint (that is in fact a business constraint and not a
technical one), because Scenario doesn't know about its Steps. I would probably have to go with some kind of a "disconnected domain model" then or pass a repository or service to the method to perform the checks.
Should I use a domain service that enforces the constraint, queries the repository, and finally creates and returns the Step?
Should Scenario simply know about its Steps? I think I would like to avoid this one, since that would create an ugly-to-maintain bidirectional relationship.
One could imagine other use cases, for example a Step being classified by options that are provided by the specific Scenario. In this case, and if there were no constraints regarding the "collection" of Steps, I would probably go with the first "solution". Then again: if the classification is changed afterwards, access to the Scenario would be necessary to check for the allowed classifications. That brings me to a possible 4th solution:
Using some kind of "combination" of some possible solutions. Would it be a good idea to create the domain service (accessing everything needed) and use it as an argument of the method that needs it? The method would then call the service where needed and the "domain logic" stays in the entity/model.
Thank you in advance!
I'll just edit instead of copy paste answering ;)
Thank you all for your responses! :)
Pushing the steps back into the scenario would lead to some pretty big objects, which I'm trying to avoid (the currently running application really suffers from this). It seems that it's pretty much like the Scrum example in Vaughn's "Effective Aggregate Design", where he uses domain services to get smaller aggregates (I really don't know why I'm so uncertain about using domain services). Looks like I'll have to use domain services or split the aggregates up into "StepName" and "StepDetails" as suggested.
For background, you should read what Greg Young has to say about set validation (via WaybackMachine). In particular, you really need to evaluate, in the context of your solution, what is the business impact of having a failure?
Accept the failure and escalate is by far your easiest option. In what follows, I assume that the business impact of the failure is large, so we need to prevent it from happening.
The "name/natural id" of each Step in the context of its Scenario has to be unique
That's a classic set validation concern.
The first thing to do is challenge the assumptions in your model
Is your model the book of record for "name"? If your model isn't the authority, you have to be very cautious about introducing constraints. Understanding the boundaries of your model's authority is really important.
Is there an invariant that couples the name of a step to any other part of its state? Aggregate design discipline says that two pieces of state coupled by an invariant need to be in the same aggregate, but it's silent about properties that don't participate in an invariant.
Is it reasonable to reject a name change while accepting other changes to a step? This is really a variation of the previous -- can tasks be split into two different commands (one involving name, one not) that can succeed or fail independently?
In short, the invariant may be telling you that "step name", as a piece of state, belongs in the scenario aggregate rather than in the step aggregate.
If you think about the problem from the perspective of a relational model, we're looking at a tuple (scenarioId, name, stepId), and the constraint says that (scenarioId, name) form a unique key. That's a hint that step name belongs to the scenario. In code, that signature looks like a scenario data structure that includes a Map<StepName, StepId>.
That won't necessarily solve all of your problems of course, but it is a step toward aligning the model with your actual business.
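A sketch of what that could look like, assuming the scenario aggregate keeps a map from step name to step id so that uniqueness of (scenarioId, name) can be checked inside a single aggregate. StepName, StepId and claimStepName are hypothetical names; the step's remaining details would stay in their own aggregate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

record StepName(String value) {}
record StepId(String value) {}

final class Scenario {
    private final String scenarioId;
    // Step names live in the scenario, so their uniqueness is a local check.
    private final Map<StepName, StepId> stepsByName = new HashMap<>();

    Scenario(String scenarioId) {
        this.scenarioId = scenarioId;
    }

    // Claims a name for a step. Returns false if a different step already holds it.
    boolean claimStepName(StepName name, StepId stepId) {
        StepId existing = stepsByName.putIfAbsent(name, stepId);
        return existing == null || existing.equals(stepId);
    }

    Optional<StepId> stepNamed(StepName name) {
        return Optional.ofNullable(stepsByName.get(name));
    }

    String scenarioId() { return scenarioId; }
}
```

Renaming a step then becomes a command against the scenario, which can fail on its own without blocking other edits to the step.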
When that doesn't work...
The "real" answer is to move the step entity back into the scenario aggregate. One way to think about it is this -- all of the entities taken together form "the model" that we are keeping consistent. The aggregates aren't part of the business, per se; they are artificial, independent subdivisions within the model -- we identify and isolate aggregates as a performance optimization; we can perform concurrent edits, and evaluate the validity of a command while loading a much smaller data set.
If the failures make the performance optimization too expensive, you take it out. So you can see that we have an estimate, of sorts, for what it means that the business impact is "large"; it needs to be bigger than the savings we get from using aggregates on the happy path.
Another possibility is to shift where you enforce the invariant. Relational databases are really really good at set validation. So maybe the right answer is to split the enforcement concern: put the invariant into your schema as a constraint, and ignore that constraint in code.
This isn't ideal for a number of reasons -- you've effectively "hidden" the constraint, you've introduced a constraint on the kind of data store that you use for your aggregates, you've introduced a constraint that requires that you store your step aggregates in the same database as the scenario they belong to, and so on. If you squint, you'll see that this is really just the "make the step entities part of the scenario" solution, but in disguise.
But keep in mind: part of the point of domain-driven-design is that we can push back on the business when the code is telling us that the business model itself is wrong. Where's the cost benefit analysis?
Here's the thing about uniqueness constraints: the model enforces uniqueness, not correctness. Imagine a data race, two different commands that each claim the same "name" for a different step in the scenario -- perhaps caused by a data entry error. The model, presumably, can't tell which command is "right", so it's going to make some arbitrary guess (most likely, first command wins). If the model guesses wrong, it has effectively blocked the client that provided correct data!
In cases where the model is the authority, uniqueness constraints can make sense -- the SeatMap aggregate can enforce the constraint that only one ticket can be assigned to a seat at any given time, because it is the authority for assignment.

Always 'Entity first' approach, when designing java apps from scratch?

I'm just reading the book here: http://www.amazon.com/Java-Architects-Handbook-Second-Edition/dp/0972954880/ trying to find a strategy about how to efficiently design a (generic) medium to large application (200 tables or more) - for instance a classic, multi-layered, corporate intranet. I'm trying to adapt my past experience (as a database designer, but also OOAD) in order to architect such a java application. From what I've read, if you define your entities first, there is no recommended way to infer your database directly (automatically).
The book says that you would build the entity/object model first (OOAD) and THEN there is the db admin/dev.(?) job to build/infer the database (schema, normalization etc.) based on the entity model already built. If this is the case, I'm afraid the architect/developer could lose control over important aspects - normalization, entity-attribute-value modeling etc.
Perhaps like many older developers (back-end developers, architects etc) I feel more comfortable defining the database schema first - and spending a good amount of time on aspects like normalization etc. While this would be certainly possible nowadays, I'm asking myself if this would become (pretty soon, if not already) the 'old fashioned way' and not the norm - as a classic/recommended approach when designing applications from scratch.
I know Entity Framework (.NET) already has these approaches explicitly defined - 'entities first', 'database first', 'code first' - and these could be mixed, if necessary. I surely know that they recommend 'entity first' for newly designed apps, and 'database first' if you have already defined database schema (which is the case for many older applications, when migrating etc.). I'm just asking if there is something similar for the java world.
So, the questions are: (although I know there is no silver bullet etc.)
'Entities first' for newly built apps - this is the norm nowadays?
What tools do you use (if any) in order to assist inferring db schema process? - your experience, pros & cons with concrete UML
tools etc.
What if you have parts/older/sub-domain database schema (which you'd want to preserve, mainly)? In such case, you would infer entities model from
database and then refactor the model using your preferred UML tool?
From labor force perspective (let's say for db of 200-500 tables): what is the best approach: for instance, to have 2 different people
involved in designing OOAD/entities and database respectively,
working together with an architect?
As you expect - my answer is it depends.
The problem is that there are so many possible flavours and dimensions to a good design you really need to take the widest view possible first.
Ask yourself some of the big questions:
Where is the core of the system? Is the database really the core or is it actually just a persistence layer for the code. It could also perhaps be that the database is the core and the code is really just a snazzy UI on the data. There can also be a mix - where some of the tables are core along with some of the entities.
What do you see in the future? Remember that there are developments going on as we speak that are moving database technology rapidly forward. There are some databases that are all in-ram. Some are designed for a distributed architecture. Some are primarily cloud. If you build your schema first you risk locking yourself in to a certain technology.
What scale do you want to achieve? By insisting on a specific database you may be closing doors to perhaps hand-held presence.
I generally find entity first as the best initial approach because you can always derive a schema from the entities and some meta-data. It is certainly possible to go schema first and grow the entities out of the schema but that way you generally find the database influences the design too much.
1) I've done database first in the past but now I usually do entity first, mainly because of the tools I'm using to create the applications. Entity first has a few good advantages over trying to match your entities to your defined schema later. You're also not locking yourself too tightly to your schema. What your application is for matters a lot as well: whether it's just a basic CRUD application, write once read many, or something that actually 'does' something will inform your choice of how to architect your application.
2) I use Hibernate a lot, which encourages creating your model first: you design all your entities and then generate the schema from that. Hibernate can generate your whole schema from the models you've created (though you may need to tweak them to make sure they're not crazy). If you have 200 entities in your model then you probably want to do a significant amount of UML modelling ahead of time to ensure your model is consistent.
3) If you're working with a partially legacy database then it can sometimes be good to fall into line with the schema design for that so your entities and schema are consistent. It can be a bit of a pain, but then so is trying to explain why part of your app is just different to other parts. So yes, I would probably infer my entities from the schema in that case. But again, if it was totally crazy then it may be better to write some very specific DAO code to hide that part of the schema from the app and pretend it's not there.
4) I can't really give you a good answer on this as I'm not sure what you're driving at really. Once you have the design standards for your schema it's turning the handle to crank it out.
So after all that my answer is 'It depends'
While the answers already posted cover a lot of points - and ultimately, all answers probably have to all sum up to "it depends" - I'd like to expand on a point that's been touched on already.
My focus is on data - I'm a business intelligence and data warehousing developer, and I deal with issues like data quality, data governance, having a set of master data, etc. To this end, I have to pull data from other systems - data which is in varying conditions.
When considering whether the core of your system is really the database or the front end (as suggested by OldCurmudgeon), I strongly suggest thinking outside of your own area. I have seen and heard about many systems where it's clear that the database has been treated as an afterthought (sometimes created via an entity-first model, but also sometimes hand-built), despite the fact that most of the business value is in the data. More and more companies are of course realising that their data is valuable and are adopting tools to make use of it - but it's difficult to do if poor transactional databases mean that data has been lost, was never saved in the first place, has been overwritten when a history is needed, or is inconsistent.
While I don't want to do myself and others with similar roles out of a job: If the data that a system you're working on is or might be valuable, if there's any reason it might be accessed by anything other than the front end you're creating, then it is worth the time and effort to create a sound data model to hold it. If the system is for an organisation or is going to be sold to organisations, there's a decent chance they'll want to report out of it, will want to run output from it into a data warehouse or other data stores, and will want to carry out analysis on the data it creates and holds.
I don't know enough about tools like Hibernate to know if it's possible to both use them to work in an entity-first manner and still create a good quality database, but I know that I have come across some problematic databases created in this manner. At the very least, as has been suggested, if you are going to work that way, make sure it is producing something sane and perhaps adjust it where necessary to maintain data integrity. If data integrity is a key requirement and you cannot get such a tool to create a suitable database that will ensure data integrity, then perhaps consider going back to doing things the "old fashioned" way.
I would also suggest that there's real value in developers working alongside any data specialists, analysts, architects, etc. they may have as colleagues to do some up-front modelling, even if the system they then produce uses entity-first and even if it veers away from the more conceptual models produced early on for technical reasons. I have seen many baked-in problems in systems which have been caused by a lack of understanding of the wider business entities and relationships, and which could have been avoided if time had been spent understanding the overall structure in this way. I've been personally responsible for building those problems when I was an application developer myself, so this shouldn't be read as criticism of front-end developers - just a vote in favour of cross-functional and collaborative analysis and modelling before development approaches and designs are decided.

Should my DAOs (Database Entities) Directly match my UI Objects?

I am trying to figure out best practice for N-Tier application design. When designing the objects my UI needs and those that will be persisted in the DB some of my colleagues are suggesting that the objects be one and the same. This does not feel right to me and I am ultimately looking for some best practice documentation to help me in this decision.
EDIT:
Let me clarify this by saying that the tables (Entity Classes) that are in the DB are identical to the objects used in the UI
I honestly do not understand why I would want to design this way given that other applications may want to interact with my Data Access Layer... or is it just ignorance or lack of understanding on my part?
Any documentation, information you could provide would be greatly appreciated. Just want to better understand these concepts and I am having a hard time finding some good information on the best practice for implementing these patterns (Or it is right in front of me on what I found and I didn't understand what was being outlined).
Thanks,
S
First of all, DAOs and database entities are two very different things.
Now to the question. You're right. The database entities are mapped to a database schema, and this database schema should follow the database design best practices, and be normalized. The UI sometimes displays exactly the information from a given entity, but often shows data that comes from multiple entities, in an aggregate format. Or, to the contrary, it only shows a small part of a given entity.
For example, it would make sense for a UI to show a product name, description and price along with the name of its category, along with the number of remaining items in stock, along with the manufacturer of the product. It would make no sense to have a persistent entity containing all those fields.
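A small sketch of that product screen, assuming Java 16+ records. The entity shapes (Product, Category, Manufacturer), the priceInCents field and the ProductView name are invented for illustration; the point is only that the view object is assembled from several entities and is never persisted itself.

```java
// Persistent entities, normalized: each holds only its own data (shapes are assumptions).
record Product(String name, String description, long priceInCents,
               String categoryId, String manufacturerId) {}
record Category(String id, String name) {}
record Manufacturer(String id, String name) {}

// UI object shaped for one screen; it combines data from several entities.
record ProductView(String name, String description, long priceInCents,
                   String categoryName, int remainingInStock, String manufacturerName) {

    // Assembles the view; the stock count would come from yet another source.
    static ProductView from(Product product, Category category,
                            Manufacturer manufacturer, int remainingInStock) {
        return new ProductView(
                product.name(),
                product.description(),
                product.priceInCents(),
                category.name(),
                remainingInStock,
                manufacturer.name());
    }
}
```

If a screen really does show one entity field for field, the extra view type adds little, which is the trade-off the next answer describes.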
In general and according to most "best practices" comments, yes, those two layers should be decoupled and there should be separate objects.
BUT: if your mapping would only be a one-to-one-mapping without any further functionality in the non-database-object, why introduce an additional object? So, it depends. (as usual ;-) ).
Don't use additional objects if the introduced overhead is bigger than the gain. And don't couple the two layers if re-usability is a first-class-goal. That may not be the case with some legacy applications, e.g.

How to maintain/generate tables in Hibernate for multi-user purpose?

I'm working on a project using Play Framework that requires me to create a multi-user application. I've a central panel where we add a certain workshop for a team. Thing is, I don't know if this is the best way, but I want to generate the tables like
team1_tablename
team1_secondtable..
Then when a certain request hits using the virtual host (e.g. http://teamawesome.workshop.com) I would need to route the query to THAT certain table.
The problem is not generating the tables, but working with the models. All the workshops are going to have the same generic tables. In the model I would have to state the table, etc. If this was PHP with Doctrine I would have a template create them after creating the workshop team1, but in Java, even if I generate them, I would have to compile them too, which requires me to do more research.
My question is more Hibernate oriented before jumping the gun here and giving up on possible solutions. I'm all ears
I've thought of using NamedQueries, I don't know if I misread but I read in a hibernate book that you could query then add the result to a generic model so then I use that model to retain all my results...
If there are any doubts let me know, thanks (note this is not a multi database question, just using different sets of tables with unique prefixes)
I wonder if you could use one single set of tables, but have something like TEAM_ID as a foreign key in each table.
You would need one single TEAM table, where TEAM_ID will be the primary key. This will get migrated to tables and become part of foreign keys.
For instance, if you have a Player entity, having a collection of HighScores, then in the DB the Player table will have a TEAM_ID (foreign key from the Team table) and the HighScores table will have a composite foreign key (Player_id, Team_id) coming from the Player table.
So, bottom line, I am suggesting a logical partitioning of your database rather than a physical one (as you've considered initially).
Hope this makes sense, it definitely needs more thought, but if you think it's an interesting idea, I can think it through in more detail.
I am familiar with Hibernate and another web framework, here is how I would handle it:
I would create a single set of tables for one team that would address all my needs. Then I would:
Using DB2: Create a schema for each team copying the set of tables into each schema.
Using MySQL: Create a new Database for each copying the set of tables into each one.
Note: A 'database' in MySQL is more like a schema in other databases. (Sorry I'd rather keep things too simple than miss the point)
Now you can set up a separate hibernate.cfg.xml file for each connection (this isn't exactly the best way, but it is perhaps the best place to start because it's so easy). There you can specify the connection parameters, including the schema/db. Your entity table, let's say it's called "team", will then use the "team" table wherever it is connected...
To get started very quickly, when a user logs on create a user object in their session.
The user object will hold a Hibernate SessionFactory, built from the correct hibernate.cfg.xml file (determined by parsing the URL used at login), which will be used for all database requests.
Once the above is working, there are some serious efficiency concerns to address, namely that each logged-on user creates a SessionFactory. Maybe that isn't an issue if there isn't a lot of concurrent use, but you probably want to look into Spring at that point and use a connection pool per team. This way there is one SessionFactory per team and there is no major object creation when a user signs in.
The benefits of this solution are that it should be easier to create new sets of tables because each table set lives in its own world. There will only be one set of entity classes, as opposed to one for every combination of team and table. The database schema stays rather simple, not being complicated by adding team names and the required constraints. If the teams require data ownership and privacy, it will be rather easy to move a team's database to a different location.
The down side is that if the model needs to be changed for a team it must be done for each team (as opposed to a single table set using teamName as a foreign key).
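The "one SessionFactory per team" idea can be sketched without any Hibernate classes. In the sketch below the type parameter F stands in for the SessionFactory, and PerTeamRegistry, teamFromHost, and the convention that the team is the first label of the host name are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Caches one expensive factory object per team, built lazily on first use.
// F is a stand-in for whatever is costly to create (for example a SessionFactory).
final class PerTeamRegistry<F> {
    private final Map<String, F> factories = new ConcurrentHashMap<>();
    private final Function<String, F> builder;

    PerTeamRegistry(Function<String, F> builder) {
        this.builder = builder;
    }

    // Returns the cached factory for the team, building it at most once.
    F forTeam(String team) {
        return factories.computeIfAbsent(team, builder);
    }

    // Assumed URL convention: "team1.example.com" -> "team1".
    static String teamFromHost(String host) {
        int dot = host.indexOf('.');
        return dot < 0 ? host : host.substring(0, dot);
    }
}
```

In a real application the builder function would be the place where the configuration for that team is loaded; every later login for the same team then reuses the cached object instead of creating a new one.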
The idea of using different tables for each team (whatever successful apps may use it) is honestly quite naïve, and has serious pitfalls when you take maintenance into account...
Just think what you will be forced to do if you discover you need a new table or even just an index... you'll end up needing to write DDL scripts as templates and to use some (custom) software to run them on all the teams...
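To make that maintenance cost concrete, here is a rough sketch of the kind of custom tooling this implies: a schema-change statement written once as a template and expanded for every team. The class name PerTeamScripts and the "{team}" placeholder are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class PerTeamScripts {
    // Expands one statement template into one statement per team.
    // "{team}" is a placeholder convention assumed for this sketch.
    static List<String> expand(String template, List<String> teams) {
        List<String> statements = new ArrayList<>();
        for (String team : teams) {
            statements.add(template.replace("{team}", team));
        }
        return statements;
    }
}
```

Every expanded statement still has to be executed, checked and possibly rolled back per team, which is the overhead a single shared table set avoids.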
As mentioned in the other answers (Quaternion's and Octav's), I think you have two viable options:
Bring the "team" into your data model
Split the data in different databases/schemas
To choose the option that works best for you, you must decide if the "team" is really something you can partition your dataset into, or if it is really one more entity you want to bring into your datamodel.
You may have noticed that I'm using "splitting" here instead of "partitioning" - that's because the latter term is generally used by DBAs to indicate what we could call "sharding" - "splitting" is intended to be a stronger term.
Splitting is only viable if:
entities in different partitions do not ever need to reference each other
no query will ever need to access data from different partitions (this applies to queries used for reporting too)
As you might well see, splitting in this sense is not very attractive (maybe it is OK now, but what about when you find yourself wanting to add new features?), so my advice is to go for the "the Team is an entity" solution.
Also note that maintaining a set of databases/schemas is actually harder than maintaining a single (albeit maybe a bit more complex) database... again, think of what steps you should take to add an index in a production system...
The only downside of the single-database solution manifests if you end up having multiple front-ends (maybe due to customizations for particular customers): changes to a shared database have the potential to affect all the applications using it, so you may need to coordinate upgrades to the different webapps to minimize risks (note, however, that in most cases you'll be able to change the database without breaking compatibility).
After all, it's a little frustrating to get no information and just shoot in the dark. Nevertheless, now that I have started the work, I will try to finish it.
I think you could do the job with the following solution:
Write a PlayPlugin and make sure it adds the team to the request args on every request. Then write your own NamingStrategy. In the NamingStrategy you can read request.args and put the team into the table name. Depending on whether you add it as Team_ or Team., you get either your preferred prefix solution or a schema-based one. It sounds as though you already have a DB schema, so it would probably be best to stay with these tables and not migrate.
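The name computation described above might look roughly like this in plain Java. Nothing here uses the real Play or Hibernate APIs: CurrentTeam and TeamTableNames are hypothetical helpers, with a ThreadLocal standing in for the request args. Note that Hibernate typically resolves table names when the mapping is built rather than on every query, so this only illustrates how the two naming styles differ, not a working integration.

```java
// Holds the team for the request being handled on the current thread.
// A stand-in for reading the team from the request args.
final class CurrentTeam {
    private static final ThreadLocal<String> TEAM = new ThreadLocal<>();

    static void set(String team) { TEAM.set(team); }
    static String get() { return TEAM.get(); }
    static void clear() { TEAM.remove(); }
}

final class TeamTableNames {
    // Prefix style: team "team1", table "player" -> "team1_player"
    static String prefixed(String logicalTable) {
        return requireTeam() + "_" + logicalTable;
    }

    // Schema style: team "team1", table "player" -> "team1.player"
    static String schemaQualified(String logicalTable) {
        return requireTeam() + "." + logicalTable;
    }

    private static String requireTeam() {
        String team = CurrentTeam.get();
        if (team == null) {
            throw new IllegalStateException("no team bound to the current request");
        }
        return team;
    }
}
```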
Next time, please make your question more abstract and provide some information, such as how many tables there are, whether team is an entity, and how many records a table has (max, avg, min). How stable is your table model? These are all questions whose answers help in giving a clear recommendation with arguments.
You can try the vhost module, but it does not seem to be well maintained. I think the idea of putting the name of the team into the table name is really weird, though. Postgres and Oracle have schemas for that, so you would use myTeam.myTable. But then you must do the persistence yourself.
Another approach would be different databases, but again you don't get good support from Play. I would try one of these:
Run a separate Play server for each team, if you don't have too many teams.
Put a reference to a Team table in every model. Then you can use Hibernate filters or add the team manually as an additional parameter to each query. Of course this costs some performance. You can address that with Oracle partitions.

Where do you put your dictionary data?

Let's say I have a set of Countries in my application. I expect this data to change, but not very often. In other words, I do not look at this set as operational data (I would not provide CRUD operations for Country, for example).
That said I have to store this data somewhere. I see two ways to do that:
Database driven. Create and populate a Country table. Provide some sort of DAO to access it (findById()?). This way client code will have to know the Id of a country (which could also be a name or ISO code). On the application side I will have a class Country.
Application driven. Create an Enum where I can list all the Countries known to my system. It will be stored in the DB as well, but the difference is that client code no longer needs a lookup method (findById, findByName, etc.) or hardcoded Ids, names or ISO codes. It will reference a particular country directly.
I lean towards second solution for several reasons. How do you do this?
Is it correct to call this 'dictionary data'?
Addendum: One of the main problems here is that if I have a lookup method like findByName("Czechoslovakia") then after 1992 this will return nothing. I do not know how the client code will react to that (after all, it sort of expects to always get the Country back because, well, it is dictionary data). It gets even worse if I have something like findById(ID_CZ). It will be really hard to find all these dependencies.
If I remove Country.Czechoslovakia from my enum, I will force myself to take care of any dependency on Czechoslovakia.
In some applications I've worked on there has been a single 'Enum' table in the database that contained all of this type of data. It simply consisted of two columns: EnumName and Value, and would be populated like this:
"Country", "Germany"
"Country", "United Kingdom"
"Country", "United States"
"Fruit", "Apple"
"Fruit", "Banana"
"Fruit", "Orange"
This was then read in and cached at the beginning of the application execution. The advantages being that we weren't using dozens of database tables for each distinct enumeration type; and we didn't have to recompile anything if we needed to alter the data.
This could easily be extended to include extra columns, e.g. to specify a default sort order or alternative IDs.
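A rough sketch of that start-up cache, assuming the rows have already been read from the two-column table. The class EnumTableCache and its methods are hypothetical names; how the rows are actually fetched is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EnumTableCache {
    private final Map<String, List<String>> valuesByName = new LinkedHashMap<>();

    // Each row is {EnumName, Value}, in the order it came back from the table.
    EnumTableCache(List<String[]> rows) {
        for (String[] row : rows) {
            valuesByName.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
    }

    // Returns the cached values for one enumeration, or an empty list if unknown.
    List<String> values(String enumName) {
        List<String> values = valuesByName.get(enumName);
        if (values == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(values);
    }
}
```

Extra columns such as a sort order would turn the String value into a small value class, but the grouping by EnumName stays the same.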
This won't help you, but it depends...
-What are you going to do with those countries?
Will you store them in other tables in the DB / what will happen to existing data if you add new countries / will other applications access that data?
-Are you going to translate the country names into several languages?
-Will the business logic of your application depend on the chosen country?
-Do you need a Country class?
etc...
Without more information I would start with an Enum with a few countries and refactor depending on my needs...
If it's not going to change very often and you can afford to bring the application down to apply updates, I'd place it in a Java enumeration and write my own methods for findById(), findByName() and so on.
Advantages:
Fast - no DB access for invariant data (or caching requirement);
Simple;
Plays nice with refactoring tools.
Disadvantages:
Need to bring down the application to update.
If you place the data in its own jarfile, updating is as simple as updating the jar and restarting the application.
The hardcoding concern can be made to go away either by consumers storing a value of the enumeration itself, or by referencing the ISO code which is unlikely to change for countries...
If you're worried about keeping this enumeration "in synch" with the database, write an integration test that checks exactly that and run it regularly (eg: on your CI machine).
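As an illustration of the enum approach, here is a small sketch with hand-written lookup methods. The three constants and the field names are examples only; the ISO codes shown are the standard two-letter codes for those countries.

```java
import java.util.Optional;

enum Country {
    GERMANY("DE", "Germany"),
    UNITED_KINGDOM("GB", "United Kingdom"),
    UNITED_STATES("US", "United States");

    private final String isoCode;
    private final String displayName;

    Country(String isoCode, String displayName) {
        this.isoCode = isoCode;
        this.displayName = displayName;
    }

    String isoCode() { return isoCode; }
    String displayName() { return displayName; }

    // Lookup by ISO code, ignoring case; empty if the code is unknown.
    static Optional<Country> findByIsoCode(String isoCode) {
        for (Country c : values()) {
            if (c.isoCode.equalsIgnoreCase(isoCode)) {
                return Optional.of(c);
            }
        }
        return Optional.empty();
    }

    // Lookup by display name; empty if no such country is listed.
    static Optional<Country> findByName(String name) {
        for (Country c : values()) {
            if (c.displayName.equals(name)) {
                return Optional.of(c);
            }
        }
        return Optional.empty();
    }
}
```

Code that refers to Country.GERMANY directly stops compiling if the constant is removed, while callers of the lookup methods have to handle the empty Optional explicitly.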
Personally, I've always gone for the database approach, mostly because I'm already storing other information in the database so writing another DAO is easy.
But another approach might be to store it in a properties file in the jar? I've never done it that way in Java, but it seems to be common in iPhone development (something I'm currently learning).
I'd probably have a text file embedded into my jar. I'd load it into memory on start-up (or on first use.) At that point:
It's easy to change (even by someone with no programming knowledge)
It's easy to update even without full redeployment - put just the text file somewhere on the class path
No database access required
EDIT: Okay, if you need to refer to the particular country data from code, then either:
Use the enum approach, which will always mean redeployment
Use the above approach, but keep an enum of country IDs and then have a unit test to make sure that each ID is mapped in the text file. That means you could change the rest of the data without redeployment, and a non-technical person can still update the data without seeing scary code everywhere.
Ultimately it's a case of balancing pros and cons - if the advantages above aren't relevant for you (e.g. there'll always be a coder on hand, and deployment isn't an issue) then an enum makes sense.
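The second option in the EDIT above (a text file plus an enum of IDs, guarded by a unit test) could be sketched like this. CountryId, CountryFile and the "ID=Display name" file format are assumptions for the example; the text is parsed with java.util.Properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// The IDs that code is allowed to reference directly.
enum CountryId { DE, GB, US }

final class CountryFile {
    // Parses text in the assumed "ID=Display name" format, one entry per line.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns every enum ID that has no entry in the file; a unit test
    // would assert that this list is empty.
    static List<CountryId> missingIds(Properties props) {
        List<CountryId> missing = new ArrayList<>();
        for (CountryId id : CountryId.values()) {
            if (props.getProperty(id.name()) == null) {
                missing.add(id);
            }
        }
        return missing;
    }
}
```

The display names can then be edited in the text file without touching code, and the test fails as soon as an ID referenced from code disappears from the file.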
One of the advantages of using a database table is you can put foreign key constraints in. That way your referential integrity will always be intact. No need to run integration tests as DanVinton suggested for enums, it will never get out of sync.
I also wouldn't try making a general enum table as saw-lau suggested, mainly because you lose clean foreign key constraints, which is the main advantage of having them in the DB in the first place (might as well stick them in a text file). Databases are good at handling lots of tables. Prefix the table names with "ENUM_" if you want to distinguish them in some fashion.
The app can always load them into a Map at start-up time or when triggered by a reload event.
EDIT: From comments, "Of course I will use foreign key constraints in my DB. But it can be done with or without using enums on app side"
Ah, I missed that bit while reading the second bullet point in your question. However I still say it is better to load them into a Map, mainly based on DRY. Otherwise, when whoever has to maintain it comes to add a new country, they're surely going to update in one place but not the other, and be scratching their heads until they figure out that they needed to update it in two different places. A case of premature optimisation. The performance benefit would be minimal, at the cost of less maintainable code, IMHO.
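Loading the table into a Map at start-up and again on a reload event might look like the following sketch. ReloadableLookup is a hypothetical name, and the Supplier stands in for whatever DAO call actually reads the country table.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

final class ReloadableLookup {
    // Stand-in for a DAO call that reads ISO code -> country name from the DB.
    private final Supplier<Map<String, String>> loader;
    private volatile Map<String, String> snapshot;

    ReloadableLookup(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        reload(); // initial load at start-up
    }

    // Copies the current table contents and swaps the copy in with one write,
    // so readers see either the old snapshot or the new one.
    void reload() {
        snapshot = Collections.unmodifiableMap(new LinkedHashMap<>(loader.get()));
    }

    // Returns the cached name, or null if the code is not in the snapshot.
    String nameFor(String isoCode) {
        return snapshot.get(isoCode);
    }
}
```

The database stays the single place where countries are maintained; the application only holds a copy that can be refreshed.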
I'd start off doing the easiest thing possible - an enum. When it comes to the point that countries change almost as frequently as my code, then I'd make the table external so that it can be updated without a rebuild. But note when you make it external you add a whole can of UI, testing and documentation worms.
