How to test entities which have not null foreign key constraints

I have a dozen tables such as Product, Category, Customer, Order, ... with not-null relationships to one another. I have created the ORM mapping and am now in the middle of writing tests.
However, I find it pretty tedious: in order to test, for example, the Order entity (namely, to perform a persist operation), I have to create a Customer instance as well, because an Order has to belong to a Customer. Let's go further: because an Order cannot exist without a Product, I have to create a Product and add it to the Order. The Product has to be in some Category, and so on. So there is a chain of mandatory relationships, which makes testing an individual entity very difficult.
A natural solution would be to defer constraint checking until commit (Oracle):
alter session set constraints=deferred;
However, I have read that Hibernate does not support deferred constraints.
Does this mean that persistence testing has to be this painful, or can I do it better or differently?
I believe that database constraints are sacred, so giving them up because Hibernate does not support deferring them sounds bad.

You can create factories for test purposes. Their job is to create a valid graph of objects. Such a class can be used in any kind of testing (not only for the repositories):
ProductFactory.withType(...).withOtherImportantOption(...).create()
Such factory could leverage randomization if you wish, but it's not mandatory. Though some amount of randomization will have to be introduced for fields with unique constraints.
PS: not satisfying FK seems like a wrong path - you'll eventually need tests that persist many objects, maybe even commit transactions. And you may also want to test cascades.
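A minimal sketch of such a factory in plain Java might look like the following. The Category and Product classes, the sku field and the aProduct/withName/inCategory method names are all assumptions made for illustration; real entities would carry the ORM mapping annotations.

```java
import java.util.UUID;

// Hypothetical domain classes standing in for mapped entities.
class Category {
    final String name;

    Category(String name) { this.name = name; }
}

class Product {
    final String sku;        // assumed to carry a unique constraint
    final String name;
    final Category category; // assumed to be a NOT NULL foreign key

    Product(String sku, String name, Category category) {
        if (category == null) {
            throw new IllegalArgumentException("category is mandatory");
        }
        this.sku = sku;
        this.name = name;
        this.category = category;
    }
}

// Test-only factory: every mandatory reference gets a valid default,
// so a test overrides only the parts it cares about.
class ProductFactory {
    private String name = "test-product";
    private Category category = new Category("test-category");

    static ProductFactory aProduct() { return new ProductFactory(); }

    ProductFactory withName(String name) { this.name = name; return this; }

    ProductFactory inCategory(Category category) { this.category = category; return this; }

    Product create() {
        // Random SKU so repeated calls do not collide on the unique column.
        String sku = "SKU-" + UUID.randomUUID();
        return new Product(sku, name, category);
    }
}
```

A test that only cares about the name would then call ProductFactory.aProduct().withName("book").create() and get a Product whose Category is already valid.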

Why not just create your test database prepopulated with a bunch of test data?
You can define test data in a file named import.sql, or specified via javax.persistence.sql-load-script-source, and it will be automatically loaded by the schema export tool.
UPDATE: To be clear, you only need a couple of rows of test data in each table, that you'll use in order to set up valid references from the objects you're testing.
For example, if you want to test persisting a Book, you would write:
Book b = new Book();
b.setTitle("Feersum Endjin");
b.setAuthor( em.getReference(Author.class, AUTHOR_ID) );
em.persist(b);
Where AUTHOR_ID is the id of a row of test data.
This is a lot better, IMO, than writing tests that work with objects in an inconsistent state (i.e. with null attributes) and test data that violates the database constraints.

If you're using JPA, then your repositories are tested automatically. If you really need to, then have a look here.
Hope this answers your question.

Related

Should a DAO (or perhaps a repository) take IDs or entities as arguments?

In our code base we make extensive use of DAOs: in essence, a layer that exposes a low-level read/write API, where each DAO maps to a table in the database.
My question is: should the DAO's update methods take entity IDs or entity references as arguments when we have different kinds of updates on an entity?
For example, say we have customers and addresses. We could have
customer.address = newAddress;
customerDao.updateCustomerAddress(customer);
or we could have
customerDao.updateCustomerAddress(customer.getId(), newAddress);
Which approach would you say is better?
The latter is more convenient: if we have the entity, we always have the ID, so it always works. The converse does not hold: with only the ID in hand, the first approach would have to be preceded by fetching the entity before performing the update.

In DDD we have Aggregates and Repositories. Aggregates ensure that the business invariants hold and Repositories handle the persistence.
I recommend that Aggregates should be pure, with no dependencies on any infrastructure code; that is, Aggregates should not know anything about persistence.
Also, you should use the Ubiquitous language in your domain code. That being said, your code should look like this (in the application layer):
customer = customerRepository.loadById(customerId);
customer.changeAddress(address);
customerRepository.save(customer);
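To make the three lines above concrete, here is a rough, self-contained sketch. The in-memory repository and the ChangeAddressService class are assumptions for illustration only; a real repository would wrap JPA or JDBC, and the address would likely be a value object rather than a String.

```java
import java.util.HashMap;
import java.util.Map;

// Aggregate: guards its own invariants and knows nothing about persistence.
class Customer {
    private final long id;
    private String address;

    Customer(long id, String address) {
        this.id = id;
        this.address = address;
    }

    long getId() { return id; }

    String getAddress() { return address; }

    void changeAddress(String newAddress) {
        if (newAddress == null || newAddress.trim().isEmpty()) {
            throw new IllegalArgumentException("address must not be blank");
        }
        this.address = newAddress;
    }
}

// Repository contract seen by the application layer.
interface CustomerRepository {
    Customer loadById(long id);

    void save(Customer customer);
}

// In-memory stand-in (an assumption, not a real persistence implementation).
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, Customer> rows = new HashMap<>();

    public Customer loadById(long id) {
        Customer customer = rows.get(id);
        if (customer == null) {
            throw new IllegalStateException("no customer " + id);
        }
        return customer;
    }

    public void save(Customer customer) {
        rows.put(customer.getId(), customer);
    }
}

// Application-layer use case: load, ask the aggregate to change, save.
class ChangeAddressService {
    private final CustomerRepository repository;

    ChangeAddressService(CustomerRepository repository) {
        this.repository = repository;
    }

    void changeAddress(long customerId, String address) {
        Customer customer = repository.loadById(customerId);
        customer.changeAddress(address);
        repository.save(customer);
    }
}
```

With this shape the caller passes only the ID and the new address, and the validation lives in the aggregate rather than in the DAO.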

I assume your question is
Which approach of the two is better?
I would prefer the second approach. It states clearly what will be done: the object to update will be freshly loaded, and it is absolutely clear that only the address will be updated. The first approach leaves room for doubt. What happens if customer.name has a new value as well? Will it also be updated?

How would I audit the changes to a list of JPA entities?

I've got two lists of entities: One that is the current state of the rows in the DB, the other is the changes that were made to the list. How do I audit the rows that were deleted, added, and the changes made to the entities? My audit table is used by all the entities.
Entity listeners and Callback methods look like a perfect fit, until you notice the sentence that says: A callback method must not invoke EntityManager or Query methods! Because of this restriction, I can collect audits, but I can't persist them to the database :(
My solution has been a complex algorithm to discover the audits.
If the entity is in the change list and has no key, it's an add
If the entity is in the db but not the changes list, it's a delete
If the entity is in both lists, recursively compare their fields to find differences to audit (if any)
I collect these and insert them into the DB in the same transaction I merge the changes list. But I hate the fact that I'm writing this by hand. It seems like JPA should be able to do this logic for me.
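For illustration, the three rules above can be sketched in plain Java as follows. The Row class and the string audit format are hypothetical, and the field comparison is reduced to a single value field instead of the recursive comparison described in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical entity: a null id means "not yet persisted".
class Row {
    final Long id;
    final String value;

    Row(Long id, String value) {
        this.id = id;
        this.value = value;
    }
}

class AuditDiff {
    // Produces audit lines of the form "ADD v", "CHANGE id: old -> new", "DELETE id".
    static List<String> diff(List<Row> inDb, List<Row> changes) {
        List<String> audits = new ArrayList<>();

        // Index the current DB state by key; entries are removed as they are matched.
        Map<Long, Row> remaining = new HashMap<>();
        for (Row row : inDb) {
            remaining.put(row.id, row);
        }

        for (Row changed : changes) {
            if (changed.id == null) {
                // In the change list with no key: an add.
                audits.add("ADD " + changed.value);
                continue;
            }
            Row current = remaining.remove(changed.id);
            // In both lists: audit only if something actually differs.
            if (current != null && !Objects.equals(current.value, changed.value)) {
                audits.add("CHANGE " + changed.id + ": " + current.value + " -> " + changed.value);
            }
        }

        // Still unmatched: in the DB but not in the change list, so a delete.
        for (Row gone : remaining.values()) {
            audits.add("DELETE " + gone.id);
        }
        return audits;
    }
}
```

The resulting audit records could then be persisted in the same transaction that merges the change list, as described above.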
One solution we've come up with is to use an Entity Listener that posts the audits to a JMS queue. The queue then inserts the audits into the database. But I don't like this solution because I think setting up a JMS queue is a pain. It's currently the best solution we've got though.
I'm using EclipseLink (ideally, that's not relevant) and have found these two things that look helpful, but the JMS queue is a better solution than either of them:
http://wiki.eclipse.org/EclipseLink/FAQ/JPA#How_to_access_what_changed_in_an_object_or_transaction.3F This looks really difficult to use. You search for the fields by a string. So if I refactor my entity and forget to update this, it'll throw a runtime error.
http://wiki.eclipse.org/EclipseLink/Examples/JPA/History This isn't consistent with the way we currently audit. It expects a special entity_history table.

The EntityListener looks like a good approach since you are able to collect the audit information.
Have you tried persisting the information in a different transaction than the one persisting the changes? Perhaps by obtaining a reference to a stateless EJB (assuming you are using EJBs) and using methods marked with @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW). In this way the transaction persisting the original changes is put on hold while the audit transaction completes. Note that you will not be able to access the updated information in this separate audit transaction, since the original one has not committed yet.

JPA merge in a RESTful web application with DTOs and Optimistic Locking?

My question is this: Is there ever a role for JPA merge in a stateless web application?
There is a lot of discussion on SO about the merge operation in JPA. There is also a great article on the subject which contrasts JPA merge with a more manual Do-It-Yourself process (where you find the entity via the entity manager and make your changes).
My application has a rich domain model (à la domain-driven design) that uses the @Version annotation in order to make use of optimistic locking. We have also created DTOs to send over the wire as part of our RESTful web services. The creation of this DTO layer also allows us to send to the client everything it needs and nothing it doesn't.
So far, I understand this is a fairly typical architecture. My question is about the service methods that need to UPDATE (i.e. HTTP PUT) existing objects. In this case we have these two approaches 1) JPA Merge, and 2) DIY.
What I don't understand is how JPA merge can even be considered an option for handling updates. Here's my thinking and I am wondering if there is something I don't understand:
1) In order to properly create a detached JPA entity from a wire DTO, the version number must be set correctly...else an OptimisticLockException is thrown. But the JPA spec says:
An entity may access the state of its version field or property or
export a method for use by the application to access the version, but
must not modify the version value. Only the persistence provider
is permitted to set or update the value of the version attribute in
the object.
2) Merge doesn't handle bi-directional relationships ... the back-pointing fields always end up as null.
3) If any fields or data is missing from the DTO (due to a partial update), then the JPA merge will delete those relationships or null-out those fields. Hibernate can handle partial updates, but not JPA merge. DIY can handle partial updates.
4) The first thing the merge method will do is query the database for the entity ID, so there is no performance benefit over DIY to be had.
5) In a DIY update, we load the entity and make the changes according to the DTO -- there is no call to merge or to persist for that matter because the JPA context implements the unit-of-work pattern out of the box.
Do I have this straight?
Edit:
6) Merge behavior with regards to lazy loaded relationships can differ amongst providers.

Using Merge does require you to either send and receive a complete representation of the entity, or maintain server side state. For trivial CRUD-y type operations, it is easy and convenient. I have used it plenty in stateless web apps where there is no meaningful security hazard to letting the client see the entire entity.
However, if you've already reduced operations to only passing the immediately relevant information, then you need to also manually write the corresponding services.
Just remember that when doing your 'DIY' update you still need to pass a Version number around on the DTO and manually compare it to the one that comes out of the database. Otherwise you don't get the Optimistic Locking that spans 'user think-time' that you would have if you were using the simpler approach with merge.
1) You can't change the version on an entity created by the provider, but when you have made your own instance of the entity class with the new keyword, it is fine and expected to set the version on it.
2) It will make the persistent representation match the in-memory representation you provide; this can include making things null. Remember that when an object is merged, that object is supposed to be discarded and replaced with the one returned by merge. You are not supposed to merge an object and then continue using it. Its state is not defined by the spec.
3) True.
4) Most likely, as long as your DIY solution is also using the entity ID and not an arbitrary query. (There are other benefits to using the 'find' method over a query.)
5) True.
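As a rough sketch of the DIY update with a version number carried on the DTO, the manual comparison could look like this. Account, AccountDto, AccountUpdater and StaleUpdateException are hypothetical names, and treating a null DTO field as "not sent" is an assumed convention, not something JPA defines.

```java
// Hypothetical entity; under JPA the version field would be annotated with @Version.
class Account {
    final long id;
    long version;
    String email;
    String displayName;

    Account(long id, long version, String email, String displayName) {
        this.id = id;
        this.version = version;
        this.email = email;
        this.displayName = displayName;
    }
}

// Hypothetical wire DTO. It echoes the version the client originally read.
class AccountDto {
    final long id;
    final long version;
    final String email;       // null means "not sent" (assumed convention)
    final String displayName; // null means "not sent" (assumed convention)

    AccountDto(long id, long version, String email, String displayName) {
        this.id = id;
        this.version = version;
        this.email = email;
        this.displayName = displayName;
    }
}

class StaleUpdateException extends RuntimeException {
    StaleUpdateException(String message) {
        super(message);
    }
}

class AccountUpdater {
    // 'managed' stands for the instance a provider would return from a find by ID.
    static void apply(Account managed, AccountDto dto) {
        // Manual check that spans user think-time: the DTO must carry
        // the same version the database row currently has.
        if (managed.version != dto.version) {
            throw new StaleUpdateException(
                "row is at version " + managed.version + " but the DTO carries " + dto.version);
        }
        // Partial update: only fields that were actually sent are copied.
        if (dto.email != null) {
            managed.email = dto.email;
        }
        if (dto.displayName != null) {
            managed.displayName = dto.displayName;
        }
        // No merge or persist call here: with a managed entity the
        // persistence context would flush the change at commit.
    }
}
```

In a real service the stale case would typically be translated into an HTTP 409 or 412 response, but that mapping is a design choice outside this sketch.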

I would add:
7) Merge translates to an insert or an update depending on whether the record exists in the DB, hence it does not deal correctly with update-vs-delete optimistic concurrency. That is, if another user concurrently deletes the record and you update it, it must (1) throw a concurrency exception... but it does not; it just inserts the record as a new one.
(1) At least, in most cases, in my opinion, it should. I can imagine some cases where I would want this use case to trigger a new insert, but they are far from usual. At least, I would like the developer to think twice about it, not just accept that "merge() == updateWithConcurrencyControl()", because it is not.

How to maintain/generate tables in Hibernate for multi-user purpose?

I'm working on a project using Play Framework that requires me to create a multi-user application. I've a central panel where we add a certain workshop for a team. Thing is, I don't know if this is the best way, but I want to generate the tables like
team1_tablename
team1_secondtable..
Then when a certain request arrives via the virtual host (e.g. http://teamawesome.workshop.com), I would need to route the query to THAT team's tables.
The problem is not generating the tables, but working with the models. All the workshops are going to have the same generic tables. In the model I would have to state the table name, etc. If this were PHP with Doctrine, I would have a template and create the models from it after creating the workshop team1; in Java, even if I generate them, I would also have to compile them, which requires more research on my part.
My question is more Hibernate oriented before jumping the gun here and giving up on possible solutions. I'm all ears
I've thought of using NamedQueries, I don't know if I misread but I read in a hibernate book that you could query then add the result to a generic model so then I use that model to retain all my results...
If there are any doubts let me know, thanks (note this is not a multi database question, just using different sets of tables with unique prefixes)

I wonder if you could use one single set of tables, but have something like TEAM_ID as a foreign key in each table.
You would need one single TEAM table, where TEAM_ID will be the primary key. This will get migrated to tables and become part of foreign keys.
For instance, if you have a Player entity with a collection of HighScores, then in the DB the Player table will have a TEAM_ID (foreign key from the Team table) and the HighScores table will have a composite foreign key (Player_id, Team_id) coming from the Player table.
So, bottom line, I am suggesting a logical partitioning of your database rather than a physical one (as you've considered initially).
Hope this makes sense, it definitely needs more thought, but if you think it's an interesting idea, I can think it through in more detail.

I am familiar with Hibernate and another web framework, here is how I would handle it:
I would create a single set of tables for one team that would address all my needs. Then I would:
Using DB2: Create a schema for each team copying the set of tables into each schema.
Using MySQL: Create a new Database for each copying the set of tables into each one.
Note: A 'database' in MySQL is more like a schema in other databases. (Sorry, I'd rather keep things too simple than miss the point.)
Now you can set up a separate hibernate.cfg.xml file for each connection (this isn't exactly the best way, but it is perhaps the best place to start because it's so easy). Now you can specify the connection parameters, including the schema/db. Now your entity, let's say it's called "team", will use the "team" table wherever it is connected...
To get started very quickly, when a user logs on create a user object in their session.
The user object will have a Hibernate SessionFactory which will be used for all database requests built from the correct hibernate.cfg.xml file as determined by parsing the URL used in the login.
Once the above is working... There are some serious efficiency concerns to address. That being that each logged on user is creating a SessionFactory... Maybe it isn't an issue if there isn't a lot of concurrent use but you probably want to look into Spring at that point and use a connection pool per team. This way there is one Session factory per team and there is no major object creation when a user signs in.
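The "one factory per team" idea can be sketched without any framework as a small cache keyed by the team name taken from the host. TeamRegistry and teamFromHost are hypothetical names, and the generic type F merely stands in for whatever expensive per-team object is built (a Hibernate SessionFactory in this answer); how that object is actually built from a configuration file is left to the caller.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Caches one expensive per-team object instead of creating one per logged-in user.
class TeamRegistry<F> {
    private final Map<String, F> factories = new ConcurrentHashMap<>();
    private final Function<String, F> builder;

    // 'builder' creates the per-team object, e.g. from a team-specific config file.
    TeamRegistry(Function<String, F> builder) {
        this.builder = builder;
    }

    // Assumes the team is the first label of the host:
    // "teamawesome.workshop.com" -> "teamawesome".
    static String teamFromHost(String host) {
        int dot = host.indexOf('.');
        String label = dot < 0 ? host : host.substring(0, dot);
        return label.toLowerCase(Locale.ROOT);
    }

    // Builds the object on first use for a team and reuses it afterwards.
    F forHost(String host) {
        return factories.computeIfAbsent(teamFromHost(host), builder);
    }
}
```

Each request would call forHost with its Host header value, so signing in no longer triggers any major object creation after the first request for a team.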
The benefits of this solution are that it should be easier to create new sets of tables, because each table set lives in its own world. There will only be one set of entity classes, as opposed to one for every combination of team and table. The database schema stays rather simple, not being complicated by adding team names and then the required constraints. If the teams require data ownership and privacy, it will be rather easy to move the database to a different location.
The down side is that if the model needs to be changed for a team it must be done for each team (as opposed to a single table set using teamName as a foreign key).

The idea of using different tables for each team (despite what successful apps may use it) is honestly quite naïve, and has serious pitfalls when you take maintenance into account...
Just think what you will be forced to do if you discover you need a new table or even just an index... you'll end up needing to write DDL scripts as templates and to use some (custom) software to run them on all the teams...
As mentioned in the other answers (Quaternion's and Octav's), I think you have two viable options:
Bring the "team" into your data model
Split the data in different databases/schemas
To choose the option that works best for you, you must decide if the "team" is really something you can partition your dataset into, or if it is really one more entity you want to bring into your datamodel.
You may have noticed that I'm using "splitting" here instead of "partitioning" - that's because the latter term is generally used by DBAs to indicate what we could call "sharding" - "splitting" is intended to be a stronger term.
Splitting is only viable if:
entities in different partitions do not ever need to reference each other
no query will ever need to access data from different partitions (this applies to queries used for reporting too)
As you might well see, splitting in this sense is not very attractive (maybe it could be ok now, but what when you find yourself wanting to add new features?), so my advice is to go for the "the Team is an entity" solution.
Also note that maintaining a set of databases/schemas is actually harder than maintaining a single (albeit maybe a bit more complex) database... again, think of what steps you should take to add an index in a production system...
The only downside of the single-database solution manifests if you end up having multiple front-ends (maybe due to customizations for particular customers): changes to a shared database have the potential to affect all the applications using it, so you may need to coordinate upgrades to the different webapps to minimize risks (note, however, that in most cases you'll be able to change the database without breaking compatibility).

It is a little frustrating to get no feedback and just shoot in the dark. Nevertheless, now that I have started the work, I will try to finish it.
I think you could do the job with the following solution:
Write a PlayPlugin and make sure it adds the team to the request args on every request. Then write your own NamingStrategy. In the NamingStrategy you can read request.args and put the team into the table name. Depending on how you add it (Team_ or Team.), you get either your preferred solution or a schema-based one. It sounds as though you already have a database schema, so it would probably be best to stay with these tables and not migrate.
Next time, please include more information in your request, such as how many tables there are, whether team is an entity, and how many records a table has (max, avg, min). How stable is your table model? These are all questions that help to give a clear recommendation with arguments.

You can try the vhost module, but it does not seem to be very well maintained. But I think the idea of putting the name of the team into the table name is really weird. Postgres and Oracle have schemas for that, so you would use myTeam.myTable. But then you must handle the persistence yourself.
Another approach would be separate databases, but again you don't get good support from Play. I would try one of these:
Run a separate Play server for each team, if you don't have too many teams.
Put a reference to a Team table into every model. Then you can use Hibernate filters or add it manually as an additional parameter to each query. Of course, this costs some performance. You can address that with Oracle partitions.

Hiding deleted objects

I have the following use case: There's a class called Template and with that class I can create instances of the ActualObject class (ActualObject copies its initial data from the Template). The Template class has a list of Products.
Now here comes the tricky part, the user should be able to delete Products from the database but these deletions may not affect the content of a Template. In other words, even if a Product is deleted, the Template should still have access to it. This could be solved by adding a flag "deleted" to the Product. If a Product is deleted, then it may not be searched explicitly from the database, but it can be fetched implicitly (for example via the reference in the Template class).
The idea behind this is that when an ActualObject is created from a template, the user is notified in the user interface that "The Template X had a Product Z with the parameters A, B and C, but this product has been deleted and cannot be added as such in ActualObject Z".
My problem is how I should mark these deleted objects as deleted. Before someone suggests just updating the delete flag instead of doing an actual delete query: my problem is not that simple. The delete flag and its behaviour should exist in all POJOs, not just in Product. This means I'll be getting cascade problems. For example, if I delete a Template, then the Products should also be deleted, each Product has a reference to a Price object which should also be deleted, each Price may have a reference to a VAT object, and so forth. All these cascaded objects should be marked as deleted.
My question is how can I accomplish this in a sensible manner. Going through every object (which are being deleted) checking each field for references which should be deleted, going through their references etc is quite laborious and bugs are easy to slip in.
I'm using Hibernate, and I was wondering whether Hibernate has any such built-in feature. Another idea I came up with was to use Hibernate interceptors to turn an actual SQL delete query into an update query (I'm not even 100% sure this is possible). My only concern is whether Hibernate relies on cascades in foreign keys, in other words, whether the cascaded deletes are done by the database and not by Hibernate.

My problem is how I should mark these
deleted objects as deleted.
I think you have chosen a very complex way to solve the task. It would be easier to introduce a ProductTemplate. Put all the properties you need into this object, along with a reference to a Product instance. Then, instead of marking a Product, you can just delete it (and delete all other entities, such as prices). And, of course, you should clear the reference in ProductTemplate. When you create an instance of ActualObject, you will be able to notify the user with an appropriate message.

I think you're trying to make things much more complicated than they should be... anyway, what you're trying to do is handling Hibernate events, take a look at Chapter 12 of Hibernate Reference, you can choose to use interceptors or the event system.
In any case... well good luck :)

public interface Deletable {
    public void delete();
}
Have all your deletable objects implement this interface. In their implementations, update the deleted flag and have them call their children's delete() method also - which implies that the children must be Deletable too.
Of course, upon implementation you'll have to manually figure which children are Deletable. But this should be straightforward, at least.
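A possible shape for this, sketched with the classes from the question, is shown below. The SoftDeletable base class, the children() hook and the isDeleted() accessor are assumptions added for illustration; they are not part of the interface as posted.

```java
import java.util.ArrayList;
import java.util.List;

interface Deletable {
    void delete();

    boolean isDeleted(); // added for this sketch so the flag can be inspected
}

// Hypothetical base class: holds the flag and performs the cascade once.
abstract class SoftDeletable implements Deletable {
    private boolean deleted;

    // Subclasses list the children that must be marked along with them.
    protected List<Deletable> children() {
        return new ArrayList<>();
    }

    public void delete() {
        if (deleted) {
            return; // already marked; also stops cycles in the object graph
        }
        deleted = true;
        for (Deletable child : children()) {
            child.delete();
        }
    }

    public boolean isDeleted() {
        return deleted;
    }
}

class Price extends SoftDeletable {
}

class Product extends SoftDeletable {
    final Price price = new Price();

    @Override
    protected List<Deletable> children() {
        List<Deletable> result = new ArrayList<>();
        result.add(price);
        return result;
    }
}

class Template extends SoftDeletable {
    final List<Product> products = new ArrayList<>();

    @Override
    protected List<Deletable> children() {
        return new ArrayList<Deletable>(products);
    }
}
```

Deleting a Template marks its Products and their Prices, while deleting a single Product leaves the Template untouched, which matches the requirement that a Template keeps access to deleted Products.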

If I understand what you are asking for: if you add a @OneToMany relationship between the template and the product and select your cascade rules, you will be able to delete all associated products for a given template. In your Product class, you can add the "deleted" flag as you suggested. This flag would be used by your service/DAO layer, e.g. a getProducts(boolean includeDeleted) style method could determine whether the "deleted" records are included in the result. In this fashion you can control what end users see, but still expose full functionality to internal business users.
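The includeDeleted idea could be sketched as follows, using an in-memory list in place of a real query. CatalogProduct and ProductService are hypothetical names; with a database the flag would become a condition in the query rather than a loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical product carrying the soft-delete flag.
class CatalogProduct {
    final String name;
    final boolean deleted;

    CatalogProduct(String name, boolean deleted) {
        this.name = name;
        this.deleted = deleted;
    }
}

class ProductService {
    private final List<CatalogProduct> all = new ArrayList<>();

    void add(CatalogProduct product) {
        all.add(product);
    }

    // End users would call this with false; internal business users with true.
    List<CatalogProduct> getProducts(boolean includeDeleted) {
        List<CatalogProduct> result = new ArrayList<>();
        for (CatalogProduct product : all) {
            if (includeDeleted || !product.deleted) {
                result.add(product);
            }
        }
        return result;
    }
}
```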

The deleted flag should be part of the Template class itself. That way all the objects you create have a way to be flagged as alive or deleted. The marking of an object as deleted should go higher up, into the base class.
