How to control JPA persistence in Wicket forms?

I'm building an application using JPA 2.0 (Hibernate implementation), Spring, and Wicket. Everything works, but I'm concerned that my form behaviour is based around side effects.
As a first step, I'm using the OpenEntityManagerInViewFilter. My domain objects are fetched by a LoadableDetachableModel which performs entityManager.find() in its load method. In my forms, I wrap a CompoundPropertyModel around this model to bind the data fields.
My concern is the form submit actions. Currently my form submits pass the result of form.getModelObject() into a service method annotated with @Transactional. Because the entity inside the model is still attached to the entity manager, the @Transactional annotation is sufficient to commit the changes.
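For context, a minimal sketch of the setup described above; OrderEditPage, OrderService, and the Order entity are illustrative names, not the original code:

```java
import org.apache.wicket.markup.html.WebPage;
import org.apache.wicket.markup.html.form.Form;
import org.apache.wicket.markup.html.form.TextField;
import org.apache.wicket.model.CompoundPropertyModel;
import org.apache.wicket.model.IModel;
import org.apache.wicket.model.LoadableDetachableModel;
import org.apache.wicket.spring.injection.annot.SpringBean;

public class OrderEditPage extends WebPage {

    @SpringBean
    private OrderService orderService; // illustrative service

    public OrderEditPage(final Long orderId) {
        // The LoadableDetachableModel re-fetches the entity on each request...
        IModel<Order> orderModel = new LoadableDetachableModel<Order>() {
            @Override
            protected Order load() {
                return orderService.find(orderId); // entityManager.find() inside
            }
        };

        // ...and the CompoundPropertyModel binds components by property name.
        Form<Order> form = new Form<Order>("form",
                new CompoundPropertyModel<Order>(orderModel)) {
            @Override
            protected void onSubmit() {
                // The entity is still attached (OpenEntityManagerInViewFilter),
                // so the @Transactional service method commits the dirty state.
                orderService.save(getModelObject());
            }
        };
        form.add(new TextField<String>("customerName"));
        add(form);
    }
}
```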
This is fine, until I have multiple forms that operate on the same entity, each of which changes a subset of the fields. And yes, they may be accessed simultaneously. I've thought of a few options, but I'd like to know any ideas I've missed and recommendations on managing this for long-term maintainability:
Fragment my entity into sub-components corresponding to the edit forms, and create a master entity linking these together in a @OneToOne relationship. Causes an ugly table design, and makes it hard to change forms later.
Detach the entity immediately after it's loaded by the LoadableDetachableModel, and manually merge the correct fields in the service layer. Hard to manage lazy loading; may need specialised versions of the model for each form to ensure the correct sub-entities are loaded.
Clone the entity into a local copy when creating the model for the form, then manually merge the correct fields in the service layer. Requires implementation of a lot of copy constructors / clone methods.
Use Hibernate's dynamicUpdate option to only update changed fields of the entity. Causes non-standard JPA behaviour throughout the application, is not visible in the affected code, and creates a strong tie to the Hibernate implementation (sketched just below).
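For reference, a minimal sketch of option 4; the Customer entity is illustrative, and the annotation depends on the Hibernate version:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.DynamicUpdate;

// Hibernate 4.1+; on older versions the equivalent is
// @org.hibernate.annotations.Entity(dynamicUpdate = true),
// or dynamic-update="true" in XML mappings.
@Entity
@DynamicUpdate
public class Customer {
    @Id
    private Long id;
    private String name;
    private String email;
    // With dynamic update enabled, Hibernate generates UPDATE statements
    // containing only the columns whose values actually changed, instead
    // of one fixed statement covering every column.
}
```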

EDIT
The obvious solution is to lock the entity (i.e. the row) when you load it for form binding. This would ensure that the lock-owning request reads, binds, and writes cleanly, with no concurrent writes taking place in the background. It's not ideal, so you'd need to weigh up the potential performance issues against your expected level of concurrent writes.
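A minimal sketch of that approach, assuming JPA 2.0 pessimistic locking and an illustrative Order entity:

```java
import javax.persistence.EntityManager;
import javax.persistence.LockModeType;

// Inside the request's transaction: take a row lock while loading for
// form binding, so no concurrent writer can touch the row until commit.
// Note: the lock is released when this transaction ends.
Order order = entityManager.find(Order.class, orderId,
        LockModeType.PESSIMISTIC_WRITE);
```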
Beyond that, assuming you're happy with "last write wins" on your property sub-groups, Hibernate's 'dynamicUpdate' would seem like the most sensible solution, unless you're thinking of switching ORMs any time soon. I find it strange that JPA seemingly doesn't offer anything that allows you to update only the dirty fields, and I find it likely that it will in the future.
Additional (my original answer)
Orthogonal to this is how to ensure you have a transaction open when your Model loads an entity for form binding. The concern is that the entity's properties are updated at that point, and outside of a transaction this leaves the JPA entity in an uncertain state.
The obvious answer, as Adrian says in his comment, is to use a traditional transaction-per-request filter. This guarantees that all operations within the request occur in a single transaction. It will, however, definitely use a DB connection on every request.
There's a more elegant solution, with code, here. The technique is to lazily instantiate the EntityManager and begin the transaction only when required (i.e. when the first EntityModel.getObject() call happens). If there is a transaction open at the end of the request cycle, it is committed. The benefit of this is that there are never any wasted DB connections.
The implementation given uses the Wicket RequestCycle object (note this is slightly different from v1.5 onwards), but the whole implementation is in fact fairly general, so you could use it (for example) outside Wicket via a servlet Filter.
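A rough sketch of that technique for Wicket 1.4; the class and field names here are illustrative, not the linked implementation:

```java
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import org.apache.wicket.Response;
import org.apache.wicket.protocol.http.WebApplication;
import org.apache.wicket.protocol.http.WebRequest;
import org.apache.wicket.protocol.http.WebRequestCycle;

// Register by overriding Application#newRequestCycle(). A real version
// must also roll back and close the EntityManager on errors.
public class JpaRequestCycle extends WebRequestCycle {

    private final EntityManagerFactory emf;
    private EntityManager em; // created lazily; stays null if the DB is never touched

    public JpaRequestCycle(WebApplication app, WebRequest request,
                           Response response, EntityManagerFactory emf) {
        super(app, request, response);
        this.emf = emf;
    }

    /** The first caller pays for the EntityManager and the transaction. */
    public EntityManager getEntityManager() {
        if (em == null) {
            em = emf.createEntityManager();
            em.getTransaction().begin();
        }
        return em;
    }

    @Override
    protected void onEndRequest() {
        if (em != null) { // commit only if a transaction was actually opened
            em.getTransaction().commit();
            em.close();
        }
    }
}
```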

After some experiments I've come up with an answer. Thanks to @artbristol, who pointed me in the right direction.
I have set a rule in my architecture: DAO save methods must only be called to save detached entities. If the entity is attached, the DAO throws an IllegalStateException. This helped track down any code that was modifying entities outside a transaction.
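A sketch of that rule as a guard in a generic base DAO (AbstractDao is an illustrative name):

```java
import javax.persistence.EntityManager;

public abstract class AbstractDao<T> {

    protected EntityManager entityManager;

    public T save(T entity) {
        if (entityManager.contains(entity)) {
            // fail fast: code was about to rely on transparent dirty checking
            throw new IllegalStateException(
                    "save() called on an attached entity; detach or copy it first");
        }
        return entityManager.merge(entity);
    }
}
```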
Next, I modified my LoadableDetachableModel to have two variants. The classic variant, for use in read-only data views, returns the entity from JPA, which will support lazy loading. The second variant, for use in form binding, uses Dozer to create a local copy.
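A possible shape for the form-binding variant, assuming Dozer 5.x; FormOrderModel and OrderDao are illustrative names:

```java
import org.apache.wicket.model.LoadableDetachableModel;
import org.dozer.DozerBeanMapper;

public class FormOrderModel extends LoadableDetachableModel<Order> {

    private static final DozerBeanMapper MAPPER = new DozerBeanMapper();

    private final OrderDao dao;
    private final Long id;

    public FormOrderModel(OrderDao dao, Long id) {
        this.dao = dao;
        this.id = id;
    }

    @Override
    protected Order load() {
        // Copy the managed entity into a detached local instance, so that
        // form binding never mutates an attached object.
        return MAPPER.map(dao.find(id), Order.class);
    }
}
```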
I have extended my base DAO to have two save variants. One saves the entire object using merge, and the other uses Apache BeanUtils to copy a list of properties.
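Continuing the base DAO sketch, a possible shape for the property-list variant; getId() and entityClass are illustrative members:

```java
import java.util.List;
import org.apache.commons.beanutils.PropertyUtils;

// Partial save: load the managed entity and copy only the named
// properties from the detached copy onto it; dirty checking does the rest.
public T save(T detached, List<String> properties) throws Exception {
    T managed = entityManager.find(entityClass, getId(detached));
    for (String name : properties) {
        PropertyUtils.setProperty(managed, name,
                PropertyUtils.getProperty(detached, name));
    }
    return managed; // the changed columns are flushed on commit
}
```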
This at least avoids repetitive code. The downsides are the requirement to configure Dozer so that it doesn't pull in the entire database by following lazy loaded references, and having yet more code that refers to properties by name, throwing away type safety.

Related

Always (deep) clone entities fetched from the entityManager before sending them to another tier?

Is there a real-world example of an application where you pass the same object between all three tiers (UI, business, and persistence)? For the sake of simplicity, say the objects are entity beans.
I mean: can I getReference() from my entityManager, hand the result to the user, and let the user edit or create it, without copying it at any point in my code?
Are there any concurrency problems or other known issues if you choose this option?
What are the drawbacks of this option?
I know it would be nice to wrap the object; that way we can attach and package entities, travel the object inside a single wrapper (a responseItem, say), and attach other attributes such as a dirty flag. But if I want to keep things simple, is the unwrapped approach possible? Or will EntityManagers mess up the whole thing? (I feel they will, and that they can't handle e.g. detached objects properly, so it is better to encapsulate the whole persistence and UI tiers and deep copy between them...)
Thanks for any answer.
I found an answer in the book Pro JPA 2: Mastering the Java™ Persistence API.
I think the main reason for deep copying entities is concurrent thread access, based on this passage:
Applications may not access an entity directly from multiple threads while it is managed by a persistence context. An application may choose, however, to allow entities to be accessed concurrently when they are detached. If it chooses to do so, the synchronization must be controlled through the methods coded on the entity. Concurrent entity state access is not recommended, however, because the entity model does not lend itself well to concurrent patterns. It would be preferable to simply copy the entity and pass the copied entity to other threads for access and then merge any changes back into a persistence context when they need to be persisted.
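A small sketch of the pattern that passage recommends; the Order copy constructor is illustrative:

```java
// Hand a copy to other threads, never the managed instance,
// then merge the changes back when they need to be persisted.
Order managed = em.find(Order.class, orderId);
Order copy = new Order(managed);   // detached deep copy for other threads

// ... other threads read/modify 'copy', synchronizing among themselves ...

em.getTransaction().begin();
Order merged = em.merge(copy);     // changes re-applied to a persistence context
em.getTransaction().commit();
```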

JPA merge in a RESTful web application with DTOs and Optimistic Locking?

My question is this: Is there ever a role for JPA merge in a stateless web application?
There is a lot of discussion on SO about the merge operation in JPA. There is also a great article on the subject which contrasts JPA merge with a more manual Do-It-Yourself process (where you find the entity via the entity manager and make your changes).
My application has a rich domain model (à la domain-driven design) that uses the @Version annotation in order to make use of optimistic locking. We have also created DTOs to send over the wire as part of our RESTful web services. The creation of this DTO layer also allows us to send to the client everything it needs and nothing it doesn't.
So far, I understand this is a fairly typical architecture. My question is about the service methods that need to UPDATE (i.e. HTTP PUT) existing objects. In this case we have two approaches: 1) JPA merge, and 2) DIY.
What I don't understand is how JPA merge can even be considered an option for handling updates. Here's my thinking and I am wondering if there is something I don't understand:
1) In order to properly create a detached JPA entity from a wire DTO, the version number must be set correctly...else an OptimisticLockException is thrown. But the JPA spec says:
An entity may access the state of its version field or property or export a method for use by the application to access the version, but must not modify the version value [30]. Only the persistence provider is permitted to set or update the value of the version attribute in the object.
2) Merge doesn't handle bi-directional relationships ... the back-pointing fields always end up as null.
3) If any fields or data are missing from the DTO (due to a partial update), then JPA merge will delete those relationships or null out those fields. Hibernate can handle partial updates, but JPA merge cannot. DIY can handle partial updates.
4) The first thing the merge method will do is query the database for the entity ID, so there is no performance benefit over DIY to be had.
5) In a DIY update, we load the entity and make the changes according to the DTO -- there is no call to merge, or to persist for that matter, because the JPA context implements the unit-of-work pattern out of the box.
Do I have this straight?
Edit:
6) Merge behavior with regards to lazy loaded relationships can differ amongst providers.
Using Merge does require you to either send and receive a complete representation of the entity, or maintain server side state. For trivial CRUD-y type operations, it is easy and convenient. I have used it plenty in stateless web apps where there is no meaningful security hazard to letting the client see the entire entity.
However, if you've already reduced operations to only passing the immediately relevant information, then you need to also manually write the corresponding services.
Just remember that when doing your 'DIY' update you still need to pass a Version number around on the DTO and manually compare it to the one that comes out of the database. Otherwise you don't get the Optimistic Locking that spans 'user think-time' that you would have if you were using the simpler approach with merge.
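A sketch of that version check in a DIY update; OrderDto and its fields are illustrative, and the @Transactional boundary is assumed:

```java
import javax.persistence.EntityManager;
import javax.persistence.OptimisticLockException;
import org.springframework.transaction.annotation.Transactional;

// Compare the version the client saw with the current one, so optimistic
// locking still spans 'user think-time' without merge.
@Transactional
public void update(OrderDto dto) {
    Order entity = em.find(Order.class, dto.getId());
    if (!dto.getVersion().equals(entity.getVersion())) {
        // the row changed since the client read it
        throw new OptimisticLockException(
                "Order " + dto.getId() + " was updated concurrently");
    }
    // copy only the fields the DTO actually carries
    entity.setDescription(dto.getDescription());
    // no merge()/persist() needed: the persistence context flushes on commit
}
```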
You can't change the version on an entity created by the provider, but when you have made your own instance of the entity class with the new keyword it is fine and expected to set the version on it.
It will make the persistent representation match the in-memory representation you provide; this can include making things null. Remember that when an object is merged, that object is supposed to be discarded and replaced with the one returned by merge. You are not supposed to merge an object and then continue using it; its state is not defined by the spec.
True.
Most likely, as long as your DIY solution is also using the entity ID and not an arbitrary query. (There are other benefits to using the 'find' method over a query.)
True.
I would add:
7) Merge translates to an insert or an update depending on the existence of the record in the DB, hence it does not deal correctly with update-vs-delete optimistic concurrency. That is, if another user concurrently deletes the record and you update it, it must (1) throw a concurrency exception... but it does not; it just inserts the record as a new one.
(1) At least, in most cases, in my opinion, it should. I can imagine some cases where I would want this use case to trigger a new insert, but they are far from usual. At least, I would like the developer to think twice about it, not just accept that "merge() == updateWithConcurrencyControl()", because it is not.

Flex Blaze DS and JPA - lazy-loading issues

I am developing an application in Flex, using Blaze DS to communicate with a Java back-end, which provides persistence via JPA (EclipseLink).
I am encountering issues when passing JPA entities to Flex via Blaze DS. Blaze DS uses reflection to convert the JPA entity into an ObjectProxy (effectively a HashMap) by calling all getter methods on the entity. This includes any lazy-initialised one/many-to-many relationships.
You can probably see where I am going. If I pass back a single entity, this will call all of its one/many-to-many getters. For each returned object, if it has one/many-to-many relationships, those will be called too. As such, by passing back a single JPA entity I actually end up making multiple database calls and passing all related entities back inside a single ObjectProxy instance!
My solution to date is to create a translator to convert each entity to an ObjectProxy and vice-versa. This is clearly cumbersome and there must be a better way.
Thoughts please?
As an alternative, you could consider using GraniteDS instead of BlazeDS: GraniteDS has a much more powerful data management stack than BlazeDS (it competes more with LCDS) and fully supports lazy loading for all major JPA engines: Hibernate, EclipseLink, OpenJPA, etc.
Moreover, GraniteDS has a great client-side transparent lazy loading feature and even a so-called reverse lazy-loading mechanism.
And you don't need any kind of intermediate DTOs: it serializes JPA entities as is and uses code-generated ActionScript beans on the client-side to keep their initialization states.
Unfortunately, lazy-loading is not easy to accomplish with Flash clients. There are some working solutions, like dpHibernate, but so far all the different solutions I have tested fall short of what you would expect in terms of performance and ease of use.
So in my experience, the best and most reliable solution is to always use DTOs, which adds the benefit of cleanly separating the database and view layers. This necessitates, though, that you implement either eager loading or a second server round trip to resolve your many-to-many relations, as well as a good deal more boilerplate code to copy the entity and DTO field values.
Which one to choose depends on your use case: Sometimes getting only the main object's fields might be enough, then you could simply omit the List of related objects from your DTO (transfer only those values you need for your query). Sometimes you may actually need the entire list of related entities, and then you could get it via eager loading, or by setting up a second remote object to find only the list.
EclipseLink also provides a copyObject() API that allows you to give a copy group specifying exactly which attributes you want. You could then use this copy to avoid carrying the relationships that you do not want.
If you have a detached object, you could just null out the fields that you do not want as well, or use a DTO.

JPA: Extending the persistence context vs. detaching entities

There appear to be two patterns to implement business transactions that span several http requests with JPA:
entity-manager-per-request with detached entities
extended persistence context
What are the respective advantages of these patterns? When should which be preferred?
So far, I came up with:
an extended persistence context guarantees that object identity is equivalent to database identity, simplifying the programming model and potentially dispelling the need to implement equals for entities
detached entities require less memory than an extended persistence context, as the persistence context also has to store the previous state of the entity for change detection
detached entities that are no longer referenced become eligible for garbage collection; persistent objects must first be detached explicitly
However, not having any practical experience with JPA I am sure I have missed something of importance, hence this question.
In case it matters: We intend to use JPA 2.0 backed by Hibernate 3.6.
Edit: Our view technology is JSF 2.0, in an EJB 3.1 container, with CDI and possibly Seam 3.
Well, I can enumerate challenges with trying to use extended persistence contexts in a web environment. Some things also depend on what your view technology is and whether it binds entities directly or view-level middlemen.
1. EntityManagers are not threadsafe. You don't need one per user session; you need one per user session per browser tab.
2. When an exception comes out of an EntityManager, it is considered invalid and needs to be closed and replaced. If you're planning to write your own framework extensions for managing the extended lifecycle, the implementation of this needs to be bulletproof. Generally, in an EM-per-request setup, the exception goes to some kind of error page, and loading the next page creates a new one anyway, like it always would have.
3. Object equality is not going to be 100% automagically safe. As above, an exception may have invalidated the context an object loaded earlier was associated with, so one fetched now will not be equal to it. Relying on identity also assumes an extremely high level of skill among the developers using it, and understanding of how JPA works and what the EM does. For example, accidentally using merge when it wasn't needed will return a new object which will not satisfy == with its field-identical predecessor. (Treating merge like a SQL 'update' is an extremely common JPA newbie error, particularly because it's just a no-op most of the time, so it slides past.)
4. If you're using a view technology that binds POJOs (e.g., Spring MVC) and you're planning to bind web form data directly onto your entities, you'll get into trouble quickly. Changes to an attached entity become persistent on the next flush/commit, regardless of whether they were made in a transaction or not. A common error: a web form comes in and binds some invalid data onto an entity; validation fails and tries to return a screen to inform the user; building the error screen involves running a query; the query triggers a flush of the persistence context; the changes bound to the attached entity get flushed to the database, hopefully causing a SQL exception, but maybe just persisting corrupt data.
(Problem 4 can of course also happen with session per request if the programming is sloppy, but you're not forced to actively work hard at avoiding it.)
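A small sketch of that point-4 failure mode; isValid() and badEmailFromForm are illustrative:

```java
// Binding onto an attached entity makes every change a pending UPDATE,
// whether or not you ever meant to save it.
Customer attached = em.find(Customer.class, id);  // managed instance
attached.setEmail(badEmailFromForm);              // form binds invalid data

if (!isValid(attached)) {
    // building the error page runs an unrelated query...
    List<Country> countries = em
        .createQuery("select c from Country c", Country.class)
        .getResultList();
    // ...which may auto-flush the persistence context, writing the
    // invalid email to the database even though nothing was "saved"
}
```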

Open Session In View Pattern

I'm asking this question given my chosen development frameworks of JPA (Hibernate implementation of), Spring, and <insert MVC framework here - Struts 1, Struts 2, Spring MVC, Stripes...>.
I've been thinking a bit about relationships in my entity layer - for example I have an order entity that has many order lines. I've set up my app so that it eagerly loads the order lines for every order. Do you think this is a lazy way to get around the lazy initialization problems that I would come across if I were to set the fetch strategy to lazy?
The way I see it, I have the following alternatives when retrieving entities and their associations:
Use the Open Session In View pattern to create the session on each request and commit the transaction before returning the response (see the filter sketch after this list).
Implement a DTO (Data Transfer Object) layer such that every DAO query I execute returns the correctly initialized DTO for my purposes. I don't really like this option much because in my experience I've found that it creates a lot of boilerplate copying code and becomes messy to maintain.
Don't map any associations in JPA so that every query I execute returns only the entities I'm interested in - this will probably require me to have DTOs anyway and will be a pain to maintain and I think defeats the purpose of having an ORM in the first place.
Eagerly fetch all (or most) associations - in the example above, always fetch all order lines when I retrieve an order.
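For option 1, Spring ships a ready-made filter. A minimal sketch of registering it, assuming Spring 3.1+ and Servlet 3.0 (a classic web.xml &lt;filter&gt; declaration of the same class is equivalent):

```java
import javax.servlet.ServletContext;
import org.springframework.orm.jpa.support.OpenEntityManagerInViewFilter;
import org.springframework.web.WebApplicationInitializer;

// Keeps the persistence context open until the view has finished
// rendering, so lazy associations can still be initialized there.
public class OsivInitializer implements WebApplicationInitializer {
    @Override
    public void onStartup(ServletContext ctx) {
        ctx.addFilter("openEntityManagerInView",
                      new OpenEntityManagerInViewFilter())
           .addMappingForUrlPatterns(null, false, "/*");
    }
}
```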
So my question is, when and under what circumstances would you use which of these options? Do you always stick with one way of doing it?
I would ask a colleague but I think that if I even mentioned the term 'Open Session in View' I would be greeted with blank stares :( What I'm really looking for here is some advice from a senior or very experienced developer.
Thanks guys!
Open Session in View has some problems.
For example, if the transaction fails, you might learn of it too late, at commit time, once you are nearly done rendering your page (possibly with the response already committed, so you can't change the page!). If you had known about that error earlier, you would have followed a different flow and ended up rendering a different page...
Another example: reading data on demand can turn into many "N+1 select" problems, which kill your performance.
Many projects use the following path:
Maintain transactions at the business layer; load everything you are going to need at that point.
The presentation layer runs the risk of LazyInitializationExceptions: each one is considered a programming error, caught during tests, and corrected by loading more data in the business layer (where you have the opportunity to do it efficiently, avoiding "N+1 select" problems).
To avoid creating extra classes for DTOs, you can load the data inside the entity objects themselves. This is the whole point of the POJO approach (used by modern data-access layers, and even integration technologies like Spring).
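A sketch of loading everything the view needs in the business layer, assuming an Order entity with an orderLines association:

```java
// A fetch join pulls the order lines in the same select, avoiding both
// LazyInitializationException in the view and "N+1 select" problems.
// 'distinct' keeps the duplicated join rows from breaking getSingleResult().
Order order = em.createQuery(
        "select distinct o from Order o "
        + "join fetch o.orderLines where o.id = :id", Order.class)
    .setParameter("id", orderId)
    .getSingleResult();
```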
I've successfully solved all my lazy initialization problems with the Open Session In View pattern (i.e. the Spring implementation). The technologies I used were exactly the same as yours.
Using this pattern allows me to fully map the entity relationships and not worry about fetching child entities in the dao. Mostly. In 90% of the cases the pattern solves the lazy initialization needs in the view. In some cases you'll have to "manually" initialize relationships. These cases were rare and always involved very very complex mappings in my case.
When using the Open Entity Manager In View pattern, it's important to define the entity relationships, and especially the propagation and transactional settings, correctly. If these are not configured properly, you will see closed-session errors when an entity is lazily initialized in the view after the session has already been closed.
I definitely would go with option 1. Option 2 might be needed sometimes, but I see absolutely no reason to use option 3. Option 4 is also a no-no: eagerly fetching everything kills the performance of any view that needs to list just a few properties of some parent entities (orders in this case).
N+1 Selects
During development there will be N+1 selects as a result of initializing some relationships in the view. But this is not a reason to discard the pattern. Just fix these problems as they arise, before delivering the code to production. It's as easy to fix them with the OEMIV pattern as with any other: add the proper DAO or service methods, fix the controller to call a different finder method, maybe add a view to the database, etc.
I have successfully used the Open-Session-in-View pattern on a project. However, I recently read in "Spring In Practice" of an interesting potential problem with non-repeatable reads if you manage your transactions at a lower layer while keeping the Hibernate session open in the view layer.
We managed most of our transactions in the service layer, but kept the Hibernate session open in the view layer. This meant that lazy reads in the view resulted in separate read transactions.
We managed our transactions in our service layer to minimize transaction duration. For instance, some of our service calls resulted in both a database transaction and a web service call to an external service. We did not want our transaction to be open while waiting for a web service call to respond.
As our system never went into production, I am not sure whether there were any real problems with it, but I suspect there was the potential for the view to attempt to lazily load an object that had been deleted by someone else.
There are some benefits to the DTO approach, though. You have to think beforehand about what information you need. In some cases this will prevent you from generating N+1 select statements. It also helps you see where to use eager fetching and/or optimized views.
I'll also throw my weight behind the Open-Session-in-View pattern, having been in the exact same boat before.
I work with Stripes without Spring, and have created a manual filter before that tends to work well. Coding transaction logic on the back end turns messy really quickly, as you've mentioned. Eagerly fetching everything becomes TERRIBLE as you map more and more objects to each other.
One thing I want to add that you may not have come across is Stripersist and Stripernate - Stripersist being the more JPA flavor - auto-hydration filters that take a lot of the work off your shoulders.
With Stripersist you can say things like /appContextRoot/actions/view/3 and it will auto-hydrate the JPA Entity on the ActionBean with id of 3 before the event is executed.
Stripersist is in the stripes-stuff package on sourceforge. I now use this for all new projects, as it's clean and easily supports multiple datasources if necessary.
Do the Order and Order Lines compose a high volume of data? Do they take part in online processes where real-time response is required? If so, you might consider not using eager fetching - it makes a huge difference in performance. If the amount of data is small, there is no problem in eager fetching.
About using DTOs, it might be a viable implementation.
If your business layer is used internally by your own application (i.e. a small web app and its business logic), it'd probably be best to use your own entities in your view with the open session in view pattern, since it's simpler.
If your entities are used by many applications (i.e. a backend application providing a service in your corporation), it'd be interesting to use DTOs, since you would not expose your model to your clients. Exposing it could mean you would have a harder time refactoring your model, since it could mean breaking contracts with your clients. A DTO would make that easier, since you have another layer of abstraction. This can be a bit strange, since EJB3 theoretically eliminates the need for DTOs.
