How to persist every new entity?

I expect every instantiated entity to correspond to a tuple (& co) in the database. In the examples I see around, one always instantiates the entity (via a constructor) and then calls persist with that entity. I find this error-prone, and was wondering if it wasn't possible to have every instantiated entity automatically managed/persisted/reflected to the database (at least intended to).
This also seems to prevent me from persisting entities held in instance variables, i.e. I have an entity that instantiates other entities (ones it has an association with) in its constructor.

That's just common practice. The model shouldn't be aware of any DAO/persistence logic. If it is, it becomes tightly coupled to that logic and is not reusable with other persistence frameworks. If you are sure you will stick with JPA for ages, you may consider doing so, but it is generally not good practice. The model may then not be reusable in other layers either. For example, you may want to create a mock/dummy model object for the view layer so that a new user can fill in the registration details, and persist it only once submission and validation have completed successfully.
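A minimal sketch of the registration example, assuming a hypothetical `Registration` model and `RegistrationStore` interface (neither name comes from the question; the store stands in for whatever would wrap `EntityManager.persist`). The model stays plain Java, and only the service decides when an instance is stored:

```java
import java.util.ArrayList;
import java.util.List;

// Plain model object with no persistence calls, so the view layer can
// create and discard instances freely. (Hypothetical name.)
class Registration {
    String email = "";
    String password = "";
}

// Hypothetical persistence port; a JPA-backed version would call persist().
interface RegistrationStore {
    void save(Registration registration);
}

// In-memory stand-in used only for this sketch.
class InMemoryRegistrationStore implements RegistrationStore {
    final List<Registration> saved = new ArrayList<>();

    @Override
    public void save(Registration registration) {
        saved.add(registration);
    }
}

class RegistrationService {
    private final RegistrationStore store;

    RegistrationService(RegistrationStore store) {
        this.store = store;
    }

    // Stores the object only after validation succeeds; returns whether it was stored.
    boolean submit(Registration r) {
        boolean valid = r.email.contains("@") && r.password.length() >= 8;
        if (valid) {
            store.save(r);
        }
        return valid;
    }
}
```

A draft object that never passes validation is simply garbage collected, which would not be possible if the constructor itself persisted the entity.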

Related

JPA Spring Data entity to be used outside of transaction

I have a Spring Boot application with a service that returns a Spring Data entity that is exposed to a controller. The problem is that I know it's not a good idea to use entities outside of DB transactions, so what would be the best practices?
Consider the following service:
@Transactional
public MyData getMyData(Long id) {
    return myDataRepository.findById(id);
}
where MyData is a database @Entity and myDataRepository is a JpaRepository.
This service method is called from a controller class, that sends this object in JSON format to a client that calls this method.
@RequestMapping("/")
public ResponseEntity<?> getMyData(@RequestParam Long id) {
    return ResponseEntity.ok(myService.getMyData(id));
}
If I expose MyData to a controller, it will be used outside of a transaction and might cause all kinds of Hibernate errors. What are the best practices for these scenarios? Should I convert the entity to a POJO inside the service and return MyDataPOJO instead of MyData from MyService?

Using entities outside of transactions does not necessarily lead to problems; it may actually have valid use cases. However, there are quite a few variables at play, and once you let them out of your sight things can and will go south. Consider the following scenarios:
Your entity doesn't have any relationships to other entities or those relationships are pretty shallow and eagerly fetched. You retrieve that entity from repository, detach it from persistence unit (implicitly or explicitly) and pass to controller. Controller does not attempt to modify the entity; it only serializes it into JSON - totally safe.
Same as above but controller modifies the entity before serializing it into JSON - again, totally safe (just don't expect those changes to be reflected in DB)
Same as above, but you've forgotten to detach the entity from PU - ouch, if controller changes the entity you may either see it reflected in DB or get transaction closed exception; both most likely being unintended consequences.
Same as above, but some of entity's relationships are lazy. Again, you may or may not get any exceptions depending on whether these lazy properties are being accessed or not.
And there are so many more combinations of intentional and unintentional design choices...
As you can see, things can get out of control very quickly, especially when your model has to evolve: before long you're going to find yourself fiddling with JSON views, @JsonIgnore, entity projections and so on. Thus the rule of thumb: although it may seem tempting to cut some corners and expose your entities to external layers, it's rarely a good idea. A properly designed solution has a clear separation of concerns between layers:
Persistence layer never exposes more methods or entities than required by business logic. Moreover, the same table(s) can and should be mapped into several different entities depending on the use cases they participate in.
Business logic layer (btw this is your API, not the REST services! see below) never leaks any details from persistence layer. Its methods clearly define use cases from the problem domain.
Presentation layer only translates API provided by business logic into one or another form suitable for client and never implements additional use cases. Keep in mind that REST controllers, SOAP services etc logically are all part of presentation layer, not business logic.
So yeah, the short answer is: persistence entities should not be exposed to external layers. One common technique is to use DTOs instead; besides, DTOs provide an additional abstraction layer in case you need to change your entities but leave the API intact, or vice versa. If at some point your DTOs happen to closely resemble your entities, there are Java bean mapping frameworks like Dozer, Orika, MapStruct, JMapper, ModelMapper etc. that help to eliminate the boilerplate code.
Try googling "hexagonal architecture". This is a very interesting concept for designing cleanly separated layers. Here's one of the articles on this subject https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/; it uses C# examples but they're pretty simple.
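As a rough illustration of the DTO technique, here is a framework-free sketch. `MyData` is the entity name from the question; `MyDataDto`, its fields, `internalNotes`, and the `Map` standing in for `myDataRepository` are assumptions made for the example. In a Spring application the service method would be `@Transactional`, so the copy happens while the persistence context is still open:

```java
import java.util.Map;
import java.util.Optional;

// Stand-in for the JPA entity; a real one would carry @Entity and might
// hold lazy associations.
class MyData {
    final Long id;
    final String name;
    final String internalNotes; // hypothetical field the API should not expose

    MyData(Long id, String name, String internalNotes) {
        this.id = id;
        this.name = name;
        this.internalNotes = internalNotes;
    }
}

// DTO: an immutable snapshot containing only what the client needs.
final class MyDataDto {
    final Long id;
    final String name;

    MyDataDto(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class MyService {
    private final Map<Long, MyData> repository; // stand-in for myDataRepository

    MyService(Map<Long, MyData> repository) {
        this.repository = repository;
    }

    // Copies the entity into a DTO before it leaves the service, so the
    // controller never touches a managed or detached entity.
    Optional<MyDataDto> getMyData(Long id) {
        return Optional.ofNullable(repository.get(id))
                .map(entity -> new MyDataDto(entity.id, entity.name));
    }
}
```

The controller then serializes `MyDataDto` only; changing the entity's internal fields later does not change the JSON contract.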

You should never leak the internal model to outside resources (in your case, the @RestController). The "POJO" you mentioned is typically called a DTO (Data Transfer Object). The DTO can be defined as an interface on the service side and implemented on the controller side. The service would then, as you described, transform the internal model into an instance of the DTO, achieving looser coupling between the controller and the service.
By defining the DTO interface on the service side, you have the additional benefit that you can optimize your persistence access by fetching only the data specified in the corresponding DTO interface. There is, for example, no need to fetch the friends of a User if the @Controller does not specifically request them, so you do not need to perform the additional JOIN in the database (provided you use a database).
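One possible reading of "interface on the service side, implementation on the controller side" is sketched below. All names (`UserSummary`, `UserService`, `UserJson`, `findUser`) are hypothetical, and the `Map` stands in for the database; the answer does not prescribe this exact shape:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Service side: declares which data a caller can receive. Friends are
// deliberately absent, so the service never has to load them for this view.
interface UserSummary {
    void setId(long id);
    void setName(String name);
}

class UserService {
    private final Map<Long, String> namesById = new HashMap<>(); // stand-in for the database

    void register(long id, String name) {
        namesById.put(id, name);
    }

    // The caller supplies the concrete DTO; the service only sees the interface.
    <T extends UserSummary> T findUser(long id, Supplier<T> dtoFactory) {
        String name = namesById.get(id);
        if (name == null) {
            throw new IllegalArgumentException("No user with id " + id);
        }
        T dto = dtoFactory.get();
        dto.setId(id);
        dto.setName(name);
        return dto;
    }
}

// Controller side: the implementation that would be serialized to the client.
class UserJson implements UserSummary {
    long id;
    String name;

    @Override
    public void setId(long id) {
        this.id = id;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }
}
```

A controller would call `userService.findUser(id, UserJson::new)`; because `UserSummary` has no friends property, the service has no reason to join the friends table for this request.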

Transaction Boundary and DTO conversion with JPA

I have been wondering how this anomaly should be handled:
DTO's should be converted in the controller, the service layer does not need to know about them.
Transaction boundaries are defined by the service layer.
But how do you avoid a JPA LazyInitializationException then? The DTO conversion might need lazily fetched data, but cannot load it because the transaction was opened and closed by the service layer.
There are ways I can think of, but all of them are ugly. Putting the DTO conversion in the service layer seems the best to me now.

Yes, definitely it is better to manipulate DTOs in the service layer. This is especially true when updating entities with changes contained in DTOs, as otherwise you would need to get and update detached entities, pass them to service, merge them again into the persistence context, etc.
"DTO's should be converted in the controller, the service layer does not need to know about them."
Instead of this, I would say the better rule of thumb is that controllers do not need to know about entities. But you can use detached entities instead of DTOs for simple cases to avoid creating lots of small DTO classes, although I personally always use DTOs just to be consistent and to make later changes easier.

A raised 'LazyInitializationException' is just a signal that some parts of the data were not loaded, so the best solution is to make multiple calls from the controller method to the service level and fetch all fields required for the DTO.
Less elegant options are:
It's possible to detect fields that were not loaded via the 'org.hibernate.Hibernate.isInitialized' method and skip them while building the DTO; for a full sample see:
How to test whether lazy loaded JPA collection is initialized?
You can mark the controller method as transactional; the Hibernate session then stays open after the call to the service level, so lazy loading will work.

DTOs are the model you should be working against from a layer above services. Only the service should know about the entity model. In simple degenerate cases the DTO model might look almost like the entity model which is why many people will just use the entity model. This works well until people get real requirements that will force them to change the way they use data. This is when the illusion that DTO = Entity falls apart.
A DTO is often a subset or a transformation of the entity model. The point about the LazyInitializationException is a perfect example of when the illusion starts to crumble.
A service should return fully initialized DTOs, i.e. not just some object that delegates to entity objects. There shouldn't be any lazy loading involved after a DTO has been returned from a service. This means that you have to fetch exactly the state required for a DTO and wire that data into the objects to be returned. Since that usually requires quite some boilerplate code and will sometimes result in having to duplicate logic, people tend to stick even longer to the DTO = Entity illusion by sprinkling some fetch joins here and there to make the LazyInitializationExceptions go away.
This is why I started the Blaze-Persistence Entity Views project which will give you the best of both worlds. Ease of use with less boilerplate, good performance and a safe model to avoid accidental errors. Maybe you want to give it a shot to see what it can do for you?

Is it considered bad practice to create subclasses in Java to change annotations?

For example, when writing JPA or Hibernate code, I might want to create a descendant of a domain class, say Account. The descendant represents a form I show the user. The form only has about half the fields that are on Account, so the object I use to hold the form values should not change the other fields.
Is using inheritance to change annotations considered bad? Assuming it is not, are there any good short hands or design patterns for doing it better or more effectively?

I'd say that the case you describe here (at least how I understood it) would not be a good candidate for creating subclasses.
Basically you want to restrict the form to change only some fields/associations of an entity, right? I further assume that you don't trust the developer of the form that only the fields that should be editable are changed, hence the requirement to restrict that.
In that case, one option might be to use the DTO pattern (data transfer object): create a DTO for the form data and let the user fill its fields. Then pass the DTO to a service which updates the entity accordingly. This way you have control of which fields are editable and how the update is performed.
Another way might be to create a wrapper for the entity that throws exceptions when a setter for an uneditable field is invoked. However, that would be a runtime solution and I'd prefer the DTO approach here.
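The DTO option could look roughly like this. `Account` is the class from the question; its fields, `AccountForm`, and `AccountService.applyForm` are assumptions for illustration. In a JPA application `applyForm` would run inside a transaction on a managed entity, so the changes are flushed on commit:

```java
// Stand-in for the Account entity; a real one would carry JPA annotations.
class Account {
    String displayName;
    String email;
    long balanceCents; // hypothetical field that the form must never change

    Account(String displayName, String email, long balanceCents) {
        this.displayName = displayName;
        this.email = email;
        this.balanceCents = balanceCents;
    }
}

// Form DTO: contains only the fields the form is allowed to edit.
class AccountForm {
    String displayName;
    String email;
}

class AccountService {
    // The service is the single place that decides which fields are updated.
    void applyForm(Account account, AccountForm form) {
        account.displayName = form.displayName;
        account.email = form.email;
        // balanceCents is intentionally left untouched.
    }
}
```

Because the form object has no balance field at all, the form developer cannot change it even by accident, and no subclass needs to be mapped in the database.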
Edit:
some reasons why inheritance might prove problematic in this case:
The subclasses don't represent entities, however, you'd still have to represent the subclasses on the database (either through discriminators or additional tables)
An entity that is created by that form would always have the subclass and thus would not be editable in other places (unless you mess with the database etc.)
Entities are data containers and should not depend on the presentation. Hence having a special entity (subclass) for one UI usecase (the form in your case) would violate the single responsibility principle and abstraction between model/data and view/presentation layer.

controllers, entity classes or dao - what goes where?

With the introduction of Hibernate in my project, my code started getting really coupled, and boilerplate in many places (and it should be the other way round, right?)
I got pretty confused by a particular example. I've always considered DAO objects to be pretty generic in nature (mostly encapsulating the basic CRUD operations as well as the backend storage implementation).
Unfortunately, as my entity classes started to get more complicated, I started offloading more and more logic to the DAO objects. I have a particular example:
my entity class User should have a relation called friends, which is essentially a collection of users. However, I have to map my class to a collection of UserFriendship objects instead, each of which contains a ref to the friend object, but also other specific friendship data (the date when the friendship occurred)
Now, it is easy to introduce a custom getter in the entity class, which will take the collection of UserFriendship objects and turn it into a collection of User objects instead. However, what if I need only a subset of my friends collection, say, like in paging. I cannot really do that in the entity object, because it doesn't have access to the session, right? This also applies to when I need to make a parametrized query on the relationship. The one that has the access to the session is the UserDAO. So I ended up with this
UserDAO
=> normal CRUD methods
=> getFriends(Integer offset, Integer limit);
=> a bunch of similar getters and setters responsible for managing the relationships within the User instance.
This is insane. But I cannot really do anything else. I am not aware if it is possible to declare computed properties within the entity classes, which could also be parametrized.
I could technically also wrap the DAO within the entity and put the helper getters and setters back into the entity class, where they should be, but I am not sure whether that is good practice either.
I know that the DAO should only be accessed by the controller object, and it should provide a more or less complete entity object or a set of entity objects.
I am deeply confused. More or less all of my DAO objects now couple logic that should be either in the Entity objects or in the controllers.
I am sorry if my question is a bit confusing. It is a bit hard to formulate it.

My general rules are:
in the entity classes, respect the law of Demeter: don't talk to strangers
the entity classes must not use the session
the controller/service classes must not use the session. They may navigate in the graph of entities and call DAO methods
DAO methods should be the ones using the session. Their work consists in getting, saving, merging entities and executing queries. If several queries or persistence-related actions should be executed for a single use-case, the controller/service should coordinate them, not the DAO.
This way, I can test the business logic relatively easily by mocking the DAOs, and I can test the DAOs relatively easily because they don't contain much logic. Most of the tests verify that the queries find what they're supposed to find, return them in the appropriate order, and initialize the associations that must be initialized (to avoid lazy loading exceptions in the presentation layer, where I'm using detached objects)
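A small sketch of these rules applied to the friends-paging case from the question. `UserDao`, `FriendService`, `FakeUserDao`, and the page-size limit are hypothetical names and values; friends are reduced to names to keep the example short, and a hand-written fake replaces a mocking library:

```java
import java.util.ArrayList;
import java.util.List;

// DAO contract: a real implementation is the only place that would touch
// the Hibernate session and run the paged query.
interface UserDao {
    List<String> findFriendNames(long userId, int offset, int limit);
}

// Service: holds the business rules and coordinates DAO calls.
class FriendService {
    static final int MAX_PAGE_SIZE = 50; // hypothetical limit

    private final UserDao userDao;

    FriendService(UserDao userDao) {
        this.userDao = userDao;
    }

    List<String> friendsPage(long userId, int page, int pageSize) {
        if (page < 0 || pageSize < 1) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize >= 1");
        }
        int size = Math.min(pageSize, MAX_PAGE_SIZE);
        return userDao.findFriendNames(userId, page * size, size);
    }
}

// Fake DAO for tests: pages over an in-memory list instead of a database.
class FakeUserDao implements UserDao {
    private final List<String> friends;

    FakeUserDao(List<String> friends) {
        this.friends = friends;
    }

    @Override
    public List<String> findFriendNames(long userId, int offset, int limit) {
        int from = Math.min(offset, friends.size());
        int to = Math.min(from + limit, friends.size());
        return new ArrayList<>(friends.subList(from, to));
    }
}
```

The service logic (validation, page-size cap, offset arithmetic) can be tested against `FakeUserDao` without a database, while the real DAO only needs tests that check its query.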

How to control JPA persistence in Wicket forms?

I'm building an application using JPA 2.0 (Hibernate implementation), Spring, and Wicket. Everything works, but I'm concerned that my form behaviour is based around side effects.
As a first step, I'm using the OpenEntityManagerInViewFilter. My domain objects are fetched by a LoadableDetachableModel which performs entityManager.find() in its load method. In my forms, I wrap a CompoundPropertyModel around this model to bind the data fields.
My concern is the form submit actions. Currently my form submits pass the result of form.getModelObject() into a service method annotated with @Transactional. Because the entity inside the model is still attached to the entity manager, the @Transactional annotation is sufficient to commit the changes.
This is fine, until I have multiple forms that operate on the same entity, each of which changes a subset of the fields. And yes, they may be accessed simultaneously. I've thought of a few options, but I'd like to know any ideas I've missed and recommendations on managing this for long-term maintainability:
Fragment my entity into sub-components corresponding to the edit forms, and create a master entity linking these together into a @OneToOne relationship. Causes an ugly table design, and makes it hard to change forms later.
Detach the entity immediately it's loaded by the LoadableDetachableModel, and manually merge the correct fields in the service layer. Hard to manage lazy loading, may need specialised versions of the model for each form to ensure correct sub-entities are loaded.
Clone the entity into a local copy when creating the model for the form, then manually merge the correct fields in the service layer. Requires implementation of a lot of copy constructors / clone methods.
Use Hibernate's dynamicUpdate option to only update changed fields of the entity. Causes non-standard JPA behaviour throughout the application. Not visible in the affected code, and causes a strong tie to Hibernate implementation.

EDIT
The obvious solution is to lock the entity (i.e. row) when you load it for form binding. This would ensure that the lock-owning request reads/binds/writes cleanly, with no concurrent writes taking place in the background. It's not ideal, so you'd need to weigh up the potential performance issues (level of concurrent writes).
Beyond that, assuming you're happy with "last write wins" on your property sub-groups, Hibernate's 'dynamicUpdate' would seem like the most sensible solution, unless you're thinking of switching ORMs anytime soon. I find it strange that JPA seemingly doesn't offer anything that allows you to update only the dirty fields, and find it likely that it will in the future.
Additional (my original answer)
Orthogonal to this is how to ensure you have a transaction open when your Model loads an entity for form binding. The concern is that the entity's properties are updated at that point, and outside of a transaction this leaves a JPA entity in an uncertain state.
The obvious answer, as Adrian says in his comment, is to use a traditional transaction-per-request filter. This guarantees that all operations within the request occur in single transaction. It will, however, definitely use a DB connection on every request.
There's a more elegant solution, with code, here. The technique is to lazily instantiate the entitymanager and begin the transaction only when required (i.e. when the first EntityModel.getObject() call happens). If there is a transaction open at the end of the request cycle, it is committed. The benefit of this is that there are never any wasted DB connections.
The implementation given uses the Wicket RequestCycle object (note this is slightly different in v1.5 onwards), but the whole implementation is in fact fairly general, so you could use it (for example) outside Wicket via a servlet Filter.

After some experiments I've come up with an answer. Thanks to @artbristol, who pointed me in the right direction.
I have set a rule in my architecture: DAO save methods must only be called to save detached entities. If the entity is attached, the DAO throws an IllegalStateException. This helped track down any code that was modifying entities outside a transaction.
Next, I modified my LoadableDetachableModel to have two variants. The classic variant, for use in read-only data views, returns the entity from JPA, which will support lazy loading. The second variant, for use in form binding, uses Dozer to create a local copy.
I have extended my base DAO to have two save variants. One saves the entire object using merge, and the other uses Apache Beanutils to copy a list of properties.
This at least avoids repetitive code. The downsides are the requirement to configure Dozer so that it doesn't pull in the entire database by following lazy loaded references, and having yet more code that refers to properties by name, throwing away type safety.
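The type-safety downside could be reduced by describing each copyable property as a getter/setter pair instead of a name. This is a swapped-in alternative to the Beanutils approach described above, not what the answer actually implemented; `Property`, `PartialMerge`, and `Person` are hypothetical names:

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;

// One copyable property, expressed as a getter/setter pair so the compiler
// checks names and types (unlike a list of property-name strings).
final class Property<E, V> {
    private final Function<E, V> getter;
    private final BiConsumer<E, V> setter;

    Property(Function<E, V> getter, BiConsumer<E, V> setter) {
        this.getter = getter;
        this.setter = setter;
    }

    void copy(E from, E to) {
        setter.accept(to, getter.apply(from));
    }
}

final class PartialMerge {
    private PartialMerge() {
    }

    // Copies only the listed properties from the form's local copy to the target.
    static <E> void copy(E from, E to, List<Property<E, ?>> properties) {
        for (Property<E, ?> property : properties) {
            property.copy(from, to);
        }
    }
}

// Hypothetical bean standing in for an entity bound to two different forms.
class Person {
    private String name;
    private String email;

    String getName() { return name; }
    void setName(String name) { this.name = name; }
    String getEmail() { return email; }
    void setEmail(String email) { this.email = email; }
}
```

A form that edits only the name would pass a list containing `new Property<Person, String>(Person::getName, Person::setName)`; renaming the getter then breaks compilation instead of failing at runtime.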
