RESTful client with JPA entities


I'm going to develop an application that uses a RESTful service, with JPA/Hibernate as the ORM.
I have used both technologies, but never in a single app.
Since the client has no state, it is meaningless to use stateful entities at the data or service layer. There are also bidirectional mappings.
I think the CASCADE option of JPA will not work; worse, it may destroy data when the client makes an update.
So what I'm thinking is to detach objects before serving them to the client, and:
If there is an update (PUT) request, pass just the parent object and update only the parent. So I suppose I cannot use the CASCADE option.
When it is a delete (DELETE) request, I have to do the CASCADE operations manually.
I also think that creating relationships between entities might be a problem.
Can anyone explain how to handle this scenario?
Is this approach correct?
Is there a best practice for a situation like this?
Thanks!

Do not mix your business entities into the web layer.
I would recommend decoupling your business layer from the web layer by creating new JAXB-annotated classes to return from your REST controller.
To make the work easier, there are many libraries that can copy one bean to another, for example Apache Commons BeanUtils.
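A minimal sketch of that decoupling in plain Java, with hand-written copying instead of BeanUtils. The names CustomerEntity, CustomerDto and CustomerMapper are hypothetical; in a real application the first class would carry JPA annotations and the second would carry JAXB or JSON annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical persistence-side class; in a real app this would be a JPA @Entity.
class CustomerEntity {
    Long id;
    String name;
    String passwordHash;                          // internal detail, must not reach the client
    List<String> orderNumbers = new ArrayList<>();
}

// Hypothetical web-side class returned by the REST controller.
class CustomerDto {
    final Long id;
    final String name;
    final int orderCount;

    CustomerDto(Long id, String name, int orderCount) {
        this.id = id;
        this.name = name;
        this.orderCount = orderCount;
    }
}

class CustomerMapper {
    // Copies only the fields the client is allowed to see.
    static CustomerDto toDto(CustomerEntity entity) {
        return new CustomerDto(entity.id, entity.name, entity.orderNumbers.size());
    }
}
```

Because the DTO has no reference back to the entity, the client never sees passwordHash, and later changes to the entity do not automatically change the REST payload.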

Related

JPA Spring Data entity to be used outside of transaction

I have a Spring Boot application with a service that returns a Spring Data entity that is exposed to a controller. The problem is that I know it's not a good idea to use entities outside of DB transactions, so what would be the best practices?
Consider the following service:
@Transactional
public MyData getMyData(Long id) {
    return myDataRepository.findById(id);
}
where MyData is a database @Entity and myDataRepository is a JpaRepository.
This service method is called from a controller class, that sends this object in JSON format to a client that calls this method.
@RequestMapping("/")
public ResponseEntity<?> getMyData(@RequestParam Long id) {
    return myService.getMyData(id);
}
If I expose MyData to a controller, then it will be exposed outside of a transaction and might cause all kinds of Hibernate errors. What are the best practices for these scenarios? Should I convert the entity to a POJO inside the service and return MyDataPOJO instead of MyData from MyService?

Using entities outside of transactions does not necessarily lead to problems; it may actually have valid use cases. However, there are quite a few variables at play, and once you let them out of your sight things can and will go south. Consider the following scenarios:
Your entity doesn't have any relationships to other entities or those relationships are pretty shallow and eagerly fetched. You retrieve that entity from repository, detach it from persistence unit (implicitly or explicitly) and pass to controller. Controller does not attempt to modify the entity; it only serializes it into JSON - totally safe.
Same as above but controller modifies the entity before serializing it into JSON - again, totally safe (just don't expect those changes to be reflected in DB)
Same as above, but you've forgotten to detach the entity from PU - ouch, if controller changes the entity you may either see it reflected in DB or get transaction closed exception; both most likely being unintended consequences.
Same as above, but some of entity's relationships are lazy. Again, you may or may not get any exceptions depending on whether these lazy properties are being accessed or not.
And there are so many more combinations of intentional and unintentional design choices...
As you can see, things can get out of control very quickly, especially when your model has to evolve: before long you're going to find yourself fiddling with JSON views, @JsonIgnore, entity projections and so on. Thus the rule of thumb: although it may seem tempting to cut some corners and expose your entities to external layers, it's rarely a good idea. A properly designed solution always has a clear separation of concerns between layers:
Persistence layer never exposes more methods or entities than required by business logic. Moreover, the same table(s) can and should be mapped into several different entities depending on the use cases they participate in.
Business logic layer (btw this is your API, not the REST services! see below) never leaks any details from persistence layer. Its methods clearly define use cases from the problem domain.
Presentation layer only translates API provided by business logic into one or another form suitable for client and never implements additional use cases. Keep in mind that REST controllers, SOAP services etc logically are all part of presentation layer, not business logic.
So yeah, the short answer is: persistence entities should not be exposed to external layers. One common technique is to use DTOs instead; besides, DTOs provide an additional abstraction layer in case you need to change your entities but leave the API intact, or vice versa. If at some point your DTOs happen to closely resemble your entities, there are Java bean mapping frameworks like Dozer, Orika, MapStruct, JMapper, ModelMapper etc. that help to eliminate the boilerplate code.
Try googling "hexagonal architecture". This is a very interesting concept for designing cleanly separated layers. Here's one of the articles on this subject https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/; it uses C# examples but they're pretty simple.

You should never leak the internal model to outside resources (in your case, the @RestController). The "POJO" you mentioned is typically called a DTO (Data Transfer Object). The DTO can be defined as an interface on the service side and implemented on the controller side. The service would then, as you described, transform the internal model into an instance of the DTO, achieving looser coupling between the controller and the service.
By defining the DTO interface on the service side, you have the additional benefit that you can optimize your persistence access by fetching only the data specified in the corresponding DTO interface. There is, for example, no need to fetch the friends of a User if the @Controller does not specifically request them, so you do not need to perform the additional JOIN in the database (provided you use a database).
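One way this interface-based arrangement could look, as a sketch in plain Java. All names here (UserSummary, UserRecord, UserLookupService, UserSummaryJson) are hypothetical, the in-memory map stands in for the database, and the hand-built JSON string does no escaping.

```java
import java.util.HashMap;
import java.util.Map;

// Declared next to the service: the only data the service promises to fill in.
interface UserSummary {
    void setId(long id);
    void setDisplayName(String displayName);
}

// Hypothetical internal model; never handed to the controller.
class UserRecord {
    final long id;
    final String firstName;
    final String lastName;

    UserRecord(long id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

class UserLookupService {
    private final Map<Long, UserRecord> store = new HashMap<>();

    UserLookupService() {
        store.put(1L, new UserRecord(1L, "Ada", "Lovelace"));
    }

    // Fills whatever implementation the caller supplies. Friends are not part of
    // UserSummary, so a real implementation would have no reason to join them.
    <T extends UserSummary> T loadSummary(long id, T target) {
        UserRecord user = store.get(id);
        if (user == null) {
            throw new IllegalArgumentException("No user with id " + id);
        }
        target.setId(user.id);
        target.setDisplayName(user.firstName + " " + user.lastName);
        return target;
    }
}

// Declared next to the controller: the concrete DTO that gets serialized.
class UserSummaryJson implements UserSummary {
    private long id;
    private String displayName;

    public void setId(long id) { this.id = id; }
    public void setDisplayName(String displayName) { this.displayName = displayName; }

    // Naive JSON rendering for illustration only (no escaping).
    String toJson() {
        return "{\"id\":" + id + ",\"displayName\":\"" + displayName + "\"}";
    }
}
```

The service depends only on the interface it owns, and the controller depends only on the service and its own DTO class, so neither side sees UserRecord from the other's point of view.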

Transaction Boundary and DTO conversion with JPA

I have been wondering how this anomaly should be handled:
DTO's should be converted in the controller, the service layer does not need to know about them.
Transaction boundaries are defined by the service layer.
But how do you avoid a JPA LazyInitializationException then? The DTO conversion might need lazily fetched data but cannot load it, because the transaction has already been completed by the service layer.
There are ways I can think of, but all of them are ugly. Putting the DTO conversion in the service layer seems the best to me now.

Yes, it is definitely better to manipulate DTOs in the service layer. This is especially true when updating entities with changes contained in DTOs, as otherwise you would need to get and update detached entities, pass them to the service, merge them again into the persistence context, etc.
"DTO's should be converted in the controller, the service layer does not need to know about them."
Instead of this, I would say the better rule of thumb is that controllers do not need to know about entities. But you can use detached entities instead of DTOs for simple cases to avoid creating lots of small DTO classes, although I personally always use DTOs just to be consistent and to make later changes easier.
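The difference between converting inside and outside the service can be illustrated without any JPA provider. The sketch below is a simulation under stated assumptions: FakeSession and LazyList are hypothetical stand-ins for a persistence session and a lazily loaded collection, and the IllegalStateException plays the role of LazyInitializationException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Stand-in for a persistence session; it only tracks open/closed.
class FakeSession {
    boolean open = true;
}

// Stand-in for a lazy collection: loads on first access, fails once the session is closed.
class LazyList<T> {
    private final FakeSession session;
    private final Supplier<List<T>> loader;
    private List<T> loaded;

    LazyList(FakeSession session, Supplier<List<T>> loader) {
        this.session = session;
        this.loader = loader;
    }

    List<T> get() {
        if (loaded == null) {
            if (!session.open) {
                throw new IllegalStateException("could not initialize - no session");
            }
            loaded = loader.get();
        }
        return loaded;
    }
}

class OrderEntity {
    final long id;
    final LazyList<String> lines;

    OrderEntity(long id, LazyList<String> lines) {
        this.id = id;
        this.lines = lines;
    }
}

class OrderDto {
    final long id;
    final List<String> lines;

    OrderDto(long id, List<String> lines) {
        this.id = id;
        this.lines = lines;
    }
}

class OrderLookupService {
    // Stand-in for a repository call; the lines are not loaded yet.
    private OrderEntity load(long id, FakeSession session) {
        return new OrderEntity(id,
            new LazyList<String>(session, () -> Arrays.asList("line-1", "line-2")));
    }

    // Returns the entity; the session closes before the caller can touch the lines.
    OrderEntity findEntity(long id) {
        FakeSession session = new FakeSession();
        try {
            return load(id, session);
        } finally {
            session.open = false;
        }
    }

    // Converts to a DTO while the session is still open, so the lazy data is loaded in time.
    OrderDto findDto(long id) {
        FakeSession session = new FakeSession();
        try {
            OrderEntity entity = load(id, session);
            return new OrderDto(entity.id, new ArrayList<>(entity.lines.get()));
        } finally {
            session.open = false;
        }
    }
}
```

A caller of findDto can read the lines freely, while a caller of findEntity that touches lines.get() hits the simulated "no session" failure, which mirrors the situation described in the question.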

A raised 'LazyInitializationException' is just a signal that some parts of the data were not loaded, so the best solution is to make multiple calls from the controller method to the service level and fetch all the fields required for the DTO.
Less elegant options are:
It's possible to detect fields that were not loaded via the 'org.hibernate.Hibernate.isInitialized' method and skip them while building the DTO; see the full sample here:
How to test whether lazy loaded JPA collection is initialized?
You can mark the controller method as transactional; the Hibernate session will then still be open after the call to the service level, so lazy loading will work.

DTOs are the model you should be working against from a layer above services. Only the service should know about the entity model. In simple degenerate cases the DTO model might look almost like the entity model which is why many people will just use the entity model. This works well until people get real requirements that will force them to change the way they use data. This is when the illusion that DTO = Entity falls apart.
A DTO is often a subset or a transformation of the entity model. The point about the LazyInitializationException is a perfect example of when the illusion starts to crumble.
A service should return fully initialized DTOs, i.e. not just some object that delegates to entity objects. There shouldn't be any lazy loading involved after a DTO has been returned from a service. This means that you have to fetch exactly the state required for a DTO and wire that data into the objects to be returned. Since that usually requires quite some boilerplate code and will sometimes result in having to duplicate logic, people tend to stick even longer to the DTO = Entity illusion by sprinkling some fetch joins here and there to make the LazyInitializationExceptions go away.
This is why I started the Blaze-Persistence Entity Views project which will give you the best of both worlds. Ease of use with less boilerplate, good performance and a safe model to avoid accidental errors. Maybe you want to give it a shot to see what it can do for you?

Hibernate FetchType.LAZY without session

I have many issues with LazyInitializationException in a Spring web application whenever I try to access fields that are annotated with FetchType.LAZY.
There is no session configured in Spring because the requirement is that the API should be stateless.
All the service layer methods have @Transactional annotations properly set.
However, when I try to access the lazy fields on any domain object, I get the famous LazyInitializationException (...) could not initialize proxy - no Session
I thought that Hibernate would automatically load the lazy fields when needed when I'm in a @Transactional method, but it appears it doesn't.
I have spent several days looking for answers but nothing fits my needs. I found that Spring could be configured with OpenSessionInViewFilter, but it appears to cause many issues; moreover, I don't have any session.
How can I automatically load lazy fields in @Transactional-annotated service methods with such a stateless API?
I'm sure I'm missing something obvious here, but I'm not very familiar with Spring and Hibernate.
Please tell me if there is any missing information in my question that I should give you.

LazyInitializationExceptions are a code smell in the same way that EAGER fetching is.
First of all, the fetching policy should be query-based, on a per-business-case basis. The DAO layer is solely responsible for fetching the right associations, so:
You should use the FETCH directive for all many-to-one associations and at most one one-to-many association. If you try to JOIN FETCH more than one one-to-many association, you'll get a Cartesian product and your application performance will be affected.
If you need to fetch multiple collections, then a multi-level fetching is more appropriate.
You should ask yourself why you want to return entities from the DAO layer. Using DTOs is a much better alternative, as it reduces the amount of data that's fetched from the DB and doesn't leak the entity abstraction into the UI layer.
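A back-of-the-envelope sketch of why join-fetching two collections in one query multiplies rows. This is plain Java with no database involved; FetchRowCount and the comment/tag data are hypothetical, and both collections are assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FetchRowCount {
    // Rows returned for ONE parent when two collections are join-fetched together:
    // every element of the first collection is paired with every element of the second.
    static List<String> joinFetchBoth(List<String> comments, List<String> tags) {
        List<String> rows = new ArrayList<>();
        for (String comment : comments) {
            for (String tag : tags) {
                rows.add(comment + "|" + tag);
            }
        }
        return rows;
    }

    // Rows returned when each collection is fetched by its own query.
    static int fetchSeparately(List<String> comments, List<String> tags) {
        return comments.size() + tags.size();
    }

    public static void main(String[] args) {
        List<String> comments = Arrays.asList("c1", "c2", "c3");
        List<String> tags = Arrays.asList("t1", "t2", "t3", "t4");
        System.out.println("joined rows:   " + joinFetchBoth(comments, tags).size());
        System.out.println("separate rows: " + fetchSeparately(comments, tags));
    }
}
```

With 3 comments and 4 tags the joined result has 3 x 4 = 12 rows for a single parent, against 3 + 4 = 7 rows when the collections are fetched separately, and the gap widens quickly as the collections grow.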

How does the Integration Tier interface with the Business Tier?

I need some advice on designing an "Integration Tier" of an N-Tiered system in Java. This tier is responsible for persisting and retrieving data for the "Business Tier" (located on a separate server). I'm new to J2EE and I've read a few books and blogs. The alphabet soup of technology acronyms is confusing me so I have a handful of questions.
First, what I have so far: I'm using JPA (via Hibernate) for persisting and retrieving data to a database. I made my data access objects EJBs and plan on deploying to an application server (JBoss) which makes transactions easier (they're at the function level of my DAOs) and I don't have to worry about getting a handle to an EntityManager (dependency injection). Here's an example of what things look like:
@Entity
class A {
    @Id
    Long id;
    @OneToMany
    List<B> setOfBs = new ArrayList<B>();
}
@Entity
class B {
    @Id
    Long id;
}
@Remote
public interface ADAO {
    public A getAById(Long id);
}
@Stateless
class ADAOImpl implements ADAO {
    @PersistenceContext
    EntityManager em;
    public A getAById(Long id) { ... }
}
My question: how should the Business Tier exchange data with the Integration Tier? I've read up on RESTful services, and they seem simple enough. My concern is performance when the frequency of gets and sets increases (HTTP communication doesn't seem particularly fast). Another option is RMI. My DAOs are already EJBs. Could I just have the Business Tier access them directly (via JNDI)? If so, what happens if the @OneToMany link in the example above is lazily loaded?
For example if the Business Tier does something like the following:
Context context = new InitialContext(propertiesForIntegrationTierLookup);
ADAO aDao = (ADAO) context.lookup("something");
A myA = aDao.getAById(0L);
int numberOfBs = myA.setOfBs.size();
If the setOfBs list is loaded lazily, when the Business Tier (on a separate server) accesses the list, is the size correct? Does the list somehow get loaded correctly through the magic of EJBs? If not (which I expect), what's the solution?
Sorry for the long post. Like I said I'm new to J2EE and I've read enough to get the general idea, but I need help on fitting the pieces together.

When you call size() on a lazy collection, it gets initialized, so you'll always get the correct size no matter which interface you're using, Remote or Local.
Another situation is when you're trying to use JPA classes as data transfer objects (DTOs) and request them via a Remote interface. I don't remember any lazy initialization issues here, because prior to transmission all objects have to be serialized (with lazy collections initialized) on the server side. As a result, the whole object graph is passed over the network, which might cause serious CPU and network overhead. In addition, for deserialization to be possible, you will have to share the JPA classes with the remote app. And that's where and how the 'EJB magic' ends :)
So, once remote calls are possible, I'd suggest starting to think about a data transfer strategy and non-JPA data transfer objects as an additional data layer. In my case, I've annotated DTO classes for XML binding (JAXB) and reused them in web services.
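The overhead of shipping a whole object graph can be made visible with standard Java serialization alone. This is a rough sketch, not a measurement of any EJB container: ParentA, ChildB, ParentASummary and WireSize are hypothetical plain classes, and the byte counts depend on the JVM's serialization format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ChildB implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final String payload;

    ChildB(long id, String payload) {
        this.id = id;
        this.payload = payload;
    }
}

// Plain stand-in for an entity with a fully initialized collection.
class ParentA implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final List<ChildB> setOfBs = new ArrayList<>();

    ParentA(long id) {
        this.id = id;
    }
}

// Purpose-built transfer object: carries only what the caller asked for.
class ParentASummary implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;
    final int numberOfBs;

    ParentASummary(long id, int numberOfBs) {
        this.id = id;
        this.numberOfBs = numberOfBs;
    }
}

class WireSize {
    // Number of bytes Java serialization produces for the given object graph.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}
```

Serializing a ParentA with a thousand children writes every child to the stream, whereas the summary stays a few dozen bytes regardless of how many children exist, which is the case for a dedicated transfer object when the caller only needs the count.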

Short answer: If you are using an "Integration Layer" approach, the things you should be integrating should be loosely coupled services, following SOA principles.
This means you should not be allowing remote calls to methods on entities that could be making calls to the framework under the lid on another server. If you do this, you are really building a tightly coupled distributed application and you will have to worry about the lazy loading problems and the scope of the persistence context. If you want that, you might like to consider extended persistence contexts http://docs.jboss.org/ejb3/docs/tutorial/extended_pc/extended.html.
You have talked about a "business tier", but JPA does not provide a business tier. It provides entities and allows CRUD operations, but these are typically not business operations. A "RegisterUser" operation is not simply a question of persisting a "User" entity. Your DAO layer may offer a higher level of operation, but DAOs are typically used to put a thin layer over the database, and that layer is still very data-centric.
A better approach is to define business service type operations and make those the services that you expose. You might want another layer on top of your DAO or you might want to have one layer (convert your DAO layer).
Your business layer should call flush and handle any JPA exceptions and hide all of that from the caller.
The issue of how to transfer your data remains. In many cases the parameters of your business service requests will be similar to your JPA entities, but I think you will notice that often there are sufficient differences that you want to define new DTOs. For example, a "RegisterUser" business operation might update both the "User" and "EmailAddresses" tables. The User table might include a "createdDate" property which is not part of the "RegisterUser" operation, but is set to the current date.
For creating DTOs, you might like to look at Project Lombok.
To copy the DTO to the Entity, you can use Apache Commons BeanUtils (e.g., PropertyUtils.copyProperties) to do a lot of the leg work, which works if the property names are the same.
Personally, I don't see the point in XML in this case, unless you want to totally decouple your implementations.
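The "RegisterUser" example above could be sketched roughly as follows. Everything here is hypothetical: the lists stand in for the User and EmailAddresses tables, and the current time is passed in as a parameter only so the behaviour is easy to check.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// DTO for the business operation: only what the caller is allowed to supply.
class RegisterUserRequest {
    final String username;
    final String email;

    RegisterUserRequest(String username, String email) {
        this.username = username;
        this.email = email;
    }
}

// Stand-in for a row in the User table.
class UserRow {
    final String username;
    final Instant createdDate;   // set by the service, never by the caller

    UserRow(String username, Instant createdDate) {
        this.username = username;
        this.createdDate = createdDate;
    }
}

// Stand-in for a row in the EmailAddresses table.
class EmailAddressRow {
    final String username;
    final String address;

    EmailAddressRow(String username, String address) {
        this.username = username;
        this.address = address;
    }
}

class RegistrationService {
    final List<UserRow> users = new ArrayList<>();
    final List<EmailAddressRow> emailAddresses = new ArrayList<>();

    // One business operation touches two "tables" and fills in createdDate itself.
    UserRow registerUser(RegisterUserRequest request, Instant now) {
        UserRow user = new UserRow(request.username, now);
        users.add(user);
        emailAddresses.add(new EmailAddressRow(request.username, request.email));
        return user;
    }
}
```

The request DTO deliberately has no createdDate field, so the difference between the operation's input and the stored entity is visible in the types.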

Open Session In View Pattern

I'm asking this question given my chosen development frameworks of JPA (Hibernate implementation of), Spring, and <insert MVC framework here - Struts 1, Struts 2, Spring MVC, Stripes...>.
I've been thinking a bit about relationships in my entity layer - for example I have an order entity that has many order lines. I've set up my app so that it eagerly loads the order lines for every order. Do you think this is a lazy way to get around the lazy initialization problems that I would come across if I were to set the fetch strategy to lazy?
The way I see it, I have the following alternatives when retrieving entities and their associations:
Use the Open Session In View pattern to create the session on each request and commit the transaction before returning the response.
Implement a DTO (Data Transfer Object) layer such that every DAO query I execute returns the correctly initialized DTO for my purposes. I don't really like this option much because in my experience I've found that it creates a lot of boilerplate copying code and becomes messy to maintain.
Don't map any associations in JPA so that every query I execute returns only the entities I'm interested in - this will probably require me to have DTOs anyway and will be a pain to maintain and I think defeats the purpose of having an ORM in the first place.
Eagerly fetch all (or most associations) - in the example above, always fetch all order lines when I retrieve an order.
So my question is, when and under what circumstances would you use which of these options? Do you always stick with one way of doing it?
I would ask a colleague but I think that if I even mentioned the term 'Open Session in View' I would be greeted with blank stares :( What I'm really looking for here is some advice from a senior or very experienced developer.
Thanks guys!

Open Session in View has some problems.
For example, if the transaction fails, you might find out too late, at commit time, once you are nearly done rendering your page (possibly with the response already committed, so you can't change the page!). If you had known about that error earlier, you would have followed a different flow and ended up rendering a different page.
Another example: reading data on demand might lead to many "N+1 select" problems, which kill your performance.
Many projects use the following path:
Maintain transactions at the business layer ; load at that point everything you are supposed to need.
The presentation layer runs the risk of lazy initialization exceptions: each one is considered a programming error, caught during tests, and corrected by loading more data in the business layer (where you have the opportunity to do it efficiently, avoiding "N+1 select" problems).
To avoid creating extra classes for DTOs, you can load the data inside the entity objects themselves. This is the whole point of the POJO approach (used by modern data-access layers, and even integration technologies like Spring).
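To make the "N+1 select" problem mentioned above concrete, here is a small simulation that counts queries. CountingOrderStore is a hypothetical in-memory stand-in for a database, so the numbers only illustrate the pattern, not real query cost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database that counts how many "queries" it serves.
class CountingOrderStore {
    int queryCount = 0;
    private final Map<Long, List<String>> linesByOrder = new LinkedHashMap<>();

    CountingOrderStore(int orders) {
        for (long id = 1; id <= orders; id++) {
            linesByOrder.put(id, Arrays.asList("line-a-" + id, "line-b-" + id));
        }
    }

    // One query for the list of orders.
    List<Long> findOrderIds() {
        queryCount++;
        return new ArrayList<>(linesByOrder.keySet());
    }

    // One query per order: what on-demand lazy loading amounts to.
    List<String> findLines(long orderId) {
        queryCount++;
        return linesByOrder.get(orderId);
    }

    // One query for the lines of many orders at once.
    Map<Long, List<String>> findLinesForOrders(Collection<Long> orderIds) {
        queryCount++;
        Map<Long, List<String>> result = new LinkedHashMap<>();
        for (Long id : orderIds) {
            result.put(id, linesByOrder.get(id));
        }
        return result;
    }
}

class OrderLoading {
    // 1 query for the orders + N queries for their lines.
    static int countLinesLazily(CountingOrderStore store) {
        int total = 0;
        for (Long id : store.findOrderIds()) {
            total += store.findLines(id).size();
        }
        return total;
    }

    // 1 query for the orders + 1 query for all lines.
    static int countLinesBatched(CountingOrderStore store) {
        int total = 0;
        for (List<String> lines : store.findLinesForOrders(store.findOrderIds()).values()) {
            total += lines.size();
        }
        return total;
    }
}
```

For 10 orders the lazy variant issues 11 queries and the batched variant issues 2, while both return the same 20 lines; loading "everything you are supposed to need" in the business layer corresponds to the second variant.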

I've successfully solved all my lazy initialization problems with the Open Session In View pattern (i.e. the Spring implementation). The technologies I used were exactly the same as yours.
Using this pattern allows me to fully map the entity relationships and not worry about fetching child entities in the dao. Mostly. In 90% of the cases the pattern solves the lazy initialization needs in the view. In some cases you'll have to "manually" initialize relationships. These cases were rare and always involved very very complex mappings in my case.
When using Open Entity Manager In View pattern it's important to define the entity relationships and especially propagation and transactional settings correctly. If these are not configured properly, there will be errors related to closed sessions when some entity is lazily initialized in the view and it fails due to the session having been closed already.
I would definitely go with option 1. Option 2 might be needed sometimes, but I see absolutely no reason to use option 3. Option 4 is also a no-no. Eagerly fetching everything kills the performance of any view that needs to list just a few properties of some parent entities (orders in this case).
N+1 Selects
During development there will be N+1 selects as a result of initializing some relationships in the view. But this is not a reason to discard the pattern. Just fix these problems as they arise, before delivering the code to production. It's as easy to fix these problems with the OEMIV pattern as it is with any other pattern: add the proper DAO or service methods, fix the controller to call a different finder method, maybe add a view to the database, etc.

I have successfully used the Open-Session-in-View pattern on a project. However, I recently read in "Spring In Practice" of an interesting potential problem with non-repeatable reads if you manage your transactions at a lower layer while keeping the Hibernate session open in the view layer.
We managed most of our transactions in the service layer, but kept the hibernate session open in the view layer. This meant that lazy reads in the view were resulting in separate read transactions.
We managed our transactions in our service layer to minimize transaction duration. For instance, some of our service calls resulted in both a database transaction and a web service call to an external service. We did not want our transaction to be open while waiting for a web service call to respond.
As our system never went into production, I am not sure if there were any real problems with it, but I suspect that there was the potential for the view to attempt to lazily load an object that has been deleted by someone else.

There are some benefits of DTO approach though. You have to think beforehand what information you need. In some cases this will prevent you from generating n+1 select statements. It helps also to see where to use eager fetching and/or optimized views.

I'll also throw my weight behind the Open-Session-in-View pattern, having been in the exact same boat before.
I work with Stripes without Spring, and have created a manual filter before that tends to work well. Coding transaction logic on the backend turns messy really quickly, as you've mentioned. Eagerly fetching everything becomes TERRIBLE as you map more and more objects to each other.
One thing I want to add that you may not have come across is Stripersist and Stripernate - Stripersist being the more JPA flavor - auto-hydration filters that take a lot of the work off your shoulders.
With Stripersist you can say things like /appContextRoot/actions/view/3 and it will auto-hydrate the JPA Entity on the ActionBean with id of 3 before the event is executed.
Stripersist is in the stripes-stuff package on sourceforge. I now use this for all new projects, as it's clean and easily supports multiple datasources if necessary.

Do the Order and Order Lines make up a high volume of data? Do they take part in online processes where real-time response is required? If so, you might consider not using eager fetching - it does make a huge difference in performance. If the amount of data is small, there is no problem with eager fetching.
About using DTOs, it might be a viable implementation.
If your business layer is used internally by your own application (e.g. a small web app and its business logic), it'd probably be best to use your own entities in your view with the Open Session In View pattern, since it's simpler.
If your entities are used by many applications (e.g. a backend application providing a service in your corporation), it would be worth using DTOs, since you would not expose your model to your clients. Exposing it could make refactoring your model harder, because a change could break contracts with your clients. A DTO makes that easier since you have another layer of abstraction. This can seem a bit strange, since EJB3 would theoretically eliminate the need for DTOs.
