I have a Spring Boot application with a service that returns a Spring Data entity that is exposed to a controller. The problem is that I know it's not a good idea to use entities outside of DB transactions, so what would be the best practices?
Consider the following service:
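@Transactional
public MyData getMyData(Long id) {
    // findById returns an Optional in current Spring Data versions; fail if the row is missing (Java 10+)
    return myDataRepository.findById(id).orElseThrow();
}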
where MyData is a database @Entity and myDataRepository is a JpaRepository.
This service method is called from a controller class, which sends the object in JSON format to the client that called it.
@RequestMapping("/")
public ResponseEntity<?> getMyData(@RequestParam Long id) {
    // wrap the result so the return value matches ResponseEntity<?>
    return ResponseEntity.ok(myService.getMyData(id));
}
If I expose MyData to a controller, it will be used outside of a transaction and might cause all kinds of Hibernate errors. What are the best practices for these scenarios? Should I convert the entity to a POJO inside the service and return MyDataPOJO instead of MyData from MyService?
Using entities outside of transactions does not necessarily lead to problems; it may actually have valid use cases. However, there are quite a few variables at play, and once you let them out of your sight things may and will go south. Consider the following scenarios:
Your entity doesn't have any relationships to other entities or those relationships are pretty shallow and eagerly fetched. You retrieve that entity from repository, detach it from persistence unit (implicitly or explicitly) and pass to controller. Controller does not attempt to modify the entity; it only serializes it into JSON - totally safe.
Same as above but controller modifies the entity before serializing it into JSON - again, totally safe (just don't expect those changes to be reflected in DB)
Same as above, but you've forgotten to detach the entity from PU - ouch, if controller changes the entity you may either see it reflected in DB or get transaction closed exception; both most likely being unintended consequences.
Same as above, but some of entity's relationships are lazy. Again, you may or may not get any exceptions depending on whether these lazy properties are being accessed or not.
And there are so many more combinations of intentional and unintentional design choices...
As you can see, things can get out of control very quickly, especially when your model has to evolve: before long you're going to find yourself fiddling with JSON views, @JsonIgnore, entity projections and so on. Thus the rule of thumb: although it may seem tempting to cut some corners and expose your entities to external layers, it's rarely a good idea. A properly designed solution always has a clear separation of concerns between layers:
The persistence layer never exposes more methods or entities than required by business logic. Moreover, the same table(s) can and should be mapped into several different entities depending on the use cases they participate in.
Business logic layer (btw this is your API, not the REST services! see below) never leaks any details from persistence layer. Its methods clearly define use cases from the problem domain.
Presentation layer only translates API provided by business logic into one or another form suitable for client and never implements additional use cases. Keep in mind that REST controllers, SOAP services etc logically are all part of presentation layer, not business logic.
So yeah, the short answer is: persistence entities should not be exposed to external layers. One common technique is to use DTOs instead; besides, DTO objects provide additional abstraction layer in case you need to change your entities but leave API intact or vice versa. If at some point your DTOs happen to closely resemble your entities, there are Java bean mapping frameworks like Dozer, Orika, MapStruct, JMapper, ModelMapper etc that help to eliminate the boilerplate code.
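As a rough illustration of the DTO technique, here is a minimal hand-written mapping sketch. The fields (name, internalNote), MyDataDto and MyDataMapper are hypothetical names invented for this example, and MyData is a plain stand-in for the entity from the question with its JPA annotations left out so the snippet compiles on its own. A mapping framework such as MapStruct could generate the equivalent of toDto.

```java
import java.util.Objects;

// Plain stand-in for the MyData entity from the question (JPA annotations omitted).
class MyData {
    private final Long id;
    private final String name;          // hypothetical field
    private final String internalNote;  // hypothetical field that must not reach clients

    MyData(Long id, String name, String internalNote) {
        this.id = id;
        this.name = name;
        this.internalNote = internalNote;
    }

    Long getId() { return id; }
    String getName() { return name; }
    String getInternalNote() { return internalNote; }
}

// Immutable DTO: carries only what the client is supposed to see.
final class MyDataDto {
    private final Long id;
    private final String name;

    MyDataDto(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() { return id; }
    String getName() { return name; }
}

// Hand-written mapper, meant to be called inside the transactional service method.
final class MyDataMapper {
    private MyDataMapper() { }

    static MyDataDto toDto(MyData entity) {
        Objects.requireNonNull(entity, "entity");
        // internalNote is deliberately not copied
        return new MyDataDto(entity.getId(), entity.getName());
    }
}
```

With this in place the service would return MyDataMapper.toDto(entity) instead of the entity, so the controller never touches a managed or detached entity.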
Try googling "hexagonal architecture". This is a very interesting concept for designing cleanly separated layers. Here's one of the articles on this subject https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/; it uses C# examples but they're pretty simple.
You should never leak the internal model to outside resources (in your case, the @RestController). The "POJO" you mentioned is typically called a DTO (Data Transfer Object). The DTO can be defined as an interface on the service side and implemented on the controller side. The service would then, as you described, transform the internal model into an instance of the DTO, achieving looser coupling between the controller and the service.
By defining the DTO interface on the service side, you have the additional benefit that you can optimize your persistence access by fetching only the data specified in the corresponding DTO interface. There is, for example, no need to fetch the friends of a User if the @Controller does not specifically request them, so you do not need to perform the additional JOIN in the database (provided you use a database).
I have been wondering how this anomaly should be handled:
DTO's should be converted in the controller, the service layer does not need to know about them.
Transaction boundaries are defined by the service layer.
But how do you avoid a JPA LazyInitialization exception then? The DTO conversion might need Lazy Fetched data but is unable to as the transaction was handled by the service layer.
There are ways I can think of, but all of them are ugly. Putting the DTO conversion in the service layer seems the best to me now.
Yes, definitely it is better to manipulate DTOs in the service layer. This is especially true when updating entities with changes contained in DTOs, as otherwise you would need to get and update detached entities, pass them to service, merge them again into the persistence context, etc.
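The following is a plain-Java simulation of why converting inside the service avoids the exception; no Hibernate is involved. LazyTags, ArticleEntity, ArticleDto and ArticleService are hypothetical names, and the IllegalStateException merely stands in for LazyInitializationException. It is a sketch of the idea, not of how Hibernate implements lazy loading.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simulates a lazy association: readable only while the "session" is open.
class LazyTags {
    private final List<String> values;
    private boolean sessionOpen = true;

    LazyTags(List<String> values) { this.values = new ArrayList<>(values); }

    void closeSession() { sessionOpen = false; }

    List<String> get() {
        if (!sessionOpen) {
            // stand-in for Hibernate's LazyInitializationException
            throw new IllegalStateException("no session");
        }
        return values;
    }
}

// Stand-in for an entity with one lazy collection.
class ArticleEntity {
    final String title;
    final LazyTags tags;

    ArticleEntity(String title, LazyTags tags) {
        this.title = title;
        this.tags = tags;
    }
}

// Fully initialized DTO: it copies the data instead of delegating to the entity.
final class ArticleDto {
    final String title;
    final List<String> tags;

    ArticleDto(String title, List<String> tags) {
        this.title = title;
        this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
    }
}

class ArticleService {
    // Plays the role of a @Transactional method: the DTO is built before the session closes.
    ArticleDto load(ArticleEntity entity) {
        try {
            return new ArticleDto(entity.title, entity.tags.get());
        } finally {
            entity.tags.closeSession(); // transaction boundary
        }
    }
}
```

After load returns, the DTO can be serialized freely, while touching entity.tags.get() from the controller would fail.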
"DTO's should be converted in the controller, the service layer does not need to know about them."
Instead of this, I would say the better rule of thumb is that controllers do not need to know about entities. But you can use detached entities instead of DTOs for simple cases to avoid creating lots of small DTO classes, although I personally always use DTOs just to be consistent and to make later changes easier.
A raised 'LazyInitializationException' is just a signal that some parts of the data were not loaded, so the best solution is to make multiple calls from the controller method to the service level and fetch all fields required for the DTO.
Less elegant options are:
It's possible to detect fields that were not loaded via the 'org.hibernate.Hibernate.isInitialized' method and skip them while building the DTO; see the full sample here:
How to test whether lazy loaded JPA collection is initialized?
You can mark the controller method as transactional; the Hibernate session will then stay open after the call to the service level, so lazy loading will work.
DTOs are the model you should be working against from a layer above services. Only the service should know about the entity model. In simple degenerate cases the DTO model might look almost like the entity model which is why many people will just use the entity model. This works well until people get real requirements that will force them to change the way they use data. This is when the illusion that DTO = Entity falls apart.
A DTO is often a subset or a transformation of the entity model. The point about the LazyInitializationException is a perfect example of when the illusion starts to crumble.
A service should return fully initialized DTOs, i.e. not just some object that delegates to entity objects. There shouldn't be any lazy loading involved after a DTO has been returned from a service. This means that you have to fetch exactly the state required for a DTO and wire that data into the objects to be returned. Since that usually requires quite some boilerplate code and will sometimes result in having to duplicate logic, people tend to stick even longer to the DTO = Entity illusion by sprinkling some fetch joins here and there to make the LazyInitializationExceptions go away.
This is why I started the Blaze-Persistence Entity Views project which will give you the best of both worlds. Ease of use with less boilerplate, good performance and a safe model to avoid accidental errors. Maybe you want to give it a shot to see what it can do for you?
I am pretty new to Hibernate and I am having trouble understanding some simple concepts. I have understood that @Repository is used by Spring for accessing objects. Also, Hibernate uses @Entity to denote entities which are mapped to database tables. I was just wondering if a single class can be annotated with both @Repository and @Entity, as they more or less imply the same thing.
NO.
Hibernate entities are managed by the Hibernate ORM framework; they (and their proxies) are created by Hibernate when you access them via get() or load(). They have a completely different (and complex) lifecycle than Spring beans (they can be attached, detached, proxied, or pending removal).
Spring repositories are singletons, managed by Spring framework. Typically they exist as long as the container instance exists. New Hibernate sessions may be opened and closed, new user sessions engaged and then expired, but there still will be the same singleton instances of repositories.
Please see http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/objectstate.html#objectstate-overview for the possible hibernate object states.
As for the repository instances - typically they are stateless, as they are services.
RE: they more or less imply the same. No they are not the same. There was an old joke
How many C++ programmers does it take to change a light bulb? You're
still thinking procedurally. A properly designed light bulb object
would inherit a change method from a generic light bulb class, so all
you would have to do is send a light-bulb-change message.
But good OOP programmers do not think that way. According to the single responsibility principle, objects should have a single reason to change. A repository works with the infrastructure and has nothing to do with the business rules. Infrastructure may change (you may need, for example, to store your objects in XML instead of an RDBMS), but this should not affect the classes encapsulating the state of business objects.
You can possibly mitigate this problem by making a reference from the entity class to an abstract repository interface(implement an infamous Active Record pattern - it will be like referencing some abstract bulb socket from the bulb, this does not seem to be a good solution because bulb sockets and bulbs have different lifecycles).
That is where the High Cohesion principle comes into play, according to which it's just illogical for an object whose role is to reflect the abstractions from the model to perform completely unrelated things like persistence or transmission over the network. It would be weird for a Student class to have print(), saveToXml() or transmitByHttp() methods.
They don't imply the same thing at all.
@Entity
An @Entity is something that represents a "thing" in your business domain. It could be anything: a Customer, an Elephant, a Product... It will have attributes that get persisted to the database as well as methods relating to those attributes (at least it should, unless it's an anaemic entity, but that's an anti-pattern. Later, when you're comfortable with the basics, check out Spring's @Configurable annotation, which allows you to provide collaborators to your entity).
@Repository
A @Repository, on the other hand, provides an interface for retrieving and storing those entities.
There are some frameworks, especially in other languages, that combine the persistence and entity attributes on the same object, however this is not common in Java/Hibernate/Spring.
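A minimal framework-free sketch of that split, reusing the Student example mentioned above: the entity holds state and the behaviour tied to it, while the repository only stores and retrieves. StudentRepository and InMemoryStudentRepository are hypothetical names, and the map-backed implementation merely stands in for what Spring Data or Hibernate would provide.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Domain object: state plus behaviour related to that state, no persistence code.
class Student {
    private final long id;
    private String name;

    Student(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }

    // Business rule lives with the entity.
    void rename(String newName) {
        if (newName == null || newName.trim().isEmpty()) {
            throw new IllegalArgumentException("name must not be blank");
        }
        this.name = newName.trim();
    }
}

// Repository: a stateless service for storing and retrieving students.
interface StudentRepository {
    void save(Student student);
    Optional<Student> findById(long id);
}

// In-memory stand-in; a real implementation would talk to the database.
class InMemoryStudentRepository implements StudentRepository {
    private final Map<Long, Student> store = new HashMap<>();

    @Override
    public void save(Student student) { store.put(student.getId(), student); }

    @Override
    public Optional<Student> findById(long id) { return Optional.ofNullable(store.get(id)); }
}
```

Swapping the storage (XML, RDBMS, memory) then means writing another StudentRepository, with Student left untouched.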
I have looked up a lot of information about the DAO pattern and I get the point of it. But I feel like most explanations aren't telling the whole story, and by that I mean where you would actually use your DAO. So for example, if I have a User class and a corresponding UserDAO that is able to save and restore users for me, which is the correct way:
The controller creates the User object and passes it to the UserDAO to save it to the database
The controller creates the User object and in its constructor the user object makes a call to the userDAO in order to save itself into the database
This is a code smell and you are missing an extra class "UserManager" which the controller will ask to create the user. The UserManager is responsible for creating the user and asking the UserDAO to save it
I really feel like the third option is the best, because all that the controller is responsible for is delegating the request to the correct model object.
What is your favorite way? Am I missing something here ?
From my experience with DAOs, the first approach is the only correct one. The reason is that it has the clearest responsibilities and produces the least clutter (well, some very respectable programmers regard DAOs themselves as clutter. Adam Bien sees the original DAO pattern already implemented in the EntityManager and further DAOs to be mostly unnecessary "pipes")
Approach 2 binds the model to the DAO, creating an "upstream dependency". What I mean is that usually the models are distributed as separate packages and are (and should be) ignorant of the details of their persistence. A similar pattern to what you are describing is the Active Record pattern. It is widely used in Ruby on Rails but has not been implemented with equal elegance and simplicity in Java.
Approach 3 - what is supposed to be the point of the UserManager? In your example the Manager performs 2 tasks - it has the duties of a User factory and is a proxy for persistence requests. If it is a factory and you need one, you should name it UserFactory without imposing additional tasks on it. As for the proxy - why should you need it?
IMHO most classes named ...Manager have a smell. The name itself suggests that the class has no clear purpose. Whenever I have an urge to name a class ...Manager, it's a signal for me to find a better fitting name or to think hard about my architecture.
For the first approach; IMHO, controller calling a method on a DAO object is not a good design. Controllers must be asking "service" level objects about business. How these "services" persist the data is not a concern for the controller.
For the second approach; sometimes you may want to just create the object, so constructor duty and persisting duty must not be tightly coupled like this.
Lastly, the manager or the service objects is a good abstraction for the layered architecture. This way you can group the business flows in the appropriate classes and methods.
But for Play, companion objects of case classes are also a good candidate to use as DAO. The singleton nature of these objects make it a good candidate.
case class TicketResponse(appId: String, ticket: String, ts: String)

object TicketResponse {
  implicit val ticketWrites = Json.writes[TicketResponse]

  def save(response: TicketResponse) = {
    val result = DB.withConnection { implicit connection =>
      SQL("insert into tickets(ticket, appid, ts)"
        + " values ({ticket},{appid},{ts})")
        .on('ticket -> response.ticket, 'appid -> response.appId, 'ts -> response.ts).executeInsert()
    }
  }
}
The Data Access Object (DAO) should be used closer to the data access layer of your application.
The data access object actually does the data access activities. So it is part of data access layer.
The architecture layers before DAO could vary in projects.
Controllers are basically for controlling the request flow. So they are kind of close to UI.
Although, a Manager, Handler is a bad idea, we could still add a layer between controller and DAO. So controller will pre-process the data that is coming from a request or going out (data sanity, security, localization, i18n, transform to JSON, etc). It sends data to service in the form of domain objects (User in this case). The service will invoke some business logic on this user or use it for some business logic. And it would then pass it to DAO.
Having the business logic in controller layer is not good if you are supporting multiple clients like JSPs, WebServices, handheld devices, etc.
Assuming Controller means the "C" in MVC, your third option is the right approach. Generally speaking Controller code extends or follows the conventions of a framework. One of the ideals of MVC is swapping frameworks, which is really the Controller, should be relatively easy. Controllers should just move data back and forth between the model and view layers.
From a model perspective, Controllers should interact with a service layer - a contextual boundary - in sitting front of the domain model. The UserManager object would be an example of a piece that you would consider part of your service layer - that is the domain model's public API.
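A small framework-free sketch of that layering, using the User example from the question. UserService plays the role of the "UserManager", and UserController, handleRegister and the hand-built JSON string are hypothetical; a real controller would use the framework's request mapping and JSON serializer.

```java
// Domain object.
class User {
    final String name;

    User(String name) { this.name = name; }
}

// Data access layer: only knows how to persist.
interface UserDao {
    void save(User user);
}

// Service layer: the domain model's public API, owns the business rules.
class UserService {
    private final UserDao dao;

    UserService(UserDao dao) { this.dao = dao; }

    User register(String rawName) {
        String name = rawName == null ? "" : rawName.trim();
        if (name.isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        User user = new User(name);
        dao.save(user);
        return user;
    }
}

// Controller: translates between the request/response format and the service.
class UserController {
    private final UserService service;

    UserController(UserService service) { this.service = service; }

    String handleRegister(String nameParam) {
        User user = service.register(nameParam);
        // naive JSON for illustration only, no escaping
        return "{\"name\":\"" + user.name + "\"}";
    }
}
```

The controller never sees UserDao, so a different client (a web service, a batch job) can reuse UserService unchanged.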
For a typical web app I would prefer the Play framework with Play's JPA and database support. It is a much more productive way to work.
Please take a look here: http://www.playframework.org/documentation/1.2.5/jpa
and here
http://www.playframework.org/documentation/1.2.5/guide1 and http://www.playframework.org/documentation/1.2.5/guide2
That's it))
With the introduction of Hibernate in my project, my code started getting really coupled, and boilerplate in many places (and it should be the other way round, right?)
I got pretty confused by a particular example. I've always considered DAO objects to be pretty generic in their nature (mostly encapsulating the basic CRUD operations as well as the backend storage implementation).
Unfortunately, as my entity classes started to get more complicated, I started offloading more and more logic to the DAO objects. I have a particular example:
my entity class User should have a relation called friends, which is essentially a collection of users. However, I have to map my class to a collection of UserFriendship objects instead, each of which contains a ref to the friend object, but also other specific friendship data (the date when the friendship occurred)
Now, it is easy to introduce a custom getter in the entity class, which will take the collection of UserFriendship objects and turn it into a collection of User objects instead. However, what if I need only a subset of my friends collection, say, like in paging. I cannot really do that in the entity object, because it doesn't have access to the session, right? This also applies to when I need to make a parametrized query on the relationship. The one that has the access to the session is the UserDAO. So I ended up with this
UserDAO
=> normal CRUD methods
=> getFriends(Integer offset, Integer limit);
=> a bunch of similar getters and setters responsible for managing the relationships within the User instance.
This is insane. But I cannot really do anything else. I am not aware if it is possible to declare computed properties within the entity classes, which could also be parametrized.
I could technically also wrap the DAO within the entity, and put the helper getters and setters back into the entity class, where they should be, but I am not sure whether that is good practice either.
I know that the DAO should only be accessed by the controller object, and it should provide a more or less complete entity object or a set of entity objects.
I am deeply confused. More or less all of my DAO objects now couple logic that should be either in the Entity objects or in the controllers.
I am sorry if my question is a bit confusing. It is a bit hard to formulate it.
My general rules are:
in the entity classes, respect the law of Demeter: don't talk to strangers
the entity classes must not use the session
the controller/service classes must not use the session. They may navigate in the graph of entities and call DAO methods
DAO methods should be the ones using the session. Their work consists in getting, saving, merging entities and executing queries. If several queries or persistence-related actions should be executed for a single use-case, the controller/service should coordinate them, not the DAO.
This way, I can test the business logic relatively easily by mocking the DAOs, and I can test the DAOs relatively easily because they don't contain much logic. Most of the tests verify that the queries find what they're supposed to find, return them in the appropriate order, and initialize the associations that must be initialized (to avoid lazy loading exceptions in the presentation layer, where I'm using detached objects)
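A sketch of how these rules could apply to the friends example from the question: the paged query lives in the DAO, the service only coordinates, and a fake DAO makes the service testable without a session. Friendship, FriendshipDao, FriendService and FakeFriendshipDao are hypothetical names; in a Hibernate-backed DAO the paging would typically be done with setFirstResult(offset) and setMaxResults(limit) on the query.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the UserFriendship link object: the friend plus friendship-specific data.
class Friendship {
    final String friendName;
    final LocalDate since;

    Friendship(String friendName, LocalDate since) {
        this.friendName = friendName;
        this.since = since;
    }
}

// The DAO is the only place that would touch the session.
interface FriendshipDao {
    List<Friendship> findFriendships(long userId, int offset, int limit);
}

// Service: no session access, just coordination and transformation.
class FriendService {
    private final FriendshipDao dao;

    FriendService(FriendshipDao dao) { this.dao = dao; }

    List<String> friendNames(long userId, int offset, int limit) {
        List<String> names = new ArrayList<>();
        for (Friendship f : dao.findFriendships(userId, offset, limit)) {
            names.add(f.friendName);
        }
        return names;
    }
}

// Fake DAO for tests: pages over an in-memory list and ignores userId.
class FakeFriendshipDao implements FriendshipDao {
    private final List<Friendship> all;

    FakeFriendshipDao(List<Friendship> all) { this.all = all; }

    @Override
    public List<Friendship> findFriendships(long userId, int offset, int limit) {
        int from = Math.min(offset, all.size());
        int to = Math.min(from + limit, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```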
I need some advice on designing an "Integration Tier" of an N-Tiered system in Java. This tier is responsible for persisting and retrieving data for the "Business Tier" (located on a separate server). I'm new to J2EE and I've read a few books and blogs. The alphabet soup of technology acronyms is confusing me so I have a handful of questions.
First, what I have so far: I'm using JPA (via Hibernate) for persisting and retrieving data to a database. I made my data access objects EJBs and plan on deploying to an application server (JBoss) which makes transactions easier (they're at the function level of my DAOs) and I don't have to worry about getting a handle to an EntityManager (dependency injection). Here's an example of what things look like:
@Entity
class A {
    @Id
    Long id;
    @OneToMany
    List<B> setOfBs = new ArrayList<B>();
}

@Entity
class B {
    @Id
    Long id;
}

@Remote
public interface ADAO {
    public A getAById(Long id);
}

@Stateless
class ADAOImpl implements ADAO {
    @PersistenceContext
    EntityManager em;

    public A getAById(Long id) { ... }
}
My question: How should the Business Tier exchange data with the Integration Tier? I've read up on RESTful services, and they seem simple enough. My concern is performance when the frequency of gets and sets increases (HTTP communication doesn't seem particularly fast). Another option is RMI. My DAOs are already EJBs. Could I just have the Business Tier access them directly (via JNDI)? If so, what happens if the @OneToMany link in the example above is lazily loaded?
For example if the Business Tier does something like the following:
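Context context = new InitialContext(propertiesForIntegrationTierLookup);
// a remote client receives a proxy for the @Remote interface, so cast to ADAO rather than ADAOImpl
ADAO aDao = (ADAO) context.lookup("something");
A myA = aDao.getAById(0L); // the Long parameter needs a long literal
int numberOfBs = myA.setOfBs.size();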
If the setOfBs list is loaded lazily, when the Business Tier (on a separate server) accesses the list, is the size correct? Does the list somehow get loaded correctly through the magic of EJBs? If not (which I expect), what's the solution?
Sorry for the long post. Like I said I'm new to J2EE and I've read enough to get the general idea, but I need help on fitting the pieces together.
When you call size() on a lazy collection, it gets initialized, so you'll always get the correct size no matter which interface you're using, Remote or Local.
Another situation is when you're trying to use JPA classes as data transfer objects (DTOs) and request them via a Remote interface. I don't remember any lazy initialization issues here, because prior to transmission all objects have to be serialized (with lazy collections initialized) on the server side. As a result, the whole object graph is passed over the network, which might cause serious CPU and network overhead. In addition, for deserialization to be possible, you will have to share the JPA classes with the remote app. And that's where and how the 'EJB magic' ends :)
So, once remote calls are possible, I'd suggest to start thinking of data transfer strategy and non-JPA data transfer objects as additional data layer. In my case, I've annotated DTO classes for XML binding (JAXB) and reused them in web-services.
Short answer: If you are using an "Integration Layer" approach, the things you should be integrating should be loosely coupled services, following SOA principles.
This means you should not be allowing remote calls to methods on entities that could be making calls to the framework under the lid on another server. If you do this, you are really building a tightly coupled distributed application and you will have to worry about the lazy loading problems and the scope of the persistence context. If you want that, you might like to consider extended persistence contexts http://docs.jboss.org/ejb3/docs/tutorial/extended_pc/extended.html.
You have talked about a "business tier", but JPA does not provide a business tier. It provides entities and allows CRUD operations, but these are typically not business operations. A "RegisterUser" operation is not simply a question of persisting a "User" entity. Your DAO layer may offer a higher level of operation, but DAOs are typically used to put a thin layer over the database, so it is still very data-centric.
A better approach is to define business service type operations and make those the services that you expose. You might want another layer on top of your DAO or you might want to have one layer (convert your DAO layer).
Your business layer should call flush, handle any JPA exceptions, and hide all of that from the caller.
The issue of how to transfer your data remains. In many cases the parameters of your business service requests will be similar to your JPA entities, but I think you will notice that often there are sufficient differences that you want to define new DTOs. For example, a "RegisterUser" business operation might update both the "User" and "EmailAddresses" table. The User table might include a "createdDate" property which is not part of the "RegisterUser" operation, but is set to the current date.
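The "RegisterUser" example can be sketched roughly as follows, without JPA. RegisterUserRequest, UserRecord, EmailAddressRecord and RegistrationService are hypothetical names, and the two lists merely stand in for the "User" and "EmailAddresses" tables. The point is that the request DTO has no createdDate; the business operation sets it.

```java
import java.time.Clock;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// DTO for the business operation: only what the caller supplies.
final class RegisterUserRequest {
    final String userName;
    final String email;

    RegisterUserRequest(String userName, String email) {
        this.userName = userName;
        this.email = email;
    }
}

// Stand-in for a row in the "User" table.
class UserRecord {
    final String userName;
    final LocalDate createdDate; // set by the service, not by the caller

    UserRecord(String userName, LocalDate createdDate) {
        this.userName = userName;
        this.createdDate = createdDate;
    }
}

// Stand-in for a row in the "EmailAddresses" table.
class EmailAddressRecord {
    final String userName;
    final String address;

    EmailAddressRecord(String userName, String address) {
        this.userName = userName;
        this.address = address;
    }
}

class RegistrationService {
    private final Clock clock;
    final List<UserRecord> users = new ArrayList<>();                   // stand-in for the User table
    final List<EmailAddressRecord> emailAddresses = new ArrayList<>();  // stand-in for the EmailAddresses table

    RegistrationService(Clock clock) { this.clock = clock; }

    // One business operation that touches two tables.
    void registerUser(RegisterUserRequest request) {
        users.add(new UserRecord(request.userName, LocalDate.now(clock)));
        emailAddresses.add(new EmailAddressRecord(request.userName, request.email));
    }
}
```

Passing a Clock into the service is a design choice made here so the creation date can be fixed in tests; production code could pass Clock.systemDefaultZone().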
For creating DTOs, you might like to look at Project Lombok.
To copy the DTO to the Entity, you can use Apache Commons BeanUtils (e.g., PropertyUtils.copyProperties) to do a lot of the leg work, which works if the property names are the same.
Personally, I don't see the point in XML in this case, unless you want to totally decouple your implementations.