controllers, entity classes or dao - what goes where?

controllers, entity classes or dao - what goes where? - java

With the introduction of Hibernate in my project, my code started getting really coupled, and boilerplate in many places (and it should be the other way round, right?)
I got pretty confused by a particular example. I've always considered DAO objects to be pretty generic in their nature (mostly encapsulating the basic CRUD oeprations as well as the backend storage implementation)
Unfortunately, as my entity classes started to get more complicated, I started offloading more and more logic to the DAO objects. I have a particular example:
my entity class User should have a relation called friends, which is essentially a collection of users. However, I have to map my class to a collection of UserFriendship objects instead, each of which contains a ref to the friend object, but also other specific friendship data (the date when the friendship occurred)
Now, it is easy to introduce a custom getter in the entity class, which will take the collection of UserFriendship objects and turn it into a collection of User objects instead. However, what if I need only a subset of my friends collection, say, like in paging. I cannot really do that in the entity object, because it doesn't have access to the session, right? This also applies to when I need to make a parametrized query on the relationship. The one that has the access to the session is the UserDAO. So I ended up with this
UserDAO
=> normal CRUD methods
=> getFriends(Integer offset, Integer limit);
=> a bunch of similar getters and setters responsible for managing the relationships within the User instance.
This is insane. But I cannot really do anything else. I am not aware if it is possible to declare computed properties within the entity classes, which could also be parametrized.
I could technically also wrap the DAO within the entity, and put the helper getters and setters back into the entity class, where they should be, but I am not sure whether if that is a good practice as well.
I know that the DAO should only be accessed by the controller object, and it should provide a more or less complete entity object or a set of entity objects.
I am deeply confused. More or less all of my DAO objects now couple logic that should be either in the Entity objects or in the controllers.
I am sorry if my question is a bit confusing. It is a bit hard to formulate it.

My general rules are:
in the entity classes, respect the law of Demeter: don't talk to strangers
the entity classes must not use the session
the controller/service classes must not use the session. They may navigate in the graph of entities and call DAO methods
DAO methods should be the ones using the session. Their work consists in getting, saving, merging entities and executing queries. If several queries or persistence-related actions should be executed for a single use-case, the controller/service should coordinate them, not the DAO.
This way, I can test the business logic relatively easily by mocking the DAOs, and I can test the DAOs relatively easily because they don't contain much logic. Most of the tests verify that the queries find what they're supposed to find, return them in the appropriate order, and initialize the associations that must be initialized (to avoid lazy loading exceptions in the presentation layer, where I'm using detached objects)

Related

JPA Spring Data entity to be used outside of transaction

I have a Spring Boot application with a service that returns a Spring Data entity that is exposed to a controller. The problem is that I know it's not a good idea to use entities outside of DB transactions, so what would be the best practices?
Consider the following service:
#Transactional
public MyData getMyData(Long id) {
return myDataRepository.findById(id);
}
where MyData is a database #Entity and myDataRepository is a JpaRepository
This service method is called from a controller class, that sends this object in JSON format to a client that calls this method.
#RequestMapping("/")
public ResponseEntity<?> getMyData(#RequestParam Long id) {
return myService.getMyData(id);
}
If I expose MyData to a controller, then it will be exposed outside of a transaction and might cause all kind of hibernate errors. What are the best practices for these scenarios? Should I convert entity to POJO in side the service and return MyDataPOJO instead of MyData in MyService?

Using entities outside of transactions does not necessarily lead to problems; it may actually have valid use cases. However, there's quite a few variables at play and once you let them out of your sight things may and will go south. Consider the following scenarios:
Your entity doesn't have any relationships to other entities or those relationships are pretty shallow and eagerly fetched. You retrieve that entity from repository, detach it from persistence unit (implicitly or explicitly) and pass to controller. Controller does not attempt to modify the entity; it only serializes it into JSON - totally safe.
Same as above but controller modifies the entity before serializing it into JSON - again, totally safe (just don't expect those changes to be reflected in DB)
Same as above, but you've forgotten to detach the entity from PU - ouch, if controller changes the entity you may either see it reflected in DB or get transaction closed exception; both most likely being unintended consequences.
Same as above, but some of entity's relationships are lazy. Again, you may or may not get any exceptions depending on whether these lazy properties are being accessed or not.
And there are so many more combinations of intentional and unintentional design choices...
As you may see, things can get out of control very quickly. Especially so when your model has to evolve: before long you're going to find yourself fiddling with JSON views, #JsonIgnore, entity projections and so on. Thus the rule of thumb: although it may seem tempting to cut some corners and expose your entities to external layers, it's rarely a good idea. Properly designed solution always has a clear separation of concerns between layers:
Persistence layer never exposes more methods or entities than required by business logic. More over, the same table(s) can and should be mapped into several different entities depending on the use cases they participate in.
Business logic layer (btw this is your API, not the REST services! see below) never leaks any details from persistence layer. Its methods clearly define use cases from the problem domain.
Presentation layer only translates API provided by business logic into one or another form suitable for client and never implements additional use cases. Keep in mind that REST controllers, SOAP services etc logically are all part of presentation layer, not business logic.
So yeah, the short answer is: persistence entities should not be exposed to external layers. One common technique is to use DTOs instead; besides, DTO objects provide additional abstraction layer in case you need to change your entities but leave API intact or vice versa. If at some point your DTOs happen to closely resemble your entities, there are Java bean mapping frameworks like Dozer, Orika, MapStruct, JMapper, ModelMapper etc that help to eliminate the boilerplate code.
Try googling "hexagonal architecture". This is a very interesting concept for designing cleanly separated layers. Here's one of the articles on this subject https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/; it uses C# examples but they're pretty simple.

You should never leak the internal model to outside resources (in your case - the #RestController). The "POJO" you mentioned is typically called a DTO (Data Transfer Object). The DTO can be defined as an interface on the Service-side and implemented on the Controller-side. The Service would then - as you described - transform the internal model into an instance of the DTO, achieving looser coupling between the Controler and the Service.
By defining the DTO-interface on the service-side, you have the additional benefit that you can optimize your persistence-acces by only fetching the data specified in the corresponding DTO-interface. There is, for example, no need to fetch the friends of a User if the #Controller does not specifically requests them, thus you do not need to perform the additional JOIN in the database (provided you use a database).

Transaction Boundary and DTO conversion with JPA

I have been wondering how this anomaly should be handled:
DTO's should be converted in the controller, the service layer does not need to know about them.
Transaction boundaries are defined by the service layer.
But how do you avoid a JPA LazyInitialization exception then? The DTO conversion might need Lazy Fetched data but is unable to as the transaction was handled by the service layer.
There are ways I can think of, but all of them are ugly. Putting the DTO conversion in the service layer seems the best to me now.

Yes, definitely it is better to manipulate DTOs in the service layer. This is especially true when updating entities with changes contained in DTOs, as otherwise you would need to get and update detached entities, pass them to service, merge them again into the persistence context, etc.
"DTO's should be converted in the controller, the service layer does not need to know about them."
Instead of this, I would say the better rule of thumb is that controllers do not need to know about entities. But you can use detached entities instead of DTOs for simple cases to avoid creating lots of small DTO classes, although I personally always use DTOs just to be consistent and to make later changes easier.

Raised 'LazyInitializationException' is just signal that some parts of data were not loaded, so the best solution will be to make multiple calls from controller method to service level and fetch all required fields for DTO.
Less elegant options are:
Its possible to detect fields that were not loaded via 'org.hibernate.Hibernate.isInitialized' method and skip them during DTO build, see here full sample:
How to test whether lazy loaded JPA collection is initialized?
You can mark controller method as transactional, there will be opened hibernate session after call to service level and so lazy loading will work.

DTOs are the model you should be working against from a layer above services. Only the service should know about the entity model. In simple degenerate cases the DTO model might look almost like the entity model which is why many people will just use the entity model. This works well until people get real requirements that will force them to change the way they use data. This is when the illusion that DTO = Entity falls apart.
A DTO is often a subset or a tranformation of the entity model. The point about the LazyInitializationException is a perfect example of when the illusion starts to crumble.
A service should return fully initialized DTOs i.e. not just some object that delegates to entity objects. There shouldn't be any lazy loading inovlved after a DTO was returned from a service. This means that you have to fetch exactly the state required for a DTO and wire that data into objects to be returned. Since that usually requires quite some boilerplate code and will sometimes result in having to duplicate logic, people tend to stick even longer to the DTO = Entity illusion by sprinkling some fetch joins here and there to make the LazyInitializationExceptions go away.
This is why I started the Blaze-Persistence Entity Views project which will give you the best of both worlds. Ease of use with less boilerplate, good performance and a safe model to avoid accidental errors. Maybe you want to give it a shot to see what it can do for you?

Violation of DAO pattern. What to do?

I need to generate and then perform a complex sql-query which is going to access multiple databases to create some general report. This implies that the query's not related to a specific DAO object.
So where should I put the logic of executing such a query and returning result as DTO? If I create ReportDao interface and then implement it it may lead another developer into troubles, beucasu I think they will expect the Dao object tied with some table in the database.

!Opinion warning!
A DAO does not necessarily have to be linked to a specific domain class. No domain class lives in isolation, and if one presumes a DAO to only include operations on one table/domain class, one is in for a surprise, since operations might pertain to multiple domain classes, and thus be wrongly placed no matter where you put it. It's better to also think of a DAO as a collection of methods pertaining to a certain area of functionality. If most Dao are modeled around domain objects it might be wise to name the different one a bit differently, but ReportDao should be fine as long as we're talking about a collection of methods pertaining to reports/reporting. Or maybe "GeneralReportDataDao" is better (keep in mind that I only have the information in your question to work with, think about what the class represents and try to find a descriptive name..)
Another point I have seen from experience when organizing DAOs after domain classes, is that the DAOs pertaining to central domain classes tends to grow very large, since central domain classes are often linked to large amounts of functionality. This is not only true for DAO-classes, but also for Services, etc, using the same pattern for organizing functionality.
We mainly have two "types" of classes in Java, we have classes that represent something (classes containing data, typically stateful classes), and classes that do something (service, dao, etc, typically stateless classes). The stateful data classes should be named and modeled after what they represent, i.e. the data, while the stateless service classes should be named and modeled after functionality. While it is tempting to try to organize services the same way as data, it often leads to poor code, with large classes and areas of functionality spread across several classes.

Can a class be annotated as both #Repository and #Entity?

I am pretty new to Hibernate. I am having problem understanding these simple logics. I have understood that #Repository is used by Spring for accessing objects. Also, Hibernate uses #Entity to denote entities which are mapped into database tables. I was just wondering if a single class can be annotated with both #Repository and #Entity as they more or less imply the same.

NO.
Hibernate entities are managed by Hibernate ORM framework, they(and their proxies) are created by hibernate when you access them via get() or load(). They have a completely different (and complex) lifecycle than the Spring beans(they can be attached/detached/proxied/pending for removal)
Spring repositories are singletons, managed by Spring framework. Typically they exist as long as the container instance exists. New Hibernate sessions may be opened and closed, new user sessions engaged and then expired, but there still will be the same singleton instances of repositories.
Please see http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/objectstate.html#objectstate-overview for the possible hibernate object states.
As for the repository instances - typically they are stateless, as they are services.
RE: they more or less imply the same. No they are not the same. There was an old joke
How many C++ programmers does it take to change a light bulb? You're
still thinking procedurally. A properly designed light bulb object
would inherit a change method from a generic light bulb class, so all
you would have to do is send a light-bulb-change message.
But good OOP programmers do not think that way, according to single responsibility principle objects should should have a single reason to change. Repository works with the infrastructure and has nothing to do with the business rules. Infrastructure may change(you may need for example to store you object in XML instead of RDBMS), but this should not affect the classes encapsulating the state of business objects.
You can possibly mitigate this problem by making a reference from the entity class to an abstract repository interface(implement an infamous Active Record pattern - it will be like referencing some abstract bulb socket from the bulb, this does not seem to be a good solution because bulb sockets and bulbs have different lifecycles).
That is where High Cohesion principle starts to play, according to which it's just illogical for an object, whose role is to reflect the abstractions from the model, to perform some completely unrelated things like persistence or transmitting over the network. It's weird when Student class will have print(), saveToXml() or transmitByHttp() methods.

They don't imply the same thing at all.
#Entity
An #Entity is something that represents a "thing" in your business domain. It could be anything - a Customer, an Elephant, a Product. . . It will have attributes that will get persisted to the database as well as methods relating to those attributes (at least it should, unless it's an anaemic entity, but that's an anti-pattern. . . . later, when you're comfortable with the basics check out Spring's #Configurable annotation - this allows you to provide collaborators to your entity).
#Repository
A #Repository, on the other hand provides an interface for retrieving and storing those entities.
There are some frameworks, especially in other languages, that combine the persistence and entity attributes on the same object, however this is not common in Java/Hibernate/Spring.

How to persist every new entity?

I expect every instantiated entity to correspond to a tuple (& co) in the database. In the examples I see around, one always instantiates the entity (via a constructor) and then calls persist with that entity. I find this error-prone, and was wondering if it wasn't possible to have every instantiated entity automatically managed/persisted/reflected to the database (at least intended to).
This also seems to prevent me from persisting instance variable entities. I.e. I've an entity which instantiates another (entities it has an association with) in its constructor.

That's just a practice. The model shouldn't be aware of any DAO/persistence logic. If it does, then it is tight coupled and not reuseable for another persistence frameworks. However, if you are sure that you stick to JPA for ages, then you may consider to do so. But this is generally just not a good practice. The model may not be reuseable in another layers then. For example, you may want create a mock/dummy model object for the view layer to let a new user fill in the registration details and then only persist it when submission and validation is done succesfully.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.