I am pretty new to Hibernate and am having trouble understanding some basic concepts. I understand that @Repository is used by Spring for accessing objects, and that Hibernate uses @Entity to denote classes that are mapped to database tables. Can a single class be annotated with both @Repository and @Entity, since they more or less imply the same thing?
NO.
Hibernate entities are managed by the Hibernate ORM framework; they (and their proxies) are created by Hibernate when you access them via get() or load(). They have a completely different (and more complex) lifecycle than Spring beans: they can be attached, detached, proxied or pending removal.
Spring repositories are singletons, managed by Spring framework. Typically they exist as long as the container instance exists. New Hibernate sessions may be opened and closed, new user sessions engaged and then expired, but there still will be the same singleton instances of repositories.
Please see http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/objectstate.html#objectstate-overview for the possible hibernate object states.
As for the repository instances - typically they are stateless, as they are services.
RE: "they more or less imply the same". No, they are not the same. There is an old joke:
How many C++ programmers does it take to change a light bulb? You're
still thinking procedurally. A properly designed light bulb object
would inherit a change method from a generic light bulb class, so all
you would have to do is send a light-bulb-change message.
But good OOP programmers do not think that way. According to the single responsibility principle, an object should have a single reason to change. A repository works with the infrastructure and has nothing to do with the business rules. Infrastructure may change (for example, you may need to store your objects in XML instead of an RDBMS), but this should not affect the classes that encapsulate the state of business objects.
You can possibly mitigate this problem by making the entity class reference an abstract repository interface (that is, by implementing the infamous Active Record pattern). That is like referencing some abstract bulb socket from the bulb, and it does not seem to be a good solution, because bulb sockets and bulbs have different lifecycles.
That is where the High Cohesion principle comes into play: it is illogical for an object whose role is to reflect the abstractions of the model to perform completely unrelated things such as persistence or transmission over the network. It would be odd for a Student class to have print(), saveToXml() or transmitByHttp() methods.
They don't imply the same thing at all.
@Entity
An @Entity is something that represents a "thing" in your business domain. It could be anything: a Customer, an Elephant, a Product... It will have attributes that get persisted to the database, as well as methods relating to those attributes (at least it should; an entity without behaviour is an anaemic entity, which is an anti-pattern). Later, when you're comfortable with the basics, check out Spring's @Configurable annotation, which allows you to provide collaborators to your entity.
@Repository
A @Repository, on the other hand, provides an interface for retrieving and storing those entities.
There are some frameworks, especially in other languages, that combine the persistence and entity attributes on the same object; however, this is not common in Java/Hibernate/Spring.
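To make the split concrete, here is a minimal plain-Java sketch with no Spring or Hibernate involved. Customer, CustomerRepository and InMemoryCustomerRepository are hypothetical names invented for illustration; in a real application the entity would carry @Entity, and the repository implementation would carry @Repository and talk to a database rather than a map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain entity: state plus the behaviour that belongs to that state.
class Customer {
    private final long id;
    private String email;

    Customer(long id, String email) {
        this.id = id;
        this.email = email;
    }

    long getId() { return id; }
    String getEmail() { return email; }

    // A business rule lives on the entity, not in the repository.
    void changeEmail(String newEmail) {
        if (newEmail == null || !newEmail.contains("@")) {
            throw new IllegalArgumentException("not an e-mail address: " + newEmail);
        }
        this.email = newEmail;
    }
}

// Hypothetical repository contract: storage and retrieval only.
interface CustomerRepository {
    void save(Customer customer);
    Optional<Customer> findById(long id);
}

// In-memory stand-in for a database-backed implementation.
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, Customer> store = new HashMap<>();

    @Override
    public void save(Customer customer) {
        store.put(customer.getId(), customer);
    }

    @Override
    public Optional<Customer> findById(long id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

The entity knows nothing about where it is stored, and the repository knows nothing about e-mail validation, so swapping the map for a real database would not touch Customer at all.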
Related
I have a Spring Boot application with a service that returns a Spring Data entity that is exposed to a controller. The problem is that I know it's not a good idea to use entities outside of DB transactions, so what would be the best practices?
Consider the following service:
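@Transactional
public MyData getMyData(Long id) {
    // findById on a JpaRepository returns Optional<MyData>; null signals "not found" here
    return myDataRepository.findById(id).orElse(null);
}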
where MyData is a database @Entity and myDataRepository is a JpaRepository.
This service method is called from a controller class, that sends this object in JSON format to a client that calls this method.
@RequestMapping("/")
public ResponseEntity<?> getMyData(@RequestParam Long id) {
    // wrap the result so the return type matches ResponseEntity<?>
    return ResponseEntity.ok(myService.getMyData(id));
}
If I expose MyData to a controller, it will be used outside of a transaction and might cause all kinds of Hibernate errors. What are the best practices for these scenarios? Should I convert the entity to a POJO inside the service and return MyDataPOJO instead of MyData from MyService?
Using entities outside of transactions does not necessarily lead to problems; it may actually have valid use cases. However, there are quite a few variables at play, and once you let them out of your sight things may and will go south. Consider the following scenarios:
Your entity doesn't have any relationships to other entities or those relationships are pretty shallow and eagerly fetched. You retrieve that entity from repository, detach it from persistence unit (implicitly or explicitly) and pass to controller. Controller does not attempt to modify the entity; it only serializes it into JSON - totally safe.
Same as above but controller modifies the entity before serializing it into JSON - again, totally safe (just don't expect those changes to be reflected in DB)
Same as above, but you've forgotten to detach the entity from PU - ouch, if controller changes the entity you may either see it reflected in DB or get transaction closed exception; both most likely being unintended consequences.
Same as above, but some of entity's relationships are lazy. Again, you may or may not get any exceptions depending on whether these lazy properties are being accessed or not.
And there are so many more combinations of intentional and unintentional design choices...
As you may see, things can get out of control very quickly, especially when your model has to evolve: before long you're going to find yourself fiddling with JSON views, @JsonIgnore, entity projections and so on. Thus the rule of thumb: although it may seem tempting to cut some corners and expose your entities to external layers, it's rarely a good idea. A properly designed solution always has a clear separation of concerns between layers:
Persistence layer never exposes more methods or entities than required by business logic. Moreover, the same table(s) can and should be mapped into several different entities depending on the use cases they participate in.
Business logic layer (btw this is your API, not the REST services! see below) never leaks any details from persistence layer. Its methods clearly define use cases from the problem domain.
Presentation layer only translates API provided by business logic into one or another form suitable for client and never implements additional use cases. Keep in mind that REST controllers, SOAP services etc logically are all part of presentation layer, not business logic.
So yeah, the short answer is: persistence entities should not be exposed to external layers. One common technique is to use DTOs instead; besides, DTO objects provide additional abstraction layer in case you need to change your entities but leave API intact or vice versa. If at some point your DTOs happen to closely resemble your entities, there are Java bean mapping frameworks like Dozer, Orika, MapStruct, JMapper, ModelMapper etc that help to eliminate the boilerplate code.
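As a rough illustration of the DTO technique, the sketch below maps an entity to a DTO by hand in plain Java. MyData is taken from the question, but its fields (id, name, internalNote), as well as MyDataDto and MyDataMapper, are assumptions made up for this example; the mapping frameworks listed above automate the same step.

```java
import java.util.Objects;

// Hypothetical persistence entity; in the real application this would be the JPA @Entity.
class MyData {
    private final Long id;
    private final String name;
    private final String internalNote; // persistence-side detail, not meant for clients

    MyData(Long id, String name, String internalNote) {
        this.id = id;
        this.name = name;
        this.internalNote = internalNote;
    }

    Long getId() { return id; }
    String getName() { return name; }
    String getInternalNote() { return internalNote; }
}

// Hypothetical DTO: an immutable value holding only what the API exposes.
final class MyDataDto {
    private final Long id;
    private final String name;

    MyDataDto(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() { return id; }
    String getName() { return name; }
}

// Hand-written mapper, called by the service before anything reaches the controller.
final class MyDataMapper {
    private MyDataMapper() {}

    static MyDataDto toDto(MyData entity) {
        Objects.requireNonNull(entity, "entity must not be null");
        return new MyDataDto(entity.getId(), entity.getName());
    }
}
```

The service method would then return MyDataMapper.toDto(...) while still inside the transaction, so the controller only ever sees a detached, plain value.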
Try googling "hexagonal architecture". This is a very interesting concept for designing cleanly separated layers. Here's one of the articles on this subject https://blog.octo.com/en/hexagonal-architecture-three-principles-and-an-implementation-example/; it uses C# examples but they're pretty simple.
You should never leak the internal model to outside resources (in your case, the @RestController). The "POJO" you mentioned is typically called a DTO (Data Transfer Object). The DTO can be defined as an interface on the service side and implemented on the controller side. The service would then, as you described, transform the internal model into an instance of the DTO, achieving looser coupling between the controller and the service.
By defining the DTO interface on the service side, you have the additional benefit that you can optimize your persistence access by fetching only the data specified in the corresponding DTO interface. There is, for example, no need to fetch the friends of a User if the @Controller does not specifically request them; thus you do not need to perform the additional JOIN in the database (provided you use a database).
I'm (trying to :) using spring-boot-starter-data-rest in my Spring Boot app to quickly serve the model through a true, full-blown RESTful API. That works great.
Question 1 (Security):
The advantage of Spring JpaRepository is I don't need to code basic functions (save, findAll, etc). Is it possible to secure these auto-implemented methods without overriding all of them (wasting what Spring provided for me)? i.e.:
public interface BookRepository extends JpaRepository<Book, Long> {
    @Override
    @PreAuthorize("hasRole('ROLE_ADMIN')")
    <S extends Book> S save(S book);
}
Question 2 (Security):
How to secure a JpaRepository to prevent the logged-in user from updating items he/she does not own?
i.e.: User is allowed to modify only his/her own properties.
i.e.2: User is allowed to modify/delete only the Posts he/she created.
Sample code is highly welcome here.
Question 3 (DTOs):
Some time ago I had an argument with a developer friend: he insisted that DTOs MUST be returned from Spring MVC controllers, even if the DTO is a 1:1 copy of the model object. Then I researched, asked other people and confirmed it: DTOs are required to divide/segregate the application layers.
How does this relate to JpaRepositories? How to use DTOs with Spring's auto-served REST repositories? Should I use DTOs at all?
Thanks for your hints/answers in advance !
Question 1: Security
Some old docs mention:
[...] you expose a pre-defined set of operations to clients that are not under you control, it’s pretty much all or nothing until now. There’s seemingly no way to only expose read operations while hiding state changing operations entirely.
which implies that all methods are automatically inherited (also, as per standard java inheritance behavior).
As per the @PreAuthorize docs, you can also place the annotation on a class / interface declaration.
So you could just have one basic interface extend JpaRepository
@NoRepositoryBean // tell Spring not to create instances of this one
@PreAuthorize("hasRole('ROLE_ADMIN')") // all methods will inherit this behavior
interface BaseRepository<T, ID extends Serializable> extends JpaRepository<T, ID> {}
and then have all of your repositories extend BaseRepository.
Question 2: Security
I'm going to be a little more general on this one.
In order to correctly regulate access to entities within your application and define what-can-see-what, you should always separate your project into different layers.
A good starting point would be:
layer-web (or presentation-layer): access to layer-business, no access to the db-layer. Can see DTO models but not DB models
layer-business (or business-layer): access to the db-layer but no access to the DAO
layer-db (or data-layer): convert DTO -> DB model. Persist objects and provide query results
In your case, I believe the right thing to do would therefore be to check the role in layer-business, before the request even reaches the Repository class.
@Service
public interface BookService {
    @PreAuthorize("hasRole('ROLE_ADMIN')")
    ActionResult saveToDatabase(final BookDTO book);
}
or, as seen before
@Service
@PreAuthorize("hasRole('ROLE_ADMIN')")
public interface BookService {
    ActionResult saveToDatabase(final BookDTO book);
}
Also, ensuring a user can modify only its own objects can be done in many ways.
Spring provides all necessary resources to do that, as this answer points out.
Or, if you are familiar with AOP you can implement your own logic.
E.g (dummyCode):
@Service
public interface BookService {
    // custom annotation here
    @RequireUserOwnership(allowAdmin = false)
    ActionResult saveToDatabase(final BookDTO book);
}
And the check:
public class EnsureUserOwnershipInterceptor implements MethodInterceptor {
    @Autowired
    private AuthenticationService authenticationService;

    @Override
    public Object invoke(MethodInvocation invocation) throws Throwable {
        // 1. get the BookDTO argument from the invocation
        // 2. get the current user from the auth service
        // 3. ensure the owner ID and the current user ID match
        // ...
        return invocation.proceed(); // continue to the intercepted method once the check passes
    }
}
Useful resources about AOP can be found here and here.
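For readers who want to see the three steps of the check actually run, here is a small sketch that uses only the JDK's dynamic proxies (java.lang.reflect.Proxy) instead of Spring AOP or Guice. It is a stand-in technique, not the interceptor above: the ownerId field on BookDTO, the boolean return type, the Supplier that plays the role of the authentication service, and the OwnershipCheckHandler class are all assumptions made for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Hypothetical DTO carrying the ID of the user who owns the book.
class BookDTO {
    final long ownerId;
    final String title;

    BookDTO(long ownerId, String title) {
        this.ownerId = ownerId;
        this.title = title;
    }
}

// Hypothetical service contract, reduced to a boolean result for brevity.
interface BookService {
    boolean saveToDatabase(BookDTO book);
}

// Checks ownership of every BookDTO argument before delegating to the real service.
class OwnershipCheckHandler implements InvocationHandler {
    private final Object target;
    private final Supplier<Long> currentUserId; // stands in for the auth service

    OwnershipCheckHandler(Object target, Supplier<Long> currentUserId) {
        this.target = target;
        this.currentUserId = currentUserId;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        if (args != null) {
            for (Object arg : args) {
                // 1. find the BookDTO argument, 2. ask for the current user,
                // 3. ensure the owner ID and the current user ID match
                if (arg instanceof BookDTO && ((BookDTO) arg).ownerId != currentUserId.get()) {
                    throw new SecurityException("current user does not own this book");
                }
            }
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the service's own exception unchanged
        }
    }

    // Wraps a BookService so that every call goes through the ownership check.
    static BookService secure(BookService target, Supplier<Long> currentUserId) {
        return (BookService) Proxy.newProxyInstance(
                BookService.class.getClassLoader(),
                new Class<?>[] {BookService.class},
                new OwnershipCheckHandler(target, currentUserId));
    }
}
```

Callers would use OwnershipCheckHandler.secure(realService, () -> currentUserId) and never get a reference to the unwrapped service; an AOP framework does essentially the same wiring for you, driven by the annotation.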
Question 3: DTO's and DB models
Should I use DTOs at all?
Yes, yes you should. Even if your project has only a few models and you are just programming for fun (deploying only on localhost, learning, ...).
The sooner you get into the habit of separating your models, the better it is.
Also, conceptually, one is an object coming from an unknown source, the other represents a table in your database.
How does this relate to JpaRepositories?
How to use DTOs with Spring's auto-served REST repositories?
Now that's the point! You can't put DTOs into @Repository classes. You are forced to convert one to the other, and at the same point you are also forced to verify that the conversion is valid.
You are basically ensuring that DTOs (dirty data) will not touch the database in any way, and you are placing a wall made of logical constraints between the database and the rest of the application.
Also I am aware of Spring integrating well with model-conversion frameworks.
So, what are the advantages of a multi-layer / modular web-application?
Applications can grow very quickly. Especially when you have many developers working on it. Some developers tend to look for the quickest solution and implement dirty tricks or change access modifiers to finish the job asap. You should force people to gain access to certain resources only through some explicitly defined channels.
The more rules you set from the beginning, the longer the correct programming pattern will be followed. I have seen a banking application become a complete mess after less than a year. When a hotfix was required, changing some code would create two or three other bugs.
You may reach a point where the application is consuming too many OS resources. If you, let's say, have a module module-batch containing background jobs for your application, it will be way easier to extract it and move it into another application. If your module also contains logic that queries the database, accesses any type of data, provides an API for the front-end, etc., you will basically be forced to export all your code into your new application. Refactoring will be a pain in the neck at that point.
Imagine you want to hire some database experts to analyze the queries your application does. With well-defined and separated logic you can give them access only to the necessary modules instead of the whole application. The same applies to front-end freelancers, etc. I have lived this situation as well. The company wanted database experts to fix the queries done by the application but did not want them to have access to the whole code. In the end, they gave up on the database optimization because it would have exposed too much sensitive information externally.
And what are the advantages of DTO / DB model separation?
DTOs will not touch the database. This gives you more security against attacks coming from the outside.
You can decide what goes on the other side. Your DTOs do not need to have all the fields of the DB model. Actually you can even have a DAO map to many DTOs or the other way around. There is lots of information that shouldn't reach the front-end, and with DTOs you can easily keep it out.
DTOs are in general lighter than @Entity models. Whereas entities are mapped (e.g. @OneToMany) to other entities, DTOs may just contain the id field of the mapped objects.
You do not want to have database objects hanging around for too long, nor being passed around by methods of your application. Many frameworks commit database transactions at the end of each method, which means any involuntary change made to the database entity may be committed into the db.
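The point about lighter DTOs can be sketched in plain Java as follows. UserEntity, PostEntity and UserDto are hypothetical names; in a real project the posts list would be a @OneToMany association, and the DTO would be built while the persistence context is still open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical entity on the "many" side of a one-to-many relation.
class PostEntity {
    final long id;
    final String text;

    PostEntity(long id, String text) {
        this.id = id;
        this.text = text;
    }
}

// Hypothetical entity that references full PostEntity objects.
class UserEntity {
    final long id;
    final String name;
    final List<PostEntity> posts = new ArrayList<>();

    UserEntity(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Lighter DTO: carries only the IDs of the related posts, not the posts themselves.
final class UserDto {
    final long id;
    final String name;
    final List<Long> postIds;

    private UserDto(long id, String name, List<Long> postIds) {
        this.id = id;
        this.name = name;
        this.postIds = postIds;
    }

    static UserDto from(UserEntity user) {
        List<Long> ids = new ArrayList<>();
        for (PostEntity post : user.posts) {
            ids.add(post.id);
        }
        // Read-only view, so callers cannot modify the DTO after creation.
        return new UserDto(user.id, user.name, Collections.unmodifiableList(ids));
    }
}
```

A client that needs the post contents can request them by ID in a separate call, which keeps the default payload small.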
Personally, I believe that any respectable web application should strongly separate layers, each with its own responsibility and limited visibility of other layers.
Differentiation between database models and data transfer objects is also a good pattern to follow.
In the end this is only my opinion, though; many argue that the DTO pattern is outdated and causes unnecessary code repetition, and many argue that too much separation makes the code harder to maintain. So you should always consult different sources and then apply what works best for you.
Also interesting:
SE: What is the point of using DTO (Data Transfer Objects)?
Lessons Learned: Don't Expose EF Entities to the Client Directly
Guice Tutorial – method interception (old but gold)
SO: Large Enterprise Java Application - Modularization
Microsoft Docs: Layered Application Guidelines
The 5-layer architecture
I read several articles stating that entity beans in a Java EE environment are considered anemic (meaning they contain only getters and setters, without implementing behaviour).
What prevents me from putting behaviour into entity beans, so that session beans (stateless or stateful) could delegate business logic to them (for logic that makes sense to be owned by entities)?
I don't see why entity beans are necessarily anemic.
From a purely semantic perspective, you would expect an ENTITY bean to be a representation of an entity and its attributes. If you couple this with some logic, then you add additional responsibility to your entity class. And as we know from Curly's law or the single responsibility principle, each class should do one thing and one thing only:
http://www.codinghorror.com/blog/2007/03/curlys-law-do-one-thing.html
http://en.wikipedia.org/wiki/Single_responsibility_principle
If you believe that you have a strong enough reason to violate this principle, you can, but in my experience no reason has been strong enough to violate standard software engineering practices, especially if you, like me, believe that the quality of software is best represented by the quality of its code.
There are no restrictions on implementing functionality in entity beans, but they're not meant to be used throughout your application, so most of the time you'll add behaviours that modify entities to your session beans, simply because session beans are the ones supposed to be accessed from the front end, for example.
If we go deeper, session bean methods are usually decorated with transactional and security aspects while entity beans are not, so your application may not behave as expected if you add code to entities.
With the introduction of Hibernate in my project, my code started getting really coupled, and boilerplate in many places (and it should be the other way round, right?)
I got pretty confused by a particular example. I've always considered DAO objects to be pretty generic in nature (mostly encapsulating the basic CRUD operations as well as the backend storage implementation).
Unfortunately, as my entity classes started to get more complicated, I started offloading more and more logic to the DAO objects. I have a particular example:
my entity class User should have a relation called friends, which is essentially a collection of users. However, I have to map my class to a collection of UserFriendship objects instead, each of which contains a ref to the friend object, but also other specific friendship data (the date when the friendship occurred)
Now, it is easy to introduce a custom getter in the entity class, which will take the collection of UserFriendship objects and turn it into a collection of User objects instead. However, what if I need only a subset of my friends collection, as in paging? I cannot really do that in the entity object, because it doesn't have access to the session, right? The same applies when I need to make a parametrized query on the relationship. The one that has access to the session is the UserDAO. So I ended up with this:
UserDAO
=> normal CRUD methods
=> getFriends(Integer offset, Integer limit);
=> a bunch of similar getters and setters responsible for managing the relationships within the User instance.
This is insane. But I cannot really do anything else. I am not aware if it is possible to declare computed properties within the entity classes, which could also be parametrized.
I could technically also wrap the DAO within the entity and put the helper getters and setters back into the entity class, where they belong, but I am not sure whether that is good practice either.
I know that the DAO should only be accessed by the controller object, and it should provide a more or less complete entity object or a set of entity objects.
I am deeply confused. More or less all of my DAO objects now couple logic that should be either in the Entity objects or in the controllers.
I am sorry if my question is a bit confusing. It is a bit hard to formulate it.
My general rules are:
in the entity classes, respect the law of Demeter: don't talk to strangers
the entity classes must not use the session
the controller/service classes must not use the session. They may navigate in the graph of entities and call DAO methods
DAO methods should be the ones using the session. Their work consists in getting, saving, merging entities and executing queries. If several queries or persistence-related actions should be executed for a single use-case, the controller/service should coordinate them, not the DAO.
This way, I can test the business logic relatively easily by mocking the DAOs, and I can test the DAOs relatively easily because they don't contain much logic. Most of the tests verify that the queries find what they're supposed to find, return them in the appropriate order, and initialize the associations that must be initialized (to avoid lazy loading exceptions in the presentation layer, where I'm using detached objects)
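A minimal sketch of that division of labour, applied to the paged friends query from the question, could look like this. UserDao, FriendService and StubUserDao are hypothetical names, friends are reduced to plain strings, and the stub takes the place of a mocking library; a real UserDao implementation would be the only class touching the Hibernate session.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical DAO contract; the real implementation would run a paged query on the session.
interface UserDao {
    List<String> findFriendNames(long userId, int offset, int limit);
}

// Business logic depends on the DAO interface only, never on the session.
class FriendService {
    private final UserDao userDao;

    FriendService(UserDao userDao) {
        this.userDao = userDao;
    }

    // Returns one page of friend names; page numbers start at 0.
    List<String> friendsPage(long userId, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        return userDao.findFriendNames(userId, page * pageSize, pageSize);
    }
}

// Hand-written stub used in tests instead of a database-backed DAO.
class StubUserDao implements UserDao {
    private final List<String> friends;

    StubUserDao(List<String> friends) {
        this.friends = friends;
    }

    @Override
    public List<String> findFriendNames(long userId, int offset, int limit) {
        // Clamp both bounds so an out-of-range page yields an empty list.
        int from = Math.min(offset, friends.size());
        int to = Math.min(from + limit, friends.size());
        return new ArrayList<>(friends.subList(from, to));
    }
}
```

With this shape, the paging arithmetic in FriendService can be tested without a database, and the DAO test only has to verify that the query honours offset and limit.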
I need some advice on designing an "Integration Tier" of an N-Tiered system in Java. This tier is responsible for persisting and retrieving data for the "Business Tier" (located on a separate server). I'm new to J2EE and I've read a few books and blogs. The alphabet soup of technology acronyms is confusing me so I have a handful of questions.
First, what I have so far: I'm using JPA (via Hibernate) for persisting and retrieving data to a database. I made my data access objects EJBs and plan on deploying to an application server (JBoss) which makes transactions easier (they're at the function level of my DAOs) and I don't have to worry about getting a handle to an EntityManager (dependency injection). Here's an example of what things look like:
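import java.util.ArrayList;
import java.util.List;
import javax.ejb.*;         // javax namespace assumed (Java EE era)
import javax.persistence.*;

@Entity
class A {
    @Id
    Long id;
    @OneToMany
    List<B> setOfBs = new ArrayList<B>();
}

@Entity
class B {
    @Id
    Long id;
}

@Remote
public interface ADAO {
    public A getAById(Long id);
}

@Stateless
class ADAOImpl implements ADAO {
    @PersistenceContext
    EntityManager em;

    public A getAById(Long id){ ... }
}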
My question: How should the Business Tier exchange data with the Integration Tier? I've read up on RESTful services, and they seem simple enough. My concern is performance when the frequency of gets and sets increases (HTTP communication doesn't seem particularly fast). Another option is RMI. My DAOs are already EJBs. Could I just have the Business Tier access them directly (via JNDI)? If so, what happens if the @OneToMany link in the example above is lazily loaded?
For example if the Business Tier does something like the following:
Context context = new InitialContext(propertiesForIntegrationTierLookup);
// cast to the remote interface; the client only receives a proxy, not the implementation class
ADAO aDao = (ADAO) context.lookup("something");
A myA = aDao.getAById(0L);
int numberOfBs = myA.setOfBs.size();
If the setOfBs list is loaded lazily, when the Business Tier (on a separate server) accesses the list, is the size correct? Does the list somehow get loaded correctly through the magic of EJBs? If not (which I expect), what's the solution?
Sorry for the long post. Like I said I'm new to J2EE and I've read enough to get the general idea, but I need help on fitting the pieces together.
When you call size() on a lazy collection, it gets initialized, provided the entity is still attached to an open persistence context; in that case you get the correct size whether the bean is reached through a Remote or a Local interface.
Another situation is when you're trying to use JPA classes as data transfer objects (DTOs) and request them via a Remote interface. I don't remember any lazy initialization issues here, because prior to transmission all objects have to be serialized (with lazy collections initialized) on the server side. As a result, the whole object graph is passed over the network, which might cause serious CPU and network overhead. In addition, for deserialization to be possible, you will have to share the JPA classes with the remote app. And that's where and how the 'EJB magic' ends :)
So, once remote calls are possible, I'd suggest to start thinking of data transfer strategy and non-JPA data transfer objects as additional data layer. In my case, I've annotated DTO classes for XML binding (JAXB) and reused them in web-services.
Short answer: If you are using an "Integration Layer" approach, the things you should be integrating should be loosely coupled services, following SOA principles.
This means you should not be allowing remote calls to methods on entities that could be making calls to the framework under the hood on another server. If you do this, you are really building a tightly coupled distributed application and you will have to worry about the lazy loading problems and the scope of the persistence context. If you want that, you might like to consider extended persistence contexts http://docs.jboss.org/ejb3/docs/tutorial/extended_pc/extended.html.
You have talked about a "business tier", but JPA does not provide a business tier. It provides entities and allows CRUD operations, but these are typically not business operations. A "RegisterUser" operation is not simply a question of persisting a "User" entity. Your DAO layer may offer a higher level of operation, but DAOs are typically a thin layer over the database and remain very data-centric.
A better approach is to define business service type operations and make those the services that you expose. You might want another layer on top of your DAO or you might want to have one layer (convert your DAO layer).
Your business layer should call flush, handle any JPA exceptions and hide all of that from the caller.
The issue of how to transfer your data remains. In many cases the parameters of your business service requests will be similar to your JPA entities, but I think you will notice that often there are sufficient differences that you want to define new DTOs. For example, a "RegisterUser" business operation might update both the "User" and "EmailAddresses" tables. The User table might include a "createdDate" property which is not part of the "RegisterUser" operation, but is set to the current date.
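The "RegisterUser" example might be sketched like this in plain Java. Everything here is hypothetical: the field names, the RegistrationService class, and the two in-memory lists that stand in for the "User" and "EmailAddresses" tables. A real service would persist through JPA inside a transaction; the injected java.time.Clock is only there to make the creation date testable.

```java
import java.time.Clock;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical request DTO: only what the caller supplies, no createdDate.
final class RegisterUserRequest {
    final String username;
    final String email;

    RegisterUserRequest(String username, String email) {
        this.username = username;
        this.email = email;
    }
}

// Hypothetical entities; in a real application these would be JPA entities.
class User {
    final String username;
    final Instant createdDate;

    User(String username, Instant createdDate) {
        this.username = username;
        this.createdDate = createdDate;
    }
}

class EmailAddress {
    final String username;
    final String address;

    EmailAddress(String username, String address) {
        this.username = username;
        this.address = address;
    }
}

// Business operation: one request touches two "tables" and sets server-side fields.
class RegistrationService {
    private final Clock clock;
    final List<User> users = new ArrayList<>();                  // stand-in for the User table
    final List<EmailAddress> emailAddresses = new ArrayList<>(); // stand-in for EmailAddresses

    RegistrationService(Clock clock) {
        this.clock = clock;
    }

    User registerUser(RegisterUserRequest request) {
        // createdDate is decided by the service, never taken from the request.
        User user = new User(request.username, Instant.now(clock));
        users.add(user);
        emailAddresses.add(new EmailAddress(request.username, request.email));
        return user;
    }
}
```

The request DTO and the entities differ on purpose: the caller cannot set createdDate, and one request fans out into two stored records.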
For creating DTOs, you might like to look at Project Lombok.
To copy the DTO to the Entity, you can use Apache Commons BeanUtils (e.g., PropertyUtils.copyProperties) to do a lot of the leg work, which works if the property names are the same.
Personally, I don't see the point in XML in this case, unless you want to totally decouple your implementations.