In my REST API project (Java 8, Spring Boot 2.3.1) I have a problem with some queries triggering massive query chains by loading lazy relations, even though the related objects are never accessed.
I have a UserEntity and a polymorphic CompanyEntity that are related through a ManyToMany relationship. I have an endpoint that returns all users, and I include the IDs of the related companies in the JSON. I expect one query to the user table and one query to the company table. However, for every row of one CompanyEntity subclass, its related entities are always loaded as well, resulting in large query chains.
Here are snippets of my classes:
User entity
@Entity(name = "USERS")
public class UserEntity {
    @Id
    @GeneratedValue
    private UUID id;
    @EqualsAndHashCode.Exclude
    @Fetch(FetchMode.SUBSELECT)
    @ManyToMany(fetch = FetchType.LAZY)
    @JoinTable(
        name = "users_company",
        joinColumns = @JoinColumn(name = "USER_ID"),
        inverseJoinColumns = @JoinColumn(name = "COMPANY_ID")
    )
    private Set<CompanyEntity> companies = new HashSet<>();
    public List<UUID> getCompanyIds() {
        return companies.stream()
            .map(CompanyEntity::getId)
            .collect(Collectors.toList());
    }
}
Polymorphic company entity
@Entity(name = "COMPANY")
@Inheritance(strategy = InheritanceType.JOINED)
public abstract class CompanyEntity {
    @Id
    @GeneratedValue
    private UUID id;
    @Fetch(FetchMode.SUBSELECT)
    @ManyToMany(mappedBy = "companies", fetch = FetchType.LAZY)
    private Set<UserEntity> users = new HashSet<>();
}
Concrete company subclass that triggers the problem
@Entity(name = "CUSTOMER")
public class CustomerEntity extends CompanyEntity {
    @NotNull
    @OneToOne(cascade = {CascadeType.PERSIST, CascadeType.MERGE}, fetch = FetchType.LAZY)
    private ContactPersonEntity contactPerson;
    @Fetch(FetchMode.SUBSELECT)
    @OneToMany(cascade = {CascadeType.PERSIST, CascadeType.MERGE}, fetch = FetchType.LAZY, mappedBy = "customer")
    private Set<TransactionEntity> transactions = new HashSet<>();
    public Set<UUID> getTransactionIds() {
        return this.transactions.stream()
            .map(TransactionEntity::getId)
            .collect(Collectors.toSet());
    }
}
In the REST controller I return the following mapping:
@GetMapping(value = "", produces = MediaType.APPLICATION_JSON_VALUE)
public List<UserReadModel> getUsers() {
    return userRepository.findAll().stream()
        .map(userEntity -> new UserReadModel(userEntity))
        .collect(Collectors.toList());
}
Where the UserReadModel is a DTO:
@Data
public class UserReadModel {
    private UUID id;
    private List<UUID> companyIds;
}
Logging the database queries results in the following output:
// Expected
Hibernate: select userentity0_.id as id1_47_, ... from users userentity0_
Hibernate: select companies0_.user_id ... case when companyent1_1_.id is not null then 1 when companyent1_2_.id is not null then 2 when companyent1_.id is not null then 0 end as clazz_0_ from users_company companies0_ inner join company companyent1_ on companies0_.company_id=companyent1_.id left outer join customer companyent1_1_ on companyent1_.id=companyent1_1_.id left outer join external_editor companyent1_2_ on companyent1_.id=companyent1_2_.id where companies0_.user_id in (select userentity0_.id from users userentity0_)
// Unexpected as they are marked lazy and never accessed
Hibernate: select contactper0_.id ... from contact_person contactper0_ where contactper0_.id=?
Hibernate: select transactio0_.customer_id ... from transactions transactio0_ where transactio0_.customer_id=?
Hibernate: select contactper0_.id ... from contact_person contactper0_ where contactper0_.id=?
Hibernate: select transactio0_.customer_id ... from transactions transactio0_ where transactio0_.customer_id=?
...
I've read through loads of articles on entity mapping and lazy loading but I can't seem to find a reason why this behavior persists. Did anyone have this problem before?
You are accessing the collection, so Hibernate has to load the collection. Since you only need the ids and already have a DTO, I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(UserEntity.class)
public interface UserReadModel {
    @IdMapping
    UUID getId();
    @Mapping("companies.id")
    Set<UUID> getCompanyIds();
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
UserReadModel a = entityViewManager.find(entityManager, UserReadModel.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
Page<UserReadModel> findAll(Pageable pageable);
The best part is, it will only fetch the state that is actually necessary! In your case, a query like the following will be generated:
select u.id, uc.company_id
from users u
left join users_company uc on uc.user_id = u.id
left join company c on c.id = uc.company_id
Depending on the Hibernate version, the join for the company might even be omitted.
I eventually figured out the solution and want to post it here, in case anyone stumbles upon this question. This was purely a mistake on my side and is not reproducible from the examples I posted.
I used Lombok annotations to generate the equals and hashCode methods on the customer entity (and on all other entities, for that matter) and forgot to annotate the contactPerson and transactions fields with @EqualsAndHashCode.Exclude. Because the equals method was called somewhere during execution, it triggered lazy loading of those fields. Implementing equals and hashCode manually, following the guidelines from this article, solved the problem.
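For readers wondering what a manual implementation can look like, here is a minimal sketch of one common approach: compare only the identifier and never touch the lazy associations. It is not necessarily what the linked article recommends. CustomerSketch is a hypothetical, simplified stand-in for CustomerEntity, with the JPA annotations and the lazy fields left out so that it compiles without a JPA provider.

```java
import java.util.UUID;

// Hypothetical, simplified stand-in for CustomerEntity. The JPA annotations and
// the lazy contactPerson / transactions fields are omitted on purpose.
class CustomerSketch {
    private final UUID id; // would carry @Id and @GeneratedValue in the real entity

    CustomerSketch(UUID id) {
        this.id = id;
    }

    UUID getId() {
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CustomerSketch)) {
            return false;
        }
        CustomerSketch other = (CustomerSketch) o;
        // Only the identifier is compared, so no lazy association is touched.
        // Instances without an id are equal only to themselves.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Constant per class, so the hash stays stable when the id is assigned later.
        return CustomerSketch.class.hashCode();
    }
}
```

The constant hash code trades some hash table performance for stability. Whether that trade is acceptable depends on how large the sets holding these entities get.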
Related
I have 2 entities: EntityA and EntityB. They are related with a One To Many relation.
public class EntityA {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name="ID", updatable = false, nullable = false)
    private long id;
    @OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
    @JoinColumn(name="ENTITY_A_ID", referencedColumnName="ID", nullable=true)
    private List<EntityB> entityBs;
    /* GETTERS SETTERS ... */
}
public class EntityB {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name="ID", updatable = false, nullable = false)
    private long id;
    @Column(name="SOME_PROPERTY")
    private String someProperty;
    @ManyToOne
    @JoinColumn(name="ENTITY_A_ID")
    private EntityA entityA;
    /* GETTERS SETTERS ... */
}
I have a query that joins EntityA with a LEFT JOIN to Entity B. And a 'ON' clause.
In normal SQL lingo this would be:
select * from EntityA eA left join EntityB eB
on (eA.ID = eB.ENTITY_A_ID and eB.SOME_PROPERTY = 'blabla')
where ...
So the joined result set contains information I need. I only want records joined if they match certain properties. I always need EntityA, plus an attached EntityB if that EntityB matches the join clause.
The project is set up with Hibernate / JPA. I can't figure out how to retrieve the information I need. At the moment I have:
public class EntityADAO {
    public List<EntityA> findMethod() {
        CriteriaBuilder builder = entityManager.getCriteriaBuilder();
        CriteriaQuery<EntityA> query = builder.createQuery(EntityA.class);
        Root<EntityA> entityARoot = query.from(EntityA.class);
        // The attribute name must match the mapped field, which is "entityBs"
        Join<EntityA, EntityB> entityBJoin = entityARoot.join("entityBs", JoinType.INNER);
        entityBJoin.on(new Predicate[] {builder.equal(entityBJoin.get("someProperty"), "fixed_val_for_now")});
        /* where clause left out for readability */
        TypedQuery<EntityA> q = entityManager.createQuery(query);
        return q.getResultList();
    }
}
So here I am, stuck with my list of EntityAs. Whenever I call getEntityBs() on an EntityA, I get all of them, which makes sense. But how can I retrieve only the joined set?
I'm stuck with JPA and Hibernate, as that choice was not made by me.
Thanks in advance!
What you need here is a custom projection or DTO. Filtering the entity collection might cause a delete because entities always reflect the current DBMS state and are synchronized at the end of the transaction.
You can write a JPQL query, just like the SQL one, that does what you want.
SELECT a.id, b.id
FROM EntityA a
LEFT JOIN EntityB b ON a.id = b.entityA.id AND b.someProperty = 'blabla'
But this won't help you with the materialization of the results into rich objects. If an Object[] i.e. the tuples are good enough for your use case, then use this kind of query and be done, but if you want to map to rich objects, I can recommend that you take a look at Blaze-Persistence Entity-Views.
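If plain tuples are enough, the rows of the JPQL query above can be grouped by hand. The following is a minimal sketch, assuming each row is an Object[] holding {a.id, b.id} as Long values and that b.id is null when the left join found no match. TupleGrouping and groupByA are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups {a.id, b.id} tuples into a map from
// EntityA id to the ids of the EntityB rows that matched the join.
class TupleGrouping {
    static Map<Long, List<Long>> groupByA(List<Object[]> rows) {
        // LinkedHashMap keeps the EntityA ids in the order the query returned them
        Map<Long, List<Long>> result = new LinkedHashMap<>();
        for (Object[] row : rows) {
            Long aId = (Long) row[0];
            Long bId = (Long) row[1];
            // Every EntityA gets an entry, even when nothing matched the join
            List<Long> bIds = result.computeIfAbsent(aId, key -> new ArrayList<>());
            if (bId != null) {
                bIds.add(bId);
            }
        }
        return result;
    }
}
```

The rows would typically come from something like entityManager.createQuery(jpql, Object[].class).getResultList(), where jpql holds the query text shown above.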
Blaze-Persistence is a query builder on top of JPA which supports many of the advanced DBMS features on top of the JPA model. I created Entity Views on top of it to allow easy mapping between JPA models and custom interface defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure the way you like and map attributes (getters) via JPQL expressions to the entity model. Since the attribute name is used as the default mapping, you mostly don't need explicit mappings, as 80% of the use cases are DTOs that are a subset of the entity model.
A mapping for your model could look as simple as the following
@EntityView(EntityA.class)
public interface EntityAView {
    long getId();
    @Mapping("entityBs[someProperty = 'blabla']")
    List<EntityBView> getEntityBs();
}
@EntityView(EntityB.class)
public interface EntityBView {
    long getId();
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
EntityAView dto = entityViewManager.find(entityManager, EntityAView.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
I have an entity Workflow which has a @OneToMany relation with the ValidationResults class. It is fetched lazily, but sometimes I would like to get all the Workflows and iterate over them, accessing the ValidationResults. In that case I want JPA to fetch all the data eagerly instead of issuing a query each time I access ValidationResults. I use Spring Data JPA. How can I do this? Is there a way to do it with @Query?
I try to achieve something like that but I don't know how
//here all the workflows has corresponding data eagerly
List<Workflow> workflows = workflowService.getAllWorkflowsWithValidationResultsEagerly();
//here validationResults ref is lazy, when I try to access it it does query
List<Workflow> workflows = workflowService.getAllWorkflowsUsually();
Here are my entities.
@Entity
@Table(name = "workflow")
public class Workflow {
    ..............
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "workflow", cascade = CascadeType.ALL)
    private Set<ValidationResults> validationResultsSet = new HashSet<>();
    public Set<ValidationResults> getValidationResultsSet(){return this.validationResultsSet;}
    ...............
}
And ValidationResult class
@Entity
@Table(name = "validation_results")
public class ValidationResults {
    ...
    @ManyToOne
    @JoinColumn(name = "workflow_id", insertable = false, updatable = false)
    private Workflow workflow;
    ....
}
The Spring Boot way of doing this is to use @EntityGraph, as described in the documentation.
You can use a fetch join to do it with @Query: https://www.logicbig.com/tutorials/java-ee-tutorial/jpa/fetch-join.html
@Query("SELECT DISTINCT e FROM Employee e INNER JOIN FETCH e.tasks t")
If you don't want to create another query, just call .size() of your list
I have a list of projects and a list of customers. A project can be for one customer and every customer can have many projects. So it's a simple 1:n relationship where the project is the owning side.
Simplified to the essential it is
@Entity
public class Project {
    @Id
    long id;
    @ManyToOne(optional = true)
    @JoinColumn(name = "customer", nullable = true, updatable = true)
    Customer customer;
}
@Entity
public class Customer {
    @Id
    long id;
}
When I load a list of projects, I want to retrieve the customers efficiently at the same time. This is not the case. There is one single query for the projects and then for every distinct customer that is encountered a separate query is issued.
So say I have 100 projects that are assigned to 50 different customers. This would result in one query for the projects and 50 queries for the customers.
This quickly adds up and for large project/customer lists our application gets rather slow. Also this is just one example. All our entities with relationships are affected by this behavior.
I already tried @Fetch(FetchMode.JOIN) on the customer field as suggested here, but it does nothing, and FetchMode.SUBSELECT is not applicable according to Hibernate:
org.hibernate.AnnotationException: Use of FetchMode.SUBSELECT not allowed on ToOne associations
How can I fix this problem?
If you are using Spring Data JPA to implement your repositories, you can specify lazy fetching in the JPA entities:
@Entity
public class Project {
    @Id
    long id;
    @ManyToOne(fetch = FetchType.LAZY, optional = true)
    @JoinColumn(name = "customer", nullable = true, updatable = true)
    Customer customer;
}
@Entity
public class Customer {
    @Id
    long id;
    ...
}
And add @EntityGraph to your Spring Data JPA-based repository:
@Repository
public interface ProjectDao extends JpaRepository<Project, Long> {
    @EntityGraph(
        type = EntityGraphType.FETCH,
        attributePaths = {
            "customer"
        }
    )
    Optional<Project> findById(Long id);
    ...
}
My blog post at https://tech.asimio.net/2020/11/06/Preventing-N-plus-1-select-problem-using-Spring-Data-JPA-EntityGraph.html shows how to prevent the N+1 select problem using Spring Data JPA and @EntityGraph.
Yes, it is a by-the-book example of the n+1 selects problem.
The approach I use in most cases is to make the association lazy and define a batch size.
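In Hibernate, a batch size is usually configured with the @BatchSize annotation on the target entity class or with the hibernate.default_batch_fetch_size property. To give a feel for the effect, here is a rough back-of-the-envelope sketch in plain Java (no Hibernate involved) that estimates the number of queries for the example from the question. BatchFetchEstimate and its methods are hypothetical names, and the batched figure is an idealised estimate: the exact number Hibernate issues depends on its version and batch fetching strategy.

```java
// Hypothetical helper that estimates query counts; it does not talk to a database.
class BatchFetchEstimate {

    // Without batching: one query for the projects plus one per distinct customer.
    static int withoutBatching(int distinctCustomers) {
        return 1 + distinctCustomers;
    }

    // Idealised estimate with batching: one query for the projects plus one
    // query per full or partial batch of customers.
    static int withBatchSize(int distinctCustomers, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        int batches = (distinctCustomers + batchSize - 1) / batchSize; // ceiling division
        return 1 + batches;
    }
}
```

For the 50 distinct customers from the question, withoutBatching(50) gives 51 queries, while withBatchSize(50, 25) gives 3.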
Alternatively, you could use a JPQL query with [left] join fetch to initialize the association directly from the query result set:
select p from Project p left join fetch p.customer
Yes, it is a by-the-book example of the n+1 selects problem, as @dragan-bozanovic said.
In Spring Boot 2.1.3, @Fetch(FetchMode.JOIN) can be used to solve it:
@ManyToOne(optional = true)
@Fetch(FetchMode.JOIN)
@JoinColumn(name = "customer", nullable = true, updatable = true)
Customer customer;
Warning: If the relationship can be invalid, for example when marked with @NotFound(action = NotFoundAction.IGNORE), each invalid relationship will trigger another SELECT query.
Consider the following model
@Entity
// JPA and JAXB annotations here
public class Employee implements Serializable {
    // other fields, annotations, stuffs
    ...
    @ElementCollection(fetch = FetchType.LAZY,
                       targetClass = Address.class)
    @CollectionTable(name = "employee_address",
                     schema = "hris",
                     joinColumns = @JoinColumn(name = "employee_id",
                                               nullable = false,
                                               referencedColumnName = "employee_id",
                                               foreignKey = @ForeignKey(ConstraintMode.CONSTRAINT)))
    protected Set<Address> addresses;
    // setters, getters
    ...
}
@Embeddable
// JAXB annotations here
public class Address implements Serializable {
    // fields, setters, getters
}
The Address class is annotated with the @Embeddable annotation, and the Employee class has an embedded element collection of addresses. The element collection's fetch is set to FetchType.LAZY. Now, I would like to create a @NamedQuery that would retrieve all employees with addresses eagerly initialized. Knowing that JOIN FETCH will only work with entity collections annotated with @OneToMany or @ManyToMany based on JPA 2.1, how would I create a valid JPQL query that would allow me to eagerly retrieve embedded element collections?
In the JPA 2.1 specification (JSR 338) I cannot find any hint that fetch joins only work on entity relationships (but not embeddables). JSR 338, section 4.4.5.3 even states:
A FETCH JOIN enables the fetching of an association or element collection as a side effect of the execution of a query.
As another hint the following minimal example (essentially resembling yours) executed with Hibernate 4.3.11 as JPA provider results in a single query:
Address embeddable:
@Embeddable public class Address { private String city; }
Employee entity:
@Entity public class Employee {
    @Id private Long id;
    @ElementCollection(fetch = FetchType.LAZY)
    @CollectionTable(name = "address",
                     joinColumns = @JoinColumn(name="employee_id"))
    private Set<Address> addresses;
}
JPQL Query:
em.createQuery("select e from Employee e join fetch e.addresses").getResultList();
Resulting SQL query:
select
employee0_.id as id1_1_,
addresses1_.employee_id as employee1_1_0__,
addresses1_.city as city2_5_0__
from
Employee employee0_
inner join
address addresses1_ on employee0_.id=addresses1_.employee_id
So the above JPQL query seems to solve your problem.
By the way, a more effective way might be to use a subselect instead of a join:
@Fetch(FetchMode.SUBSELECT)
@BatchSize(size=500)
It issues two selects instead of one, but doesn't produce so much ambiguity.
JBoss EAP 6
Hibernate 4
I have a J2EE application with a web browser client. ( Apache click )
Both the internal business logic and the client use the same entity objects.
I would like to have all relations in the entities set to lazy loading. This way I have good performance.
But when using the entities in the client ( that is the server side code of apache click ) I would need a lot of the relations to be eager loaded. The client code is accessing the back-end through a session bean.
So I have a couple of ways I can solve this:
1. Create two versions of each JPA entity, one with eager loading and one with lazy loading. Then use the eager one in the client and the lazy one in the server. Most of the server logic will run in a transaction, so lazy loading is fine there.
2. Make all relations lazy. When accessing the entities from the client, make sure there is a transaction (@TransactionAttribute(TransactionAttributeType.REQUIRED)) and code the access to the necessary fields so that they are accessible after the session bean call. But that means I have to start a transaction when one is not required, i.e. if I am only getting some objects. I also have to maintain more code, and I have to know exactly which relations the client needs.
3. Create an inheritance hierarchy with a super entity and two children: one with object relations lazy loaded, and one with only values and no objects, i.e.:
Super
@MappedSuperclass
public class SuperOrder {
    @Id
    @Column(name = "id")
    @GeneratedValue(.....)
    private Long id;
    @Column(name = "invoice", length = 100)
    private String invoice;
Child 1
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@Table(name = "testorder")
@SequenceGenerator(....)
public class Order extends SuperOrder {
    @ManyToOne(targetEntity = PrintCustomerEnt.class, fetch = FetchType.EAGER, optional = true)
    @JoinColumn(name = "print_customer_id", nullable = true)
    @ForeignKey(name = "fk_print_customer")
    @Valid
    private PrintCustomerEnt printCustomer;
    public PrintCustomerEnt getPrintCustomer() {
        return printCustomer;
    }
    public void setPrintCustomer(final PrintCustomerEnt printCustomer) {
        this.printCustomer = printCustomer;
    }
}
Child 2
@Entity
@Inheritance(strategy = InheritanceType.SINGLE_TABLE)
@Table(name = "testorder")
@SequenceGenerator(...)
public class LazyOrder extends SuperOrder {
    @Transient
    private String printCustomerName;
    @Column(name = "print_customer_id", nullable = true)
    private Long printCustomerId;
What is the best practice, or is there some other good way to do this?
Basically the problem is I want to use the same entities in different scenarios. Sometimes I need eager loading, and sometimes I need lazy loading.
I suggest that you create just one JPA entity with lazy relationships, and when you need to load eagerly some of them create a Service that uses JPQL(HQL) to do some FETCH trick. The idea is one JPA entity and many services.
I've been programming in JPA 2 for a while now, and I can say there are a couple of unwritten rules that I almost always apply:
Use LAZY initialization on all your OneToMany and ManyToMany relations
Use EAGER initialization on all your OneToOne and ManyToOne relations
These rules apply to 99% of my projects. I consider them best practices based on my personal experience and some research I've done.
Note: I do not use JOIN FETCH with lazy initialization; instead I write a prefetch method. Example:
@Entity
class Entity {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Basic(optional = false)
    private Integer id;
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "mappedName",
               orphanRemoval = true)
    private List<Child1> collection1;
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "mappedName",
               orphanRemoval = true)
    private List<Child2> collection2;
}
And then we have the Controller:
class EntityController {
    public Entity findCompraFolioFull(Integer id) {
        EntityManager em = getEntityManager();
        try {
            Entity entity = em.find(Entity.class, id);
            // Initialize the collections inside the transaction. This prevents a
            // LazyInitializationException later in the code when the
            // uninitialized collections are accessed.
            // getCollection1() and getCollection2() are assumed getters for the fields above.
            entity.getCollection1().size();
            entity.getCollection2().size();
            return entity;
        } finally {
            em.close();
        }
    }
}
I don't recommend FETCH JOIN
public Entity findEntityByJoinFetch(Integer id) {
    EntityManager em = getEntityManager();
    try {
        TypedQuery<Entity> tq = em.createQuery(
            "SELECT e FROM Entity e\n"
            + "LEFT JOIN FETCH e.collection1\n"
            + "LEFT JOIN FETCH e.collection2\n"
            + "WHERE e.id = :id", Entity.class);
        tq.setParameter("id", id);
        return tq.getSingleResult();
    } finally {
        em.close();
    }
}
Reasons I don't recommend the fetch join approach:
If your collections are of type java.util.List, this getSingleResult() call will fail in Hibernate with a MultipleBagFetchException, because Hibernate cannot fetch multiple bags simultaneously unless the OneToMany relations are indexed.
You can always change the type of your collections to java.util.Set so that multiple collections can be fetched, but this brings new situations to deal with: Sets aren't ordered, and the default hashCode() method won't work correctly, so you'll have to override it in the child classes. Also, if you are using a JavaFX TableView to bind the model to its items, you won't be able to bind Set-typed collections to the items property of the TableView, at least not directly.