spring data jpa unnecessary left join - java

I have the following model:
I want to get all institutions (Instituciones) with a specified sectorId.
In the tbInstitucion model I have a relationship with tbSector:
@ManyToOne(fetch=FetchType.LAZY)
@JoinColumn(name="`sectorId`")
private Sector sector;
Is there a way to obtain a query like:
select *
from tbInstitucion
where sectorId = ?
I tried findBySector(Sector sector),
but that requires an additional query to load the sector first, and findBySector generates the following query:
select
generatedAlias0.institucionId,
generatedAlias0.institucionNombre
from
Institucion as generatedAlias0
left join
generatedAlias0.sector as generatedAlias1
where
generatedAlias1=:param0
I also tried this one:
findBySector_sectorId
which generates the above query as well.
Wouldn't it be better to form a query like:
select *
from tbInstitucion
where sectorId = ?
Is there a way to get the above query?
Why is JPA generating the left join?

A quick review of the entity model
@Entity
class Institucion {
    @ManyToOne(fetch=FetchType.LAZY)
    @JoinColumn(name="`sectorId`")
    private Sector sector;
}
is equivalent to:
@Entity
class Institucion {
    @ManyToOne(cascade = {}
        , fetch=FetchType.LAZY
        , optional = true
        , targetEntity = void.class)
    @JoinColumn(columnDefinition = ""
        , foreignKey = @ForeignKey(ConstraintMode.PROVIDER_DEFAULT)
        , insertable = true
        , name="`sectorId`"
        , nullable = true
        , referencedColumnName = ""
        , table = ""
        , unique = false
        , updatable = true)
    private Sector sector;
}
Note @ManyToOne(optional = true) and @JoinColumn(nullable = true). This tells the ORM that the sector attribute of Institucion is optional and may be null.
How the entity model impacts repository queries
Now consider the following repository:
public interface InstitucionRepository extends CrudRepository<Institucion, Long> {
List<Institucion> findAllByInstitucionNombre(String nombre);
List<Institucion> findAllByInstitucionEmail(String email);
}
Given the entity declaration above, the repository methods should produce queries such as:
select
generatedAlias0
from
Institucion as generatedAlias0
left join
generatedAlias0.sector as generatedAlias1
where
generatedAlias0.institucionNombre=:param0
and
select
generatedAlias0
from
Institucion as generatedAlias0
left join
generatedAlias0.sector as generatedAlias1
where
generatedAlias0.institucionEmail=:param0
This is because the entity model declares sector as optional, so the ORM needs to load every Institucion regardless of whether it has a sector.
Following this pattern, the following repository method:
List<Institucion> findAllBySector(Sector sector);
translates to:
select
generatedAlias0
from
Institucion as generatedAlias0
left join
generatedAlias0.sector as generatedAlias1
where
generatedAlias1=:param0
Solution 1
If Institucion.sector is not optional, make it mandatory in the model too:
@ManyToOne(fetch=FetchType.LAZY, optional = false)
@JoinColumn(name="`sectorId`", nullable = false)
private Sector sector;
Solution 2
If Institucion.sector is indeed optional, only a manual query such as the one shown in @MaciejKowalski's answer will work.
Simplified query
The following query will also work:
List<Institucion> findAllBySectorSectorId(Long id);
This assumes that the model attribute names are exactly as shown in the post.

Left join is the default implicit join strategy, including when the @EntityGraph feature is used.
I would recommend an explicit @Query definition:
@Query("select i from Institucion i inner join i.sector s where s.sectorId = :sectorId")
List<Institucion> getBySector(@Param("sectorId") Integer sectorId);

Related

Spring Data JPA retrieve entities where root field or list entity field matches criteria

The following entity relationship exists: A 1 <-> * B (edited down for brevity)
@Entity
public class EntityA {
    @Id
    UUID id;
    @Column
    String searchText;
    // mappedBy must name the owning field on EntityB (entityA), not the column
    @OneToMany(mappedBy = "entityA", fetch = FetchType.LAZY)
    List<EntityB> listOfB;
}
@Entity
public class EntityB {
    @Id
    UUID id;
    @Column
    String searchText;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "entity_a_id")
    EntityA entityA;
}
I would like to retrieve all entities (EntityA) where a specific field (searchText) matches a criterion, which is easy enough. I also want to go through all linked entities (EntityB) whose searchText field matches the criterion, returning only the matching ones. The result can be in any format, e.g.:
Map<EntityA, List<EntityB>> (the parent EntityA, possibly matching, with matching children EntityB)
List<EntityA_EntityBSearchResult> (some custom domain model)
List<EntityA> (bad: the returned entities would no longer conform to the expected relationship)
There are two main constraints complicating this:
The outer entity (EntityA) should stay pageable
The resulting list (of EntityB) is limited to a maximum of 10 items
What I've tried:
Building the SQL before converting to JPA Query
For simplicity the criteria is a simple like '%textToFind%' on field searchText
1. Retrieve all of EntityB where EntityA or EntityB matches criteria.
select a.*, b.* from entity_a a
inner join entity_b b on b.entity_a_id = a.id
where a.search_text like '%textToFind%' or b.search_text like '%textToFind%'
Good: Pageable on EntityA
Bad: EntityB not limited to max of 10 items. Grouping will have to be done afterwards to Map<EntityA, List<EntityB>>.
2. Retrieve all of EntityB where EntityA or EntityB matches criteria - limited to 10 of EntityB.
select * from entity_b where id in (
select limitedB.id from (
select ROW_NUMBER() OVER(PARTITION BY a.id ORDER BY b.created_date_time DESC) AS RowNumber, b.* from entity_a a
inner join entity_b b on b.entity_a_id = a.id
where a.search_text like '%textToFind%' or b.search_text like '%textToFind%') limitedB
WHERE RowNumber <= 10)
Good: EntityB limited to 10
Bad: Not pageable on EntityA
3. Retrieve all of EntityA first (where EntityA or EntityB matches criteria). Followed by retrieval of EntityB (where EntityA or EntityB matches criteria) with parent EntityA id in previous list.
select DISTINCT a.* from entity_a a
inner join entity_b b on b.entity_a_id = a.id
where a.search_text like '%textToFind%' or b.search_text like '%textToFind%'
use the result to create an EntityA id list (List<UUID> EntityA_Ids). Send this list to the next query:
select * from entity_b where id in (
select limitedB.id from (
select ROW_NUMBER() OVER(PARTITION BY a.id ORDER BY b.created_date_time DESC) AS RowNumber, b.* from entity_a a
inner join entity_b b on b.entity_a_id = a.id
where a.search_text like '%textToFind%' or b.search_text like '%textToFind%') limitedB
WHERE RowNumber <= 10) and entity_a_id in EntityA_Ids
From this result a map (Map<EntityA, List<EntityB>>) can easily be created with Collectors.groupingBy.
Good: EntityA is pageable and EntityB is limited to 10
Bad: Multiple queries
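The final grouping step of approach 3 can be sketched in plain Java. This is only an illustration under assumptions: ChildRow is a hypothetical, simplified stand-in for an EntityB row (the parent is represented by its id rather than an EntityA instance), and the per-parent limit is re-applied in memory as a safety net even though the SQL above already enforces it.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ChildGrouping {

    // Hypothetical stand-in for an EntityB row returned by the second query.
    record ChildRow(String id, String parentId, String searchText) {}

    // Groups rows by parent id (keeping encounter order) and keeps at most
    // maxPerParent children for each parent.
    static Map<String, List<ChildRow>> groupByParent(List<ChildRow> rows, int maxPerParent) {
        Map<String, List<ChildRow>> grouped = rows.stream()
                .collect(Collectors.groupingBy(ChildRow::parentId, LinkedHashMap::new, Collectors.toList()));
        grouped.replaceAll((parentId, children) ->
                children.stream().limit(maxPerParent).collect(Collectors.toList()));
        return grouped;
    }

    public static void main(String[] args) {
        List<ChildRow> rows = List.of(
                new ChildRow("b1", "a1", "textToFind one"),
                new ChildRow("b2", "a1", "textToFind two"),
                new ChildRow("b3", "a1", "textToFind three"),
                new ChildRow("b4", "a2", "textToFind four"));
        // With a limit of 2, parent a1 keeps b1 and b2; parent a2 keeps b4.
        System.out.println(groupByParent(rows, 2));
    }
}
```

Keying the map by parent id avoids relying on equals/hashCode of the entity class; with real entities the classifier would be EntityB::getEntityA instead, assuming such a getter exists.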

How to get batching using the old hibernate criteria?

I'm still using the old org.hibernate.Criteria and am getting more and more confused about fetch modes. In various queries I need all of the following variants, so I can't control it via annotations. I'm just switching everything to @ManyToOne(fetch=FetchType.LAZY), as otherwise there's no chance to change anything in the query.
What I could find so far either concerns HQL or JPA2 or offers just two choices, but I need it for the old criteria and for (at least) the following three cases:
Do a JOIN, and fetch from both tables. This is OK unless the data is too redundant (e.g., the master data is big or repeated many times in the result). In SQL, I'd write
SELECT * FROM item JOIN order on item.order_id = order.id
WHERE ...;
Do a JOIN, fetch from the first table, and then separately from the other. This is usually the more efficient variant of the previous query. In SQL, I'd write
SELECT item.* FROM item JOIN order on item.order_id = order.id
WHERE ...;
SELECT order.* FROM order WHERE ...;
Do a JOIN, but do not fetch the joined table. This is useful, e.g., for sorting based on data from the other table. In SQL, I'd write
SELECT item.* FROM item JOIN order on item.order_id = order.id
WHERE ...
ORDER BY order.name, item.name;
It looks like without explicitly specifying fetch=FetchType.LAZY, everything gets fetched eagerly as in the first case, which is sometimes too costly. I guess that, using Criteria#setFetchMode, I can get the third case. I haven't tried it out yet, as I'm still missing the second case. I know that it's somehow possible, as there's the @BatchSize annotation.
Am I right with the above?
Is there a way to get the second case with the old criteria?
Update
It looks like using createAlias() leads to fetching everything eagerly. There are some overloads allowing to specify the JoinType, but I'd need to specify the fetch type. Now, I'm confused even more.
Yes you can satisfy all three cases using FetchType.LAZY, BatchSize, the different fetch modes, and projections (note I just made up a 'where' clause with Restrictions.like("name", "%s%") to ensure that I retrieved many rows):
Do a JOIN, and fetch from both tables.
Because the order of an item is FetchType.LAZY, the default fetch mode will be 'SELECT' so it just needs to be set as 'JOIN' to fetch the related entity data from a join rather than separate query:
Session session = entityManager.unwrap(org.hibernate.Session.class);
Criteria cr = session.createCriteria(Item.class);
cr.add(Restrictions.like("name", "%s%"));
cr.setFetchMode("order", FetchMode.JOIN);
List results = cr.list();
results.forEach(r -> System.out.println(((Item)r).getOrder().getName()));
The resulting single SQL query:
select
this_.id as id1_0_1_,
this_.name as name2_0_1_,
this_.order_id as order_id3_0_1_,
order2_.id as id1_1_0_,
order2_.name as name2_1_0_
from
item_table this_
left outer join
order_table order2_
on this_.order_id=order2_.id
where
this_.name like ?
Do a JOIN, fetch from the first table and then separately from the other.
Leave the fetch mode as the default 'SELECT', create an alias for the order to use its columns in sorting, and use a projection to select the desired subset of columns including the foreign key:
Session session = entityManager.unwrap(org.hibernate.Session.class);
Criteria cr = session.createCriteria(Item.class);
cr.add(Restrictions.like("name", "%s%"));
cr.createAlias("order", "o");
cr.addOrder(org.hibernate.criterion.Order.asc("o.id"));
cr.setProjection(Projections.projectionList()
.add(Projections.property("id"), "id")
.add(Projections.property("name"), "name")
.add(Projections.property("order"), "order"))
.setResultTransformer(org.hibernate.transform.Transformers.aliasToBean(Item.class));
List results = cr.list();
results.forEach(r -> System.out.println(((Item)r).getOrder().getName()));
The resulting first SQL query:
select
this_.id as y0_,
this_.name as y1_,
this_.order_id as y2_
from
item_table this_
inner join
order_table o1_
on this_.order_id=o1_.id
where
this_.name like ?
order by
o1_.id asc
and subsequent batches (note I used @BatchSize(size = 5) on the Order class):
select
order0_.id as id1_1_0_,
order0_.name as name2_1_0_
from
order_table order0_
where
order0_.id in (
?, ?, ?, ?, ?
)
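The five placeholders in the IN clause correspond to the batch size of 5. As a rough plain-Java illustration of the idea (not Hibernate's actual implementation, which picks the ids of uninitialized proxies in the session and may choose different batch sizes), the following sketch splits a list of pending order ids into batches and builds the matching placeholder list. BatchSketch, partition and placeholders are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BatchSketch {

    // Splits ids into consecutive batches of at most batchSize elements.
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Builds the "?, ?, ..." placeholder list for an IN clause with n parameters.
    static String placeholders(int n) {
        return String.join(", ", Collections.nCopies(n, "?"));
    }

    public static void main(String[] args) {
        List<Long> pendingOrderIds = new ArrayList<>();
        for (long id = 1; id <= 12; id++) {
            pendingOrderIds.add(id);
        }
        // Twelve ids with a batch size of 5 give batches of 5, 5 and 2 ids.
        for (List<Long> batch : partition(pendingOrderIds, 5)) {
            System.out.println("where order0_.id in (" + placeholders(batch.size()) + ") <- " + batch);
        }
    }
}
```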
Do a JOIN, but do not fetch the joined table.
Same as the previous case, but don't do anything to prompt the loading of the lazy-loaded orders:
Session session = entityManager.unwrap(org.hibernate.Session.class);
Criteria cr = session.createCriteria(Item.class);
cr.add(Restrictions.like("name", "%s%"));
cr.createAlias("order", "o");
cr.addOrder(org.hibernate.criterion.Order.asc("o.id")); // fully qualified to avoid a clash with the Order entity
cr.setProjection(Projections.projectionList()
.add(Projections.property("id"), "id")
.add(Projections.property("name"), "name")
.add(Projections.property("order"), "order"))
.setResultTransformer(org.hibernate.transform.Transformers.aliasToBean(Item.class));
List results = cr.list();
results.forEach(r -> System.out.println(((Item)r).getName()));
The resulting single SQL query:
select
this_.id as y0_,
this_.name as y1_,
this_.order_id as y2_
from
item_table this_
inner join
order_table o1_
on this_.order_id=o1_.id
where
this_.name like ?
order by
o1_.id asc
My entities for all cases remained the same:
@Entity
@Table(name = "item_table")
public class Item {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;
    @ManyToOne(fetch = FetchType.LAZY)
    private Order order;
    // getters and setters omitted
}
@Entity
@Table(name = "order_table")
@BatchSize(size = 5)
public class Order {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    private String name;
    // getters and setters omitted
}

How to select index column in Hibernate?

I have a database table Communications with type, value and a foreign key as index that maps back to a Person table declared as follows:
@Table(name = "communication", schema = "schema")
@org.hibernate.annotations.Table(appliesTo = "communication", indexes = {
    @Index(name = "idx_communication_person_id", columnNames = { "person_id" })
})
And the Person object maps to this as:
@OneToMany(fetch = LAZY, cascade = ALL, orphanRemoval = true)
@JoinColumn(name = "person_id")
@OrderColumn
@Index(name = "idx_communication_person_id")
private final List<Communication> communications
Now I want to create an HQL query with Hibernate that selects based on this index column, like:
WHERE person.id in ( SELECT c.person_id FROM Communication c WHERE c.type = 3 AND c.value = 'john.doe@server.com' )
That doesn't work, because HQL doesn't know c.person_id at this point, because index columns are in general unknown to HQL.
How do I properly address the index in HQL, or, if that is not possible, how do I write the statement to achieve the same as the native-like query above?
EDIT: For performance reasons there must not be a JOIN in any form.
I think you need something like this:
SELECT p FROM Person p
JOIN p.communications c
WHERE c.type = 3 AND c.value = 'john.doe@server.com'
"That doesn't work, because HQL doesn't know c.person_id at this point, because index columns are in general unknown to HQL."
This doesn't make much sense to me.
If you want to have an HQL statement that returns a list of identifiers for Person based on some criteria, you can easily do it much like how your SQL statement is written.
SELECT p.id
FROM Communication c JOIN c.person p
WHERE c.type = :communicationType
AND c.value = :emailAddress
If you actually want persons, select the joined person rather than p.id in order to hydrate all Persons. The following query also lets you restrict by a person identifier if needed.
SELECT p
FROM Communication c JOIN c.person p
WHERE p.id = :personId
AND c.type = :communicationType
AND c.value = :emailAddress
UPDATE
If you don't want to use any joins, then simply expose the personId as a numeric value on your Communication entity without any association mappings.
public class Communication {
    // Read-only view of the foreign key column declared in the question
    @Column(name = "person_id", nullable = false, insertable = false, updatable = false)
    private Long personId;
}
You should then be able to issue a query such as:
SELECT c.personId
FROM Communication c
WHERE c.type = :communicationType
AND c.value = :emailAddress

HQL. Intersection of two lists

Update: see my answer below on how to check whether two lists intersect (both for an @ElementCollection of strings/enums and for regular entity lists mapped with, e.g., @OneToMany).
I have an entity which contains an @ElementCollection field with enums.
public enum StatusType {
NEW, PENDING, CLOSED;
}
@Entity
public class MyEntity {
    @ElementCollection
    @CollectionTable(name = "status_type", joinColumns = {@JoinColumn(name = "my_entity_id")})
    @Column(name = "status_type", nullable = false)
    private Set<StatusType> statusTypes = new HashSet<StatusType>();
    ...
}
Now I want to get all entities which contain status NEW or PENDING (or both).
I'm trying to use this query:
SELECT DISTINCT u FROM MyEntity u WHERE u.statusTypes in :statusTypes
But I'm getting exception: org.postgresql.util.PSQLException: No value specified for parameter 9.
How to properly query on collections and filter by intersections?
The problem was solved by adding a JOIN clause to the HQL. Hibernate couldn't implicitly recognize that the query needs a JOIN clause. Maybe this will help someone:
SELECT DISTINCT u FROM MyEntity u
LEFT JOIN u.statusTypes statusTypes
WHERE statusTypes in :statusTypes
I set the query params like this:
query.setParameter( "statusTypes", listOfStatusTypesEnums);
It will select rows where at least one element of the listOfStatusTypesEnums list is present in the entity's statusTypes property (i.e. if the two lists intersect in some way).
If you have a regular list of entities (mapped not with @ElementCollection but with @OneToMany etc.), the same rule works as well. Just use it like this: LEFT JOIN u.subEntities subEntities WHERE subEntities.id in :subEntityIds
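The "at least one element in common" condition that the JOIN ... IN query expresses has a direct in-memory counterpart, which can be handy for unit-testing the expected result set. A minimal sketch, assuming the StatusType enum from the question; IntersectionCheck and intersects are hypothetical names:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

final class IntersectionCheck {

    enum StatusType { NEW, PENDING, CLOSED }

    // True when the entity's statuses and the requested statuses share at least one element.
    static boolean intersects(Set<StatusType> entityStatuses, Set<StatusType> wanted) {
        return !Collections.disjoint(entityStatuses, wanted);
    }

    public static void main(String[] args) {
        Set<StatusType> wanted = EnumSet.of(StatusType.NEW, StatusType.PENDING);
        // An entity with NEW and CLOSED matches because NEW is requested.
        System.out.println(intersects(EnumSet.of(StatusType.NEW, StatusType.CLOSED), wanted));
        // An entity with only CLOSED does not match.
        System.out.println(intersects(EnumSet.of(StatusType.CLOSED), wanted));
    }
}
```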

Order by count using Spring Data JpaRepository

I am using Spring Data JpaRepository and I find it extremely easy to use. I actually need all those features - paging, sorting, filtering. Unfortunately there is one little nasty thing that seems to force me to fall back to use of plain JPA.
I need to order by a size of associated collection. For instance I have:
@Entity
public class A{
    @Id
    private long id;
    @OneToMany
    private List<B> bes;
    //boilerplate
}
and I have to sort by bes.size()
Is there a way to somehow customize the ordering still taking the advantage of pagination, filtering and other Spring Data great features?
I've solved the puzzle using hints and inspirations from:
Limiting resultset using @Query annotations by Koitoer
How to order by count() in JPA by MicSim
Exhaustive experiments on my own
The first and most important thing I had not been aware of about spring-data is that even with @Query custom methods one can still create paging queries by simply passing a Pageable object as a parameter. This could have been stated explicitly in the spring-data documentation, as it is a very powerful feature that is definitely not obvious.
Great, now the second problem: how do I actually sort the results by the size of an associated collection in JPA? I arrived at the following JPQL:
select new package.AwithBCount(count(b.id) as bCount,a) from A a join a.bes b group by a
where AwithBCount is a class that the query results are actually mapped to:
public class AwithBCount{
private Long bCount;
private A a;
public AwithBCount(Long bCount, A a){
this.bCount = bCount;
this.a = a;
}
//getters
}
Excited that I can now simply define my repository like the one below
public interface ARepository extends JpaRepository<A, Long> {
    @Query(
        value = "select new package.AwithBCount(count(b.id) as bCount,a) from A a join a.bes b group by a",
        countQuery = "select count(a) from A a"
    )
    Page<AwithBCount> findAllWithBCount(Pageable pageable);
}
I hurried to try my solution out. Perfect: the page is returned, but when I tried to sort by bCount I was disappointed. It turned out that since this is an ARepository (not an AwithBCount repository), spring-data will try to look for a bCount property in A instead of AwithBCount. So finally I ended up with three custom methods:
public interface ARepository extends JpaRepository<A, Long> {
    @Query(
        value = "select new package.AwithBCount(count(b.id) as bCount,a) from A a join a.bes b group by a",
        countQuery = "select count(a) from A a"
    )
    Page<AwithBCount> findAllWithBCount(Pageable pageable);
    @Query(
        value = "select new package.AwithBCount(count(b.id) as bCount,a) from A a join a.bes b group by a order by bCount asc",
        countQuery = "select count(a) from A a"
    )
    Page<AwithBCount> findAllWithBCountOrderByCountAsc(Pageable pageable);
    @Query(
        value = "select new package.AwithBCount(count(b.id) as bCount,a) from A a join a.bes b group by a order by bCount desc",
        countQuery = "select count(a) from A a"
    )
    Page<AwithBCount> findAllWithBCountOrderByCountDesc(Pageable pageable);
}
...and some additional conditional logic on the service level (which could probably be encapsulated in an abstract repository implementation). So, although not extremely elegant, that did the trick: this way (having more complex entities) I can sort by other properties and do the filtering and pagination.
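The service-level conditional logic is not shown in the answer; it could look roughly like the following. This is a plain-Java sketch under assumptions: CountQueries is a hypothetical stand-in for the three repository methods (generic over the paging argument and the page type so that it compiles without Spring on the classpath), and CountSort is an invented enum.

```java
final class CountSortDispatch {

    // Hypothetical: how the caller wants the results ordered by bCount.
    enum CountSort { NONE, ASC, DESC }

    // Hypothetical stand-in for the three repository methods.
    // P plays the role of Pageable, R the role of Page<AwithBCount>.
    interface CountQueries<P, R> {
        R findAllWithBCount(P pageable);
        R findAllWithBCountOrderByCountAsc(P pageable);
        R findAllWithBCountOrderByCountDesc(P pageable);
    }

    // Picks the repository method that matches the requested count ordering.
    static <P, R> R find(CountQueries<P, R> queries, CountSort sort, P pageable) {
        switch (sort) {
            case ASC:
                return queries.findAllWithBCountOrderByCountAsc(pageable);
            case DESC:
                return queries.findAllWithBCountOrderByCountDesc(pageable);
            default:
                return queries.findAllWithBCount(pageable);
        }
    }

    public static void main(String[] args) {
        // Fake implementation that only reports which method was called.
        CountQueries<Integer, String> fake = new CountQueries<Integer, String>() {
            public String findAllWithBCount(Integer page) { return "unsorted page " + page; }
            public String findAllWithBCountOrderByCountAsc(Integer page) { return "asc page " + page; }
            public String findAllWithBCountOrderByCountDesc(Integer page) { return "desc page " + page; }
        };
        System.out.println(find(fake, CountSort.DESC, 0));
    }
}
```

In a real service the bCount ordering would first have to be separated from the other sort properties of the incoming Pageable, since those can still be passed through to the repository.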
One option, which is much simpler than the original solution and which also has additional benefits, is to create a database view of aggregate data and link your entity to it by means of a @SecondaryTable or @OneToOne.
For example:
create view a_summary_view as
select
a_id as id,
count(*) as b_count,
sum(value) as b_total,
max(some_date) as last_b_date
from b
group by a_id
Using @SecondaryTable
@Entity
@Table
@SecondaryTable(name = "a_summary_view",
    pkJoinColumns = {@PrimaryKeyJoinColumn(name = "id", referencedColumnName= "id")})
public class A{
    @Column(table = "a_summary_view")
    private Integer bCount;
    @Column(table = "a_summary_view")
    private BigDecimal bTotal;
    @Column(table = "a_summary_view")
    private Date lastBDate;
}
You can now sort, filter, query etc. purely with reference to entity A.
As an additional advantage you have within your domain model data that may be expensive to compute in-memory e.g. the total value of all orders for a customer without having to load all orders or revert to a separate query.
Thank you @Alan Hay, this solution worked fine for me. I just had to set the foreignKey attribute of the @SecondaryTable annotation and everything worked (otherwise Spring Boot tried to add a foreign key constraint to the id, which raises an error for an SQL view).
Result:
@SecondaryTable(name = "product_view",
    pkJoinColumns = {@PrimaryKeyJoinColumn(name = "id", referencedColumnName = "id")},
    foreignKey = @javax.persistence.ForeignKey(ConstraintMode.NO_CONSTRAINT))
I don't know much about Spring Data but for JPQL, to sort the objects by size of associated collection, you can use the query
Select a from A a order by a.bes.size desc
You can use the name of an attribute found in the select clause as a sort property:
@Query(value = "select a, count(b) as besCount from A a join a.bes b group by a", countQuery = "select count(a) from A a")
Page<Tuple> findAllWithBesCount(Pageable pageable);
You can now sort on the property besCount:
findAllWithBesCount(PageRequest.of(1, 10, Sort.Direction.ASC, "besCount"));
I used a native query to sort by the number of records in another table; pagination works.
@Query(value = "SELECT * FROM posts where posts.is_active = 1 and posts.moderation_status = 'ACCEPTED' " +
    "group by posts.id order by (SELECT count(post_id) FROM post_comments where post_id = posts.id) desc",
    // the count query applies the same filter so that the page totals match
    countQuery = "SELECT count(*) FROM posts where posts.is_active = 1 and posts.moderation_status = 'ACCEPTED'",
    nativeQuery = true)
Page<Post> findPostsWithPagination(Pageable pageable);
For Spring Boot v2.6.6, the accepted answer does not work if you need to use Pageable with a field on the child side, especially when using @ManyToOne.
Regarding the accepted answer:
You can return a new object from a static query method, but the query then has to include order by count(b.id).
Also, order by bCount does not work.
Please use @AlanHay's solution. It works, but you must not use a primitive field, and you must change the foreign key constraint. For instance, replace long with Long, because:
When saving a new entity, Hibernate thinks a record has to be written to the secondary table with a value of zero (if you use a primitive type).
Otherwise you will get an exception:
Caused by: org.postgresql.util.PSQLException: ERROR: cannot insert into view "....view"
Here is the example:
@Entity
@Table(name = "...")
@SecondaryTable(name = "a_summary_view",
    pkJoinColumns = {@PrimaryKeyJoinColumn(name = "id",
        referencedColumnName= "id")},
    foreignKey = @javax.persistence.ForeignKey(name = "none"))
public class UserEntity {
    @Id
    private String id;
    @NotEmpty
    private String password;
    @Column(table = "a_summary_view",
        name = "b_count")
    private Integer bCount;
}
