JPA/Hibernate query: load eagerly with subselect - java

Situation: "Parent" entity has multiple "Child" entities (@OneToMany, lazily fetched) in a bidirectional relationship. No foreign key ("Child#parentId") field on entity.
Goal: Avoid the N+1 problem by retrieving a fully loaded Parent collection using sub-selects. If I understand subselect fetching correctly, this is my goal (2 resulting SQL queries):
select * from Parent ...;
select * from Child where parent_id in ...;
Question 1: What is the best practice to achieve this? Could you provide examples in both JPQL/HQL and Criteria?
Question 2 (bonus): Can the API split the second query into batches, e.g. limit batches to 500? If the 1st query loads 1000 Parents, query 2a loads Children for the first 500 Parents and 2b loads them for the next 500.
I have tried:
Both result in SQL JOINs, it seems that I cannot use Child's foreign key without JOIN.
// 2nd query:
criteria
.createAlias("parent", "p")
.add(Property.forName("p.id")
.in(parentCriteria.setProjection(Projections.property("id"))))
.list();
// 2nd query (manual):
criteria
.createAlias("parent", "p")
.add(Property.forName("p.id").in(parentIdList))
.list();
Update (2015-04-05)
I checked that it indeed works in EclipseLink via hints:
query.setHint("eclipselink.batch.type", "EXISTS");
This link http://blog.ringerc.id.au/2012/06/jpa2-is-very-inflexible-with-eagerlazy.html suggests that this is not possible via Hibernate and recommends manual fetching. However, I cannot understand how to achieve it via HQL or Criteria, specifically how to get the child.parent_id column, which is not mapped on the entity and exists only in the database. That is, avoiding the JOIN that would result from child.parent.id.

To avoid N+1 queries you can annotate the relationship with
@BatchFetch(BatchFetchType.JOIN) //in eclipselink or
@BatchSize //in hibernate.
Inside queries, you can add fetch to join clause:
select p from Parent p join fetch p.children c where ...
You can also add query hints
query.setHint("eclipselink.batch", "p.children");
Or use EntityGraphs.
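For the bonus question about limiting the second query to batches of 500, one manual option is to split the list of loaded parent ids into chunks and run one IN-list query per chunk. The sketch below covers only the chunking step in plain Java; the class name IdBatches and the method partition are hypothetical helpers, not part of JPA or Hibernate, and the query that would consume each batch is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits parent ids into fixed-size chunks for IN-list queries. */
final class IdBatches {

    private IdBatches() {}

    /** Returns consecutive sublists of at most batchSize elements (copies, not views). */
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in for the ids returned by the first (Parent) query.
        List<Long> parentIds = new ArrayList<>();
        for (long id = 1; id <= 1000; id++) {
            parentIds.add(id);
        }
        // Each batch would be bound as the parameter of one child query (not shown).
        for (List<Long> batch : partition(parentIds, 500)) {
            System.out.println("would load children for " + batch.size() + " parents");
        }
    }
}
```

Keeping the chunk size as a parameter makes it easy to stay under database limits on IN-list length, which differ between vendors.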

Related

Criteria Builder JOIN on unreferenced table

Is it possible to perform a join with CriteriaBuilder on a table that is not referenced by the selected entity? Since CriteriaBuilder.join() expects as parameter the attribute name, it seems like it won't work.
To be a bit clearer, the original query looks like this:
select Vehicle v left join VehicleStatus vs on v.id = vs.vehicleId...
Vehicle does not define a relationship to VehicleStatus. And changes to the database are currently undesired though possible if needed.
Currently the code I have
final Join<Vehicle, VehicleStatus> vs = vehicle.join("vs", JoinType.LEFT);
vs.on(cb.equal(vs.get("vehicleId"), vehicle.get("id")));
fails with java.lang.IllegalArgumentException: Unable to locate Attribute with the the given name [vs] on this ManagedType
No, you need a mapped association to create a join clause with the CriteriaBuilder.
With Hibernate, you can join 2 unassociated entities in a JPQL query. The syntax is almost identical to SQL. But it's a Hibernate-specific feature and not part of the JPA standard.

Jpa + Hibernate join fetch returning inconsistent data

We are executing via JPA + Hibernate 4 a query which is returning some inconsistent data.
We have one "parent" table:
PARENT
id *
req_num
active
creation_date
and one "child" table:
CHILD
id *
type
name
email
One parent can have many children, and this is mapped in the database using another table:
PARENT_CHILDS
parent_id (FK to PARENT)
child_id (FK to CHILD)
child_order
In Java, our Parent class has a @OneToMany annotated List named childs. Both of them are annotated with @Entity.
We're using org.hibernate.cfg.ImprovedNamingStrategy as our naming strategy.
The query we're executing is:
select parent from Parent parent join fetch parent.childs child where child.type IN ('01', '02') and child.email = 'mail@mail.com' and parent.active = 1 and parent.reqNum != 'testReqNum'
This is translated to the next plain SQL query (we're seeing this using show_sql=true property):
select parent0_.id as id2_, parent0_.active as active2_, parent0_.creation_date as creation2_
from parent parent0_
inner join parent_childs childs1_ on parent0_.id=childs1_.Application_id
inner join child child2_ on childs1_.child_id=child2_.id
where (child2_.type in (? , ?)) and child2_.email=? and parent0_.active=? and parent0_.req_num<>?
Our parent table has only 2 parents which satisfy the condition "parent0_.active=? and parent0_.req_num<>?". Each of them has two children, and only one child of each satisfies the condition "(child2_.type in (? , ?)) and child2_.email=?".
So, when we execute the SQL query directly to our Oracle database, it returns only 2 rows (the 2 parents with only one child each).
However, in Java we are getting some weird results, which vary depending on whether we use "inner join", "join" or "join fetch". For instance, we're receiving a list of three parents: one with one child, another one with the other child, which does not satisfy the mail condition, and a last one with both of its children.
We're wondering why we are experiencing this behaviour and, more importantly, how we could solve it.
Thanks. Kind regards.
As I mentioned in my comment, a many to many relationship is not necessary to implement your solution. Additionally, your link table is not properly a link table because it has its own id, so must be considered as another entity.
If you can't change the DB schema, you could implement relationships among your entities this way:
a OneToMany relationship from Parent to ParentChild
a OneToOne relationship from ParentChild to Child

criteria query on entities not having joins

Is there any way to write criteria query on entities not having explicit joins ? By explicit join I mean that 2 tables in database have no foreign key relationship but some columns need to be fetched from both the tables so joins are required in query. I know that queries having join can be written with 'in' clause and criteria queries can be written with "In" criteria. I have written HQL for this case but please tell me how to write criteria query for this case.
Thanks in advance
In this case, a cross join would be the solution, but that is possible ONLY with HQL. See the documentation (short excerpt):
16.2. The from clause
Multiple classes can appear, resulting in a cartesian product or "cross" join.
from Formula, Parameter
from Formula as form, Parameter as param
And, also, we can filter on any of these two Entities inside of the WHERE clause, to narrow the cartesian product...

A set of questions on Hibernate querying

Please help me with these Hibernate querying issues.
Consider the following structure:
@Entity
class Manager {
    @OneToMany
    List<Project> projects;
}
0) there are 2 possible ways of dynamic fetching in HQL:
select m from Manager m join m.projects
from Manager m join fetch m.projects
In my setup the second one always returns a cartesian-product result with the wrong number of objects in the list, while the first one always returns the correct number of entities. But the SQL queries look the same. Does this mean that the "select" clause removes redundant objects from the list in memory? In that case it is strange to see a book advise using select distinct ... to get rid of redundant entities, when "select" already does the job. If this assumption is wrong, then why do these two queries return different results?
1) If I use dynamic fetching by either of the two methods above, I see the classic n+1 select problem in my Hibernate SQL log. Indeed, FetchMode annotations (subselect or join) have no effect when fetching dynamically. Is there really no way to solve the n+1 problem in this particular case?
2) It looks like the Hibernate Criteria API does not support generics. Am I right? Do I have to use the JPA Criteria API instead?
3) Is it possible to write an HQL query with an entity name parameter inside? For example "from :myEntityParam p where p.id=1" and call setParameter("myEntityParam", MyClass.class) after this. What I actually want is a generic HQL query, so that one generic DAO can replace multiple non-generic ones.
0) I always use a select clause, because it allows telling what you want to select, and is mandatory in JPQL anyway. If you want to select the managers with their projects, use
select distinct m from Manager m left join fetch m.projects
If you don't use the distinct keyword, the list will contain n instances of each manager (n being the number of projects of the manager): Hibernate returns as many elements as there are rows in the result set.
1) If you want to avoid the n + 1 problem, fetch the other association in the same query:
select distinct m from Manager m
left join fetch m.projects
left join fetch m.boss
You may also configure batch fetching to load 10 bosses (for example) at a time when the first boss is accessed. Search for "batch fetching" in the reference doc.
2) The whole Hibernate API is not generified. It's been made on JDK 1.4, before generics. That doesn't mean it isn't useful.
3) No. HQL query parameters are, in the end, prepared statement parameters. You must use String concatenation to do this.
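To make point 0) of the answer concrete, here is a small plain-Java simulation of why a fetch join yields repeated managers and what de-duplication does to the list. It is only a sketch: Manager here is a hypothetical stand-in class, not a mapped entity, and the code imitates the shape of the result rather than calling Hibernate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

final class JoinRowDedup {

    private JoinRowDedup() {}

    /** Hypothetical stand-in for the Manager entity; equality is by reference. */
    static final class Manager {
        final String name;
        final int projectCount;

        Manager(String name, int projectCount) {
            this.name = name;
            this.projectCount = projectCount;
        }
    }

    /**
     * One list element per joined row: a manager with n projects appears n times.
     * As with a left join, a manager without projects still produces one row.
     */
    static List<Manager> rowsOfJoin(List<Manager> managers) {
        List<Manager> rows = new ArrayList<>();
        for (Manager m : managers) {
            int rowCount = Math.max(1, m.projectCount);
            for (int i = 0; i < rowCount; i++) {
                rows.add(m);
            }
        }
        return rows;
    }

    /** Drops repeated references while keeping first-seen order. */
    static <T> List<T> distinctInOrder(List<T> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<Manager> managers = Arrays.asList(new Manager("ann", 3), new Manager("bob", 2));
        List<Manager> rows = rowsOfJoin(managers);
        System.out.println("rows without distinct: " + rows.size());
        System.out.println("rows with distinct: " + distinctInOrder(rows).size());
    }
}
```

The repeated elements are the same object reference, which is why collapsing them loses no data: each manager's projects collection is already fully populated.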

Hibernate uses initial WHERE clause in subsequent queries

In using Hibernate's JPA implementation, I noticed an interesting optimization behavior. Within the same transaction, the initial JPA query's WHERE clause is used for subsequent queries involving the results of the initial query.
For example, person has lastName and a set of owned books.
// (1) get person by last name
Query q = entityManager.createQuery("SELECT p FROM Person p WHERE p.lastName = :lastName");
q.setParameter("lastName", "Smith");
List<Person> persons = q.getResultList();
// (2) get books owned by some arbitrary person in persons
Person person = persons.get(n);
Collection<Book> books = person.books;
(1) translates to the SQL:
SELECT ... FROM Person WHERE lastName = 'Smith'
When (2) is run and accesses Person's books, it generates the SQL:
SELECT ... FROM Person_Book book0_ WHERE book0_.personId IN (SELECT ... FROM ... WHERE lastName = 'Smith')
Somehow the WHERE clause from (1) is remembered and used in subsequent queries (2) involving the retrieved person. What is this behavior in Hibernate called and how can I configure it?
Follow up: I'm using subselect on person's books. This explains the behavior that I'm seeing.
Extracted from this link:
The last form of fetching I want to cover is subselect fetching. Subselect fetching is very similar to batch size controlled fetching, which I just described, but takes the 'numerical complications' out of the equation. Subselect fetching is actually a different type of fetching strategy that is applied to collection style associations. Unlike join style fetching, however, subselect fetching is still compatible with lazy associations. The difference is that subselect fetching just gets "the whole shootin' match" as a co-worker of mine would say, rather than just a batch. In other words, it uses subselect execution to pass the ID set of the main entity set into the select off of the association table:
select * from owner
select * from pet where owner_id in (select id from owner)
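Because the second query returns the pets of every owner in a single result set, the rows then have to be distributed back to their owners. The following plain-Java sketch shows one way that stitching step could look; the classes SubselectStitch and Pet are hypothetical, and this is an illustration of the idea rather than a description of Hibernate's internals.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubselectStitch {

    private SubselectStitch() {}

    /** Hypothetical row of the second query: a pet plus its owner_id column. */
    static final class Pet {
        final long id;
        final long ownerId;

        Pet(long id, long ownerId) {
            this.id = id;
            this.ownerId = ownerId;
        }
    }

    /**
     * Groups the pets returned by the second query under the owners returned
     * by the first one. Owners without pets end up with an empty list.
     */
    static Map<Long, List<Pet>> groupByOwner(Collection<Long> ownerIds, List<Pet> pets) {
        Map<Long, List<Pet>> byOwner = new LinkedHashMap<>();
        for (Long ownerId : ownerIds) {
            byOwner.put(ownerId, new ArrayList<>());
        }
        for (Pet pet : pets) {
            List<Pet> bucket = byOwner.get(pet.ownerId);
            if (bucket != null) { // ignore pets whose owner was not loaded
                bucket.add(pet);
            }
        }
        return byOwner;
    }

    public static void main(String[] args) {
        List<Long> ownerIds = List.of(1L, 2L, 3L);
        List<Pet> pets = List.of(new Pet(10, 1), new Pet(11, 1), new Pet(12, 3));
        groupByOwner(ownerIds, pets).forEach((ownerId, owned) ->
                System.out.println("owner " + ownerId + " has " + owned.size() + " pets"));
    }
}
```

However many owners the first query returned, the whole load costs two round trips, which is the point of the strategy described above.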
