Avoid N+1 select with native sqlQuery? - java

Here's what I have :
Entity A -> oneToMany -> Entity B -> manyToOne -> Entity C
And because I have to do an inner join without foreign keys between A and another entity X, I have to use createSqlQuery and not createQuery. (obviously I can't change the database)
So, all I was able to do is a nice 2N+1 select. (with fetch=EAGER or by hand, it's the same).
Does someone have any idea?
EDIT: with a @BatchSize I reduced the number of selects from A to B. I now have an N+2 select.
EDIT 2: I can't use the inner join (with the comma) because the database is an old DB2, and it crashes.

To avoid N+1, you can use the following annotation on your mapped field:
@Fetch(FetchMode.JOIN)
Hope this will help.

Sorry for the vague answer, I really never experienced this. I would try to approach this problem using ResultTransformers:
http://docs.jboss.org/hibernate/core/3.6/javadocs/org/hibernate/transform/ResultTransformer.html
Unfortunately, there's little documentation about it, so, your best option is to look at the test suite and see how it's used.

You can use something like this, but I'm not sure how it would work with a complex query:
s.createSQLQuery(
"SELECT {a.*}, {b.*}, {c.*} " +
"FROM X x JOIN A a ON ... JOIN B b ON ... JOIN C c ON ...")
.addEntity("a", A.class)
.addJoin("b", "a.b")
.addJoin("c", "b.c")
See also:
18.1.3. Handling associations and collections
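To see why a single joined query removes the extra selects, here is a plain-Java sketch (no Hibernate involved) of what happens to the flat rows such a query returns: every row carries one A, one B and one C, and the rows are folded into an A -> B -> C structure in one pass. The names Row, assemble and sampleRows, and the row layout, are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class JoinedRowsSketch {

    /** One row of a hypothetical "A JOIN B JOIN C" result set. */
    static final class Row {
        final int aId;
        final int bId;
        final String cName;

        Row(int aId, int bId, String cName) {
            this.aId = aId;
            this.bId = bId;
            this.cName = cName;
        }
    }

    /** Folds the flat rows into A id -> list of "bId:cName", in one pass. */
    static Map<Integer, List<String>> assemble(List<Row> rows) {
        Map<Integer, List<String>> byA = new LinkedHashMap<>();
        for (Row row : rows) {
            byA.computeIfAbsent(row.aId, id -> new ArrayList<>())
               .add(row.bId + ":" + row.cName);
        }
        return byA;
    }

    /** Made-up data: A 1 has two Bs, A 2 has one. */
    static List<Row> sampleRows() {
        return Arrays.asList(
            new Row(1, 10, "c1"),
            new Row(1, 11, "c2"),
            new Row(2, 20, "c1"));
    }
}
```

The point is that everything needed to build the graph is already in the one result set, so no follow-up select per A or per B is required.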


JOOQ: Dynamic join conditions

I would like to create conditions from this select in JOOQ. Because in my case I want to declare a dynamic query and check if TABLE_C.FIELDC contains "foo" only when I need...
Example:
create.select().from(TABLE_A).join(TABLE_B).onKey(Keys.FK_TABLEA_TABLEB)
.join(TABLE_C).onKey(Keys.FK_TABLEB_TABLEC)
.where(TABLE_C.FIELDC.containsIgnoreCase("foo"));
to:
SelectFinalStep select = create.select().from(TABLEA);
if (isFooSearched) {
query.addCondition( <JOIN> and <CONTAINS> like first example)
}
How can I do this?
There are several ways to solve this:
Using implicit joins
In relatively simple cases, when the optional join follows a to-one relationship, you may be able to use an implicit join (if you're using the code generator):
create.select()
.from(TABLE_A)
.join(TABLE_B).onKey(Keys.FK_TABLEA_TABLEB)
.where(isFooSearched
? TABLE_B.tableC().FIELDC.containsIgnoreCase("foo")
: noCondition())
.fetch();
Using SEMI JOIN instead of INNER JOIN, which makes dynamic SQL much easier
create.select()
.from(TABLE_A)
.where(
isFooSearched
? TABLE_A.TABLE_B_ID.in(
select(TABLE_B.ID)
.from(TABLE_B)
.join(TABLE_C).onKey(FK_TABLEB_TABLEC)
.where(TABLE_C.FIELDC.containsIgnoreCase("foo"))
)
: trueCondition())
.fetch();
Note that a semi join is also more formally correct in this case than an inner join, as you will not get any duplicate rows on TABLE_A for any matches in to-many relationships (removing them with DISTINCT might be wrong and certainly is inefficient).
Side-note: Not all databases recognise semi-joins in EXISTS or IN syntax, and may thus not optimally run this statement, compared to a JOIN based solution.
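The difference between the two join styles can be shown without any database. The following is a plain-Java sketch, not jOOQ code: the Child class and the two methods are hypothetical stand-ins for a parent table and a to-many child table, and the "foo" filter mirrors containsIgnoreCase("foo") from the queries above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SemiJoinSketch {

    /** A child row pointing at its parent; stands in for the to-many side. */
    static final class Child {
        final int parentId;
        final String field;

        Child(int parentId, String field) {
            this.parentId = parentId;
            this.field = field;
        }
    }

    static boolean containsIgnoreCase(String haystack, String needle) {
        return haystack.toLowerCase(Locale.ROOT).contains(needle.toLowerCase(Locale.ROOT));
    }

    /** INNER JOIN semantics: one output entry per matching (parent, child) pair. */
    static List<Integer> innerJoin(List<Integer> parents, List<Child> children, String needle) {
        List<Integer> out = new ArrayList<>();
        for (int parent : parents) {
            for (Child child : children) {
                if (child.parentId == parent && containsIgnoreCase(child.field, needle)) {
                    out.add(parent);
                }
            }
        }
        return out;
    }

    /** SEMI JOIN semantics: each parent appears at most once, if any child matches. */
    static List<Integer> semiJoin(List<Integer> parents, List<Child> children, String needle) {
        List<Integer> out = new ArrayList<>();
        for (int parent : parents) {
            for (Child child : children) {
                if (child.parentId == parent && containsIgnoreCase(child.field, needle)) {
                    out.add(parent);
                    break; // one match is enough; stop scanning this parent's children
                }
            }
        }
        return out;
    }
}
```

With a parent that has two matching children, innerJoin lists that parent twice while semiJoin lists it once, which is why the join variant needs DISTINCT and the semi join does not.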
Using INNER JOIN as you asked for
// I'm assuming DISTINCT is required here, as you
// - are not interested in TABLE_B and TABLE_C results (semi join semantics)
// - do not want any duplicate TABLE_A values
create.selectDistinct(TABLE_A.fields())
.from(
isFooSearched
? TABLE_A
.join(TABLE_B).onKey(FK_TABLEA_TABLEB)
.join(TABLE_C).onKey(FK_TABLEB_TABLEC)
: TABLE_A)
.where(
isFooSearched
? TABLE_C.FIELDC.containsIgnoreCase("foo")
: trueCondition())
.fetch();
I've made a few assumptions here, including that DISTINCT could be correct on the joined variant of your query. It is probably hurting you on your "default" query variant, though, so shoehorning this into a single dynamic query might be overkill.
Thus...
Using two different queries
For my taste, the two queries are simple enough to allow for some duplication and simply run two different queries depending on the flag:
if (isFooSearched)
create.select().from(TABLE_A) /* joins or semi joins here */ .fetch();
else
create.select().from(TABLE_A).fetch();
Side note
All solutions are assuming you have these static imports in your code:
import static org.jooq.impl.DSL.*;

SELECT e From Employee e -- why a redundant "e"?

Sample query:
SELECT e FROM Employee e WHERE SUBSTRING(e.name, 3) = 'Mac'
In this syntax, it seems intuitive to say SELECT e, that e is now declared or defined(?). However, isn't the second e: FROM Employee e redundant?
This is a throwback or similarity to SQL SELECT syntax?
The second e is an identification variable. It actually defines e by telling the JPQL parser that you are using e somewhere else in your query, and that it refers to the Employee entity. The first occurrence of e is where you use that e.
So, it's not redundant. If you leave out the first one, the JPQL parser doesn't know what to select. If you leave it out the second time, you're selecting something that the JPQL parser doesn't know.
JPQL syntax is a little different from normal SQL syntax. In your sample, the first e plays the role of * in normal SQL, so it is not redundant. But if you use JPA 2.x, a criteria query is a better choice than JPQL.
I came to this question with the same concerns as you. I noticed that when you select * the results come back in columns, and when you select e there's only one column, with some kind of serialized entities as results. So I found a pretty good explanation of this at http://www.thejavageek.com/2014/03/17/jpa-select-clause/
SELECT e FROM Employee e
This is quite similar to SQL, the difference is:
This query does not return a set of records from columns, instead it
returns an entity.
This entity is aliased to e , it is called as
identification variable.
This identification variable is tied to
Employee type and it means result will be an entity of Employee
type.
IMO there is no need to use the second e in this case (yes it is redundant) if you have single query like this. It makes sense when you join two tables and if those two tables have common column names then you would select each column using that table alias 'e'
use "e." (which columns you have)
SELECT e.name, e.e FROM Employee e WHERE SUBSTRING(e.name, 3) = 'Mac'
DISCLAIMER: This only works with Hibernate as JPA implementation since this information corresponds to HQL.
In this simple case you don't have to use the e (which is nothing more than an alias). Since you are selecting the complete entity, you even don't have to write the select e part. So you can write:
FROM Employee WHERE SUBSTRING(name, 3) = 'Mac'
Explanation:
In the from part, you specify for which Entities you are looking for. The e behind the Employee is just an alias for Employee which you can use to address the whole object (the select part) or attributes from it. In simple queries you don't need it, but as soon you have a join in your query, it's always a good idea to use an alias.
The select part of the query is for selecting which attributes of an entity you want to get back. If you omit the select part or just specify the alias (e in this case), JPA gives back the whole entity. In SQL this usually does not work (at least for Oracle).
To answer your question in the comment: you can use the alias e in the select part of the query. But in order to do so, you must teach JPA what this e refers to. And this is what the from Employee e part is doing.
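A rough Java analogy may help here, offered only as an illustration and not as how a JPA provider works internally: FROM Employee e behaves like the loop header that declares the variable, and SELECT e like the statement that uses it. The Employee class and the select method below are made up for this sketch. JPQL string positions start at 1, so SUBSTRING(e.name, 3) corresponds to substring(2) in Java.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentificationVariableSketch {

    static final class Employee {
        final String name;

        Employee(String name) {
            this.name = name;
        }
    }

    /** In-memory analogue of: SELECT e FROM Employee e WHERE SUBSTRING(e.name, 3) = 'Mac' */
    static List<Employee> select(List<Employee> employees) {
        List<Employee> result = new ArrayList<>();
        for (Employee e : employees) {                  // FROM Employee e  (declares e)
            // JPQL position 3 is Java index 2; guard against names that are too short.
            if (e.name.length() >= 2 && e.name.substring(2).equals("Mac")) { // WHERE ...
                result.add(e);                          // SELECT e  (uses e)
            }
        }
        return result;
    }
}
```

Dropping the loop variable would leave nothing for the body to refer to, which is the same reason the second e in the query is not redundant.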

A set of questions on Hibernate querying

Please help me with these Hibernate querying issues.
Consider the following structure:
@Entity
class Manager {
@OneToMany
List<Project> projects;
}
0) there are 2 possible ways of dynamic fetching in HQL:
select m from Manager m join m.projects
from Manager m join fetch m.projects
In my setup the second one always returns a Cartesian product with the wrong number of objects in the list, while the first one always returns the correct number of entities. But the SQL queries look the same. Does this mean that the "select" clause removes redundant objects from the list in memory? In that case it's strange to see advice in a book to use select distinct ... to get rid of redundant entities, when "select" already does the job. If this is a wrong assumption, then why do these 2 queries return different results?
1) If I utilize dynamic fetching by one of the 2 methods above, I see the classic n+1 select problem in my Hibernate SQL log. Indeed, FetchMode annotations (subselect or join) have no effect while fetching dynamically. Can I really not solve the n+1 problem in this particular case?
2) It looks like the Hibernate Criteria API does not support generics. Am I right? Do I have to use the JPA Criteria API instead?
3) Is it possible to write an HQL query with an entity name parameter inside? For example "from :myEntityParam p where p.id=1", followed by a call to setParameter("myEntityParam", MyClass.class). What I actually want is a generic HQL query so that I can replace multiple non-generic DAOs with one generic one.
0) I always use a select clause, because it allows telling what you want to select, and is mandatory in JPQL anyway. If you want to select the managers with their projects, use
select distinct m from Manager m left join fetch m.projects
If you don't use the distinct keyword, the list will contain n instances of each manager (n being the number of projects of the manager): Hibernate returns as many elements as there are rows in the result set.
1) If you want to avoid the n + 1 problem, fetch the other association in the same query:
select distinct m from Manager m
left join fetch m.projects
left join fetch m.boss
You may also configure batch fetching to load 10 bosses (for example) at a time when the first boss is accessed. Search for "batch fetching" in the reference doc.
2) The whole Hibernate API is not generified. It's been made on JDK 1.4, before generics. That doesn't mean it isn't useful.
3) No. HQL query parameters are, in the end, prepared statement parameters. You must use String concatenation to do this.
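Point 0) above can be pictured with plain collections. This is only a sketch of the effect, not Hibernate's actual code: a join returns one list element per result row, so the same Manager instance shows up once per project, and distinct collapses those repeats. The Manager class and both helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class DistinctRootsSketch {

    static final class Manager {
        final String name;
        final List<String> projects;

        Manager(String name, List<String> projects) {
            this.name = name;
            this.projects = projects;
        }
    }

    /**
     * One list element per joined row: a manager with n projects appears n times.
     * (As with an inner join, a manager without projects produces no row.)
     */
    static List<Manager> rowsOf(List<Manager> managers) {
        List<Manager> rows = new ArrayList<>();
        for (Manager manager : managers) {
            for (int i = 0; i < manager.projects.size(); i++) {
                rows.add(manager);
            }
        }
        return rows;
    }

    /** Removes repeated instances while keeping first-seen order. */
    static List<Manager> distinct(List<Manager> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }
}
```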

Multiple @ManyToMany sets from one join table

I'm mapping a proprietary database to Hibernate for use with Spring. In it, there are a couple of jointables that, for entity A and entity B have the following schema:
CREATE TABLE AjoinB (
idA int not null,
idB int not null,
groupEnum enum ('groupC', 'groupD', 'groupE'),
primary key(idA, idB, groupEnum)
);
As you can see, this indicates that there can be multiple A-B relationships that put them in different groups. I'd like to end up with, first line for entity A and second for entity B, the following sets
Set<B> BforGroupC, BforGroupD, BforGroupE;
Set<A> AforGroupC, AforGroupD, AforGroupE;
So far, I've only managed to put them in one set and disregard the groupEnum relationship attribute:
@ManyToMany(targetEntity=B.class, cascade={ CascadeType.PERSIST, CascadeType.MERGE } )
@JoinTable(name="AjoinB", joinColumns=@JoinColumn(name="idA"), inverseJoinColumns=@JoinColumn(name="idB") )
private Set<B> BforAllGroups;
and
@ManyToMany( mappedBy = "BforAllGroups", targetEntity = A.class )
private Set<A> AforAllGroups;
How can I make multiple sets where they belong either in groupC, groupD or groupE?
Cheers
Nik
If you're considering doing this, don't. Tables are cheap nowadays, what with the economy and all, so just create one per association; it'll be so much easier.
If you're bound by a legacy database and you can't change the structure of that table I would
Consider skaffman's solution first (+1, btw). Depending on your target database you may be able to write a trigger for your views that would insert adequate "discriminator" value.
If the above isn't possible in your DB, another solution is to use custom SQL for CRUD operations for your collections. Keep in mind that this will NOT work (e.g. your "discriminator value" won't get applied) for complex HQL queries involving your association as part of condition. You can also mix / match this with above - e.g. use views and use custom SQL for insert / delete.
If both of the above fail, go with "association as a separate entity" as suggested by framer8. That's going to be rather ugly (since we're assuming here you can't change your tables) due to composite keys and all extraneous code. It may, in fact, be impossible if any of your associations allows duplicates.
To my knowledge, Hibernate cannot use such a "discriminator" column in the way that you want. Hibernate requires a join table for each of them.
Perhaps you might be able to define additional views on the table, showing each of the groupings?
I think the advice any time you need to access a field in a link table is to make the link table an object and a Hibernate entity in its own right. A would have a set of AtoB objects and AtoB would have a set of B objects. I have a similar situation where the link table has a user associated with the link.
select joinTable.b from A a
left join a.AtoB joinTable
where joinTable.group = 'C'
It's not as elegant as having an implicit join done by hibernate, but it does give you the control you need.
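Once the link table is an entity of its own, splitting the Bs of one A into per-group sets is ordinary collection work. The sketch below uses plain Java rather than Hibernate entities; Link, Group and bIdsByGroup are invented names, and the three enum constants stand for the groupC, groupD and groupE values of the groupEnum column.

```java
import java.util.EnumMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LinkTableSketch {

    enum Group { GROUP_C, GROUP_D, GROUP_E }

    /** One row of the AjoinB table, modelled as its own object. */
    static final class Link {
        final int idA;
        final int idB;
        final Group group;

        Link(int idA, int idB, Group group) {
            this.idA = idA;
            this.idB = idB;
            this.group = group;
        }
    }

    /** Collects the B ids linked to one A, keyed by group; empty groups stay empty. */
    static Map<Group, Set<Integer>> bIdsByGroup(int idA, List<Link> links) {
        Map<Group, Set<Integer>> result = new EnumMap<>(Group.class);
        for (Group group : Group.values()) {
            result.put(group, new LinkedHashSet<>());
        }
        for (Link link : links) {
            if (link.idA == idA) {
                result.get(link.group).add(link.idB);
            }
        }
        return result;
    }
}
```

The same B can land in several sets, which matches the primary key (idA, idB, groupEnum) of the join table.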

OneToOne relationship with shared primary key generates n+1 selects; any workaround?

Imagine 2 tables in a relational database, e.g. Person and Billing. There is a (non-mandatory) OneToOne association defined between these entities, and they share the Person primary key (i.e. PERSON_ID is defined in both Person and Billing, and it is a foreign key in the latter).
When doing a select on Person via a named query such as:
from Person p where p.id = :id
Hibernate/JPA generates two select queries, one on the Person table and another on the Billing table.
The example above is very simple and would not cause any performance issues, given the query returns only one result. Now, imagine that Person has n OneToOne relationships (all non-mandatory) with other entities (all sharing the Person primary key).
Correct me if I'm wrong, but running a select query on Person, returning r rows, would result in (n+1)*r selects being generated by Hibernate, even if the associations are lazy.
Is there a workaround for this potential performance disaster (other than not using a shared primary key at all)? Thank you for all your ideas.
Imagine 2 tables in a relational database, e.g. Person and Billing. There is a (non-mandatory) OneToOne association defined between these entities,
Lazy fetching is conceptually not possible for a non-mandatory OneToOne by default: Hibernate has to hit the database to know whether the association is null or not. More details from this old wiki page:
Some explanations on lazy loading (one-to-one)
[...]
Now consider our class B has
one-to-one association to C
class B {
private C cee;
public C getCee() {
return cee;
}
public void setCee(C cee) {
this.cee = cee;
}
}
class C {
// Not important really
}
Right after loading B, you may call
getCee() to obtain C. But look,
getCee() is a method of YOUR class
and Hibernate has no control over it.
Hibernate does not know when someone
is going to call getCee(). That
means Hibernate must put an
appropriate value into "cee"
property at the moment it loads B from
database. If proxy is enabled for
C, Hibernate can put a C-proxy
object which is not loaded yet, but
will be loaded when someone uses it.
This gives lazy loading for
one-to-one.
But now imagine your B object may or
may not have associated C
(constrained="false"). What should
getCee() return when specific B
does not have C? Null. But remember,
Hibernate must set correct value of
"cee" at the moment it set B
(because it does not know when someone
will call getCee()). Proxy does not
help here because proxy itself is
already a non-null object.
So, to sum up: if your B->C mapping
is mandatory (constrained=true),
Hibernate will use proxy for C
resulting in lazy initialization. But
if you allow B without C, Hibernate
just HAS TO check presence of C at the
moment it loads B. But a SELECT to
check presence is just inefficient
because the same SELECT may not just
check presence, but load entire
object. So lazy loading goes away.
So, not possible... by default.
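The argument in the quoted wiki text can be replayed in a few lines of plain Java. This is a toy model, not Hibernate: the class below fakes a C table with a map and counts "selects", and all names in it are made up. A mandatory association can be represented by a proxy without touching the table, whereas deciding between null and non-null for an optional one costs a lookup straight away.

```java
import java.util.HashMap;
import java.util.Map;

final class OptionalOneToOneSketch {

    /** Stand-in for the C table: primary key -> payload. */
    private final Map<Integer, String> cTable = new HashMap<>();

    /** Number of simulated SELECTs against the C table. */
    int selectCount = 0;

    /** Lazy stand-in for C; it exists as an object before any SELECT runs. */
    final class CProxy {
        private final int id;
        private String payload;
        private boolean loaded;

        CProxy(int id) {
            this.id = id;
        }

        String payload() {
            if (!loaded) {
                payload = selectC(id); // first access triggers the SELECT
                loaded = true;
            }
            return payload;
        }
    }

    void insertC(int id, String payload) {
        cTable.put(id, payload);
    }

    private String selectC(int id) {
        selectCount++;
        return cTable.get(id);
    }

    /** Mandatory association: a proxy is always a valid answer, so no SELECT is needed. */
    CProxy loadMandatory(int id) {
        return new CProxy(id);
    }

    /** Optional association: null vs non-null must be decided now, which costs a SELECT. */
    String loadOptional(int id) {
        return selectC(id);
    }
}
```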
Is there a workaround for this potential performance disaster (other than not using a shared primary key at all)? Thank you for all your ideas.
The problem is not the shared primary key, with or without shared primary key, you'll get it, the problem is the nullable OneToOne.
First option: use bytecode instrumentation (see references to the documentation below) and no-proxy fetching:
@OneToOne( fetch = FetchType.LAZY )
@org.hibernate.annotations.LazyToOne(org.hibernate.annotations.LazyToOneOption.NO_PROXY)
Second option: Use a fake ManyToOne(fetch=FetchType.LAZY). That's probably the most simple solution (and to my knowledge, the recommended one). But I didn't test this with a shared PK though.
Third option: Eager load the Billing using a join fetch.
Related question
Making a OneToOne-relation lazy
References
Hibernate Reference Guide
19.1.3. Single-ended association proxies
19.1.7. Using lazy property fetching
Old Hibernate FAQ
How do I set up a 1-to-1 relationship as lazy?
Hibernate Wiki
Some explanations on lazy loading (one-to-one)
This is a common performance issue with Hibernate (just search for "Hibernate n+1"). There are three options to avoiding n+1 queries:
Batch size
Subselect
Do a LEFT JOIN in your query
These are covered in the Hibernate FAQs here and here
Stay away from hibernate's OneToOne mapping
It is very broken and dangerous. You are one minor bug away from a database corruption problem.
http://opensource.atlassian.com/projects/hibernate/browse/HHH-2128
You could try "blind-guess optimization", which is good for "n+1 select problems".
Annotate your field (or getter) like this:
@org.hibernate.annotations.BatchSize(size = 10)
java.util.Set<Billing> bills = new HashSet<Billing>();
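As a back-of-the-envelope check of what batching buys, the arithmetic can be written down directly. This is an idealised count, assuming every batch is filled up to the configured size; the actual number of statements Hibernate issues depends on its batching strategy and on which entities are in the session. The class and method names are hypothetical.

```java
final class BatchFetchSketch {

    /** One SELECT for the parents plus one per parent for its children. */
    static int selectsWithoutBatching(int parents) {
        return 1 + parents;
    }

    /** One SELECT for the parents plus one per batch of up to batchSize parents. */
    static int selectsWithBatching(int parents, int batchSize) {
        int batches = (parents + batchSize - 1) / batchSize; // ceiling division
        return 1 + batches;
    }
}
```

For 25 parents and a batch size of 10 this gives 26 selects without batching and, at best, 4 with it.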
That "n+1" problem will only occur if you specify the relationship as lazy or you explicitly indicate that you want Hibernate to run a separate query.
Hibernate can fetch the relationship to Billing with an outer join on the select of Person, obviating the n+1 problem altogether. I think it is the fetch="XXX" indication in your hbm files.
Check out A Short Primer On Fetching Strategies
Use optional=true with a one-to-one relationship like this to avoid the n+1 issue:
@OneToOne(fetch = FetchType.LAZY, optional=true)
@PrimaryKeyJoinColumn
