Please help me with these Hibernate querying issues.
Consider the following structure:
#Entity
class Manager {
#OneToMany
List<Project> projects;
}
0) there are 2 possible ways of dynamic fetching in HQL:
select m from Manager m join m.projects
from Manager m join fetch m.projects
In my setup second one always returns a result of cartesian product with wrong number of objects in a list, while the first one always returns correct number of entities in a list. But the sql queries look the same. Does this mean that "select" clause removes redundant objects from the list in-memory? In this case its strange to see an advice in a book to use select distinct ... to get rid of redundant entities, while "select" does the job. If this is a wrong assumption than why these 2 queries return different results?
If I utilize dynamic fetching by one of the 2 methods above I see a classic n+1 select problem output in my hibernate SQL log. Indeed, FetchMode annotations (subselect or join) do not have power while fetching dynamically. Do I really can't solve the n+1 problem in this particular case?
Looks like Hibernate Criteria API does not support generics. Am I right? Looks like I have to use JPA Criteria API instead?
Is it possible to write HQL query with an entity name parameter inside? For example "from :myEntityParam p where p.id=1" and call setParameter("myEntityParam", MyClass.class) after this. Actually what I want is generic HQL query to replace multiple non-generic dao's by one generic one.
0) I always use a select clause, because it allows telling what you want to select, and is mandatory in JPQL anyway. If you want to select the managers with their projects, use
select distinct m from Manager m left join fetch m.projects
If you don't use the distinct keyword, the list will contain n instances of each manager (n being the number of projects of the manager): Hibernate returns as many elements as there are rows in the result set.
1) If you want to avoid the n + 1 problem, fetch the other association in the same query:
select distinct m from Manager m
left join fetch m.projects
left join fetch m.boss
You may also configure batch fetching to load 10 bosses (for example) at a time when the first boss is accessed. Search for "batch fetching" in the reference doc.
2) The whole Hibernate API is not generified. It's been made on JDK 1.4, before generics. That doesn't mean it isn't useful.
3) No. HQL query parameters are, in the end, prepared statement parameters. You must use String concatenation to do this.
Related
I'm setting up a JPA Specification based repository implementation that utilizes jpa specifications(constructed based on RSQL filter strings) to filter the results, define result ordering and remove any duplicates via "distinct" that would otherwise be returned due to joined tables. The JPA Specification builder method joins several tables and sets the "distinct" flag:
final Join<Object, Object> rootJoinedTags = root.join("tags", JoinType.LEFT);
final Join<Object, Object> rootJoinedLocations = root.join("location", JoinType.LEFT);
...
query.distinct(true);
To allow sorting by joined table columns, I've applied the "HINT_PASS_DISTINCT_THROUGH" hint to the relevant repository method(otherwise, sorting by joined table columns returns an error along the lines of "sort column must be included in the SELECT DISTINCT query").
#QueryHints(value = {
#QueryHint(name = org.hibernate.jpa.QueryHints.HINT_PASS_DISTINCT_THROUGH, value = "false")
})
Page<SomeEntity> findAll(#Nullable Specification<SomeEntity> spec, Pageable pageable);
The arguments for said repository method are constructed as such:
final Sort sort = getSort(searchFilter);
final Specification spec = getSpecificationIfPresent(searchFilter);
final PageRequest pageRequest = PageRequest.of(searchFilter.getPageNumber(), searchFilter.getLimit(), sort);
return eventRepository.findAll(spec, pageRequest);
After those changes, filtering and sorting seem to work as expected. However, the hint seems to cause "distinct" filtering to be applied after the result page is already constructed, thus reducing the number of returned entities in the page from the configured "size" PageRequest argument, to whatever is left after the duplicates are filtered out. For example, if we'd make a PageRequest with "page=0" and "pageSize=10", then the resulting Page may return only 5 "SomeEntity" instances, although the database contains way more entries(177 entities to be exact in this case). If I remove the hint, then the returned entities number is correct again.
Question: is there a way to make the same Specification query setup work with correctly sized Pages(some other hints that might be added to have duplicate filtering performed before the Page object is constructed)? If not, then is there another approach I could use to achieve the required Specification-based filtering, with joined-column sorting and duplicate removal as with "distinct"?
PS: PostgreSQL is the database behind the application in question
The problem you are experimenting have to do with the way you are using the HINT_PASS_DISTINCT_THROUGH hint.
This hint allows you to indicate Hibernate that the DISTINCT keyword should not be used in the SELECT statement issued against the database.
You are taking advantage of this fact to allow your queries to be sorted by a field that is not included in the DISTINCT column list.
But that is not how this hint should be used.
This hint only must be used when you are sure that there will be no difference between applying or not a DISTINCT keyword to the SQL SELECT statement, because the SELECT statement already will fetch all the distinct values per se. The idea is improve the performance of the query avoiding the use of an unnecessary DISTINCT statement.
This is usually what will happen when you use the query.distinct method in you criteria queries, and you are join fetching child relationships. This great article of #VladMihalcea explain how the hint works in detail.
On the other hand, when you use paging, it will set OFFSET and LIMIT - or something similar, depending on the underlying database - in the SQL SELECT statement issued against the database, limiting to a maximum number of results your query.
As stated, if you use the HINT_PASS_DISTINCT_THROUGH hint, the SELECT statement will not contain the DISTINCT keyword and, because of your joins, it could potentially give duplicate records of your main entity. This records will be processed by Hibernate to differentiate duplicates, because you are using query.distinct, and it will in fact remove duplicates if needed. I think this is the reason why you may get less records than requested in your Pageable.
If you remove the hint, as the DISTINCT keyword is passed in the SQL statement which is sent to the database, as far as you only project information of the main entity, it will fetch all the records indicated by LIMIT and this is why it will give you always the requested number of records.
You can try and fetch join your child entities (instead of only join with them). It will eliminate the problem of not being able to use the field you need to sort by in the columns of the DISTINCT keyword and, in addition, you will be able to apply, now legitimately, the hint.
But if you do so it will you another problem: if you use join fetch and pagination, to return the main entities and its collections, Hibernate will no longer apply pagination at database level - it will no include OFFSET or LIMIT keywords in the SQL statement, and it will try to paginate the results in memory. This is the famous Hibernate HHH000104 warning:
HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
#VladMihalcea explain that in great detail in the last part of this article.
He also proposed one possible solution to your problem, Window Functions.
In you use case, instead of using Specifications, the idea is that you implement your own DAO. This DAO only need to have access to the EntityManager, which is not a great deal as you can inject your #PersistenceContext:
#PersistenceContext
protected EntityManager em;
Once you have this EntityManager, you can create native queries and use window functions to build, based on the provided Pageable information, the right SQL statement that will be issued against the database. This will give you a lot of more freedom about what fields use for sorting or whatever you need.
As the last cited article indicates, Window Functions is a feature supported by all mayor databases.
In the case of PostgreSQL, you can easily come across them in the official documentation.
Finally, one more option, suggested in fact by #nickshoe, and explained in great detail in the article he cited, is to perform the sorting and paging process in two phases: in the first phase, you need to create a query that will reference your child entities and in which you will apply paging and sorting. This query will allow you to identify the ids of the main entities that will be used, in the second phase of the process, to obtain the main entities themselves.
You can take advantage of the aforementioned custom DAO to accomplish this process.
It may be an off-topic answer, but it may help you.
You could try to tackle this problem (pagination of parent-child entities) by separating the query in two parts:
a query for retrieving the ids that match the given criteria
a query for retrieving the actual entities by the resulting ids of the previous query
I came across this solution in this blog post: https://vladmihalcea.com/fix-hibernate-hhh000104-entity-fetch-pagination-warning-message/
Situation: "Parent" entity has multiple "Child" entities (#OneToMany, #Lazy) - two way relationship. No foreign key ("Child#parentId") field on entity.
Goal: Avoid N+1 problem by retrieving fully loaded Parent collection using sub-selects. If I understand theory of Subselect, this is my goal (2 resulting SQL queries):
select * from Parent ...;
select * from Child where parent_id in ...;
Question 1: What is the best practice to achieve this? Could you provide examples in both JPQL/HSQL and Criteria?
Question 2 (bonus): Can API manage second query division into "batches" - e.g. limit batches to 500: if 1st query loads 1000 Parents, 2a. loads Children for 500 Parents, 2b. loads for next 500.
I have tried:
Both result in SQL JOINs, it seems that I cannot use Child's foreign key without JOIN.
// 2nd query:
criteria
.createAlias("parent", "p")
.add(Property.forName("p.id")
.in(parentCriteria.setProjection(Projections.property("id"))))
.list();
// 2nd query (manual):
criteria
.createAlias("parent", "p")
.add(Property.forName("p.id").in(parentIdList))
.list();
Update (2015-04-05)
I checked that what it indeed works in EclipseLink via hints:
query.setHint("eclipselink.batch.type", "EXISTS");
This link http://blog.ringerc.id.au/2012/06/jpa2-is-very-inflexible-with-eagerlazy.html suggests that this is not possible via Hibernate and suggests manual fetching. However I cannot understand how to achieve it via HQL or Criteria, specifically how to get child.parent_id column that is not on Entity, but exists only on Database. That is, avoiding JOIN that would result from child.parent.id.
To avoid N+1 queries you can annotate relationship with
#BatchFetch(BatchFetchType.JOIN) //in eclipselink or
#BatchSize //in hibernate.
Inside queries, you can add fetch to join clause:
select p from Parent p join fetch p.children c where ...
You can add also query hints
query.setHint("eclipselink.batch", "p.children");
Or use EntityGraphs.
I have the following entities:
Machines {
id,
name
}
Favorites {
userId,
objectId,
objectType
}
Now I want to return list of machines ordered by favorites, name.
Favorites does not have any relation with Machines entities, its a generic entity which can hold various favoritable objects.
I got the sorting to work by the following raw sql. Is there any way to get this working using hibernate criterias. Basically ability to add alias for Favorites though Machines doesn't have any reference to it.
select m.* from Machines m left outer join Favorites f on m.id=f.objectId and f.userId =#userId order by f.userId desc, m.name asc
There are two ways to do this:
Use an SQL query or an SQL restriction for a criteria query.
Map the relation between Favorites and Machines as an Any type association.
Since you already have the query, executing it as a native SQL query through hibernate is the simplest solution:
sess
.createSQLQuery("SELECT ... ")
.addEntity(Machine.class)
.list();
The Any type association however, will probably benefit you in other situations where you want to query favoritables.
As said by #Maarten Winkels, you can use native SQL query that you have. You can do that but, if you want change your database then syntax may differs. So, it is recommended not to use native SQL query.
You cannot perform outer join using Hibernate criteria without any association between tables.
If you change your mind & want to add an association between these tables, then You can do something like below using criteria
List machines = session.createCriteria( Machines.class )
.createAlias("Machines", "m")
.createAlias("Favorites", "f", Criteria.LEFT_JOIN,
Restrictions.and(Restrictions.eq("m.id", "f.objectId"),
Restrictions.eq("f.userId", "#userId")))
.addOrder(Order.desc("f.userId"))
.addOrder(Order.asc("m.name"))
.list();
If for some reason you don't want to add any relationship between these tables, then you can use INNER JOIN or CROSS JOIN to get all machines with whatever criteria you want.
from Machines m, Favorites f where m.id = f.objectId and f.userId = #userId order by f.userId desc, m.name asc;
With inner joins there is one problem, if there is outer join relation in database then it doesn't work.
You can also write subqueries or separate queries & perform manual conditional checks between the results returned by those queries.
And one thing I want to ask you, What is the meaning of #userId in your SQL query ? I keep it as it is in my answer but, I didn't understand for what purpose # is there ?
Does any one have a clue why DISTINCT keyword is removed from the query when using DataNucleus (that's the software the company I work for uses)? I was able to debug the code and verified that the keyword is actually with the query. But by the time this function is called within JPAEntityManager
createQuery(CriteriaQuery<T> criteriaQuery)
the DISTINCT keyword is removed. Debugging showed me it has something to do with
criteria.getCompilation(ec.getMetaDataManager(), ec.getClassLoaderResolver());
function call. Somehow createQuery() function works fine with SELECT COUNT(DISTINCT DN_THIS) but not with SELECT DISTINCT FROM.
I hope some of you have at least a slight idea where the problem is since I'm fairly new with JPA and SQL queries in general that I can't find a quick solution on my own.
The query I'm trying to perform is as follows:
SELECT DISTINCT DN_THIS FROM Hop DN_THIS JOIN DN_THIS.tags t WHERE ((DN_THIS.entityStatus <> 'DELETED') AND ((t.name = :DN_PARAM_4) OR (t.name = :DN_PARAM_5))) ORDER BY DN_THIS.name ASC
Thank you!
I prompted a developer of DataNucleus, who seems to have already fixed the use of DISTINCT in criteria which was simply not being passed across to the generated query.
I am attempting to create a domain/entity class based on a complex query. The query unions a bunch of tables together and unfortunately I am not able to create a view on the database for this query. I have been trying to set up the entity object but I am unsure of how to ensure that the marshaling works properly (and ensure the entity acts as read-only object).
As an example of the query, I am doing something like this:
Select
U_T.a,
U_T.b,
U_T.c,
C_T.a
FROM
(select
A_T.a,
null as b,
A_T.c,
1 as ind
from A_T
UNION
select
B_T.a,
B_T.b,
null,
0 as ind
FROM B_T
) U_T
left outer join C_T on C_T.fk_a = U_T.a;
The other issues are that this union can result in instances where there is no unique key column. This is fine as this data is for viewing only, and never editing. However the #Entity annotation wants a column to be listed with the #ID annotation. Another issue is that I do not believe I can use the other entity classes as the goal is to reduce the number of DB transactions from this query (as the actual one can result in hundreds of recursive queries being performed).
If I need to give any more information please let me know.