What method of joining entities is faster? - java

Here are two queries retrieving equivalent data:
SELECT DISTINCT packStatus
FROM PackStatus packStatus JOIN FETCH packStatus.vars, ProcessEntity requestingProcess
WHERE
packStatus.status.code='OGVrquestExec'
AND packStatus.procId=requestingProcess.activityRequestersProcessId
AND requestingProcess.id='1000323733_GU_OGVProc'
SELECT DISTINCT packStatus
FROM PackStatus packStatus JOIN FETCH packStatus.vars
WHERE
packStatus.status.code='OGVrquestExec'
AND packStatus.procId=(SELECT requestingProcess.activityRequestersProcessId FROM ProcessEntity requestingProcess WHERE requestingProcess.id='1000323733_GU_OGVProc')
These queries differ in the method how requstingProcess is joined with packStatus. In general, which of these two methods is more preferable in terms of performance? I'm using JPA 1.2 provided by Hibernate 3.3 on Postgres 8.4.
UPD: I've replaced fake queries with real queries from my app. Here is SQL generated by Hibernate for first and second query. Links to query plans: first, second. Query plans look pretty the same. The only difference is what moment data from bpms_process table is aggregated to query result at. But I don't know is it right to generalize these results? Would query plans be almost the same for queries differing only in joining method? Is it possible to get a big difference in query cost by changing joining method?

Use EXPLAIN ANALYZE and see.
I won't be surprised if they get turned into the same query plan.
See: https://stackoverflow.com/tags/postgresql-performance/info

Related

Spring JPA - Reading data along with relations - performance improvement

I am reading data from a table using Spring JPA.
This Entity object has one-to-many relationship to other six tables.
All tables together has 20,000 records in them.
I am using below query to fetch data from DB.
SELECT * FROM A WHER ID IN (SELECT ID FROM B WHERE COL1 = '?')
A table has relationship to other 6 tables.
Spring JPA is taking around 30 seconds of time to read this data from DB.
Any idea to improve the data fetch time here.
I am using native Queries here and i am looking for query rewriting that will optimize the data fetch time.
Please suggest thanks.
You might need consider below to identify the root cause:
Check if you are ending up with n+1 query issue. Your query might end up calling n queries for each join table, where n is no. of associations with the join table. You can check this by setting spring.jpa.show-sql=true
If you see the issue as n+1 then you need set appropriate FetchMode, refer https://www.baeldung.com/hibernate-fetchmode for detailed explanation of using different FetchModes.
If it is not n+1 query issue you might need to check the performance of the genarated queries using EXPLAIN command. Usually IN clause on a non indexed columns have performance impact.
So set spring.jpa.show-sql=true and check queries generated and run to debug and optimize your code or query.

Is there a need to use hibernate Criteria? [duplicate]

What are the pros and cons of using Criteria or HQL? The Criteria API is a nice object-oriented way to express queries in Hibernate, but sometimes Criteria Queries are more difficult to understand/build than HQL.
When do you use Criteria and when HQL? What do you prefer in which use cases? Or is it just a matter of taste?
I mostly prefer Criteria Queries for dynamic queries. For example it is much easier to add some ordering dynamically or leave some parts (e.g. restrictions) out depending on some parameter.
On the other hand I'm using HQL for static and complex queries, because it's much easier to understand/read HQL. Also, HQL is a bit more powerful, I think, e.g. for different join types.
There is a difference in terms of performance between HQL and criteriaQuery, everytime you fire a query using criteriaQuery, it creates a new alias for the table name which does not reflect in the last queried cache for any DB. This leads to an overhead of compiling the generated SQL, taking more time to execute.
Regarding fetching strategies [http://www.hibernate.org/315.html]
Criteria respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one Criteria query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use setFetchMode() to enable or disable outer join fetching for a particular collection or association. Criteria queries also completely respect the fetching strategy (join vs select vs subselect).
HQL respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one HQL query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use LEFT JOIN FETCH to enable outer-join fetching for a particular collection or nullable many-to-one or one-to-one association, or JOIN FETCH to enable inner join fetching for a non-nullable many-to-one or one-to-one association. HQL queries do not respect any fetch="join" defined in the mapping document.
Criteria is an object-oriented API, while HQL means string concatenation. That means all of the benefits of object-orientedness apply:
All else being equal, the OO version is somewhat less prone to error. Any old string could get appended into the HQL query, whereas only valid Criteria objects can make it into a Criteria tree. Effectively, the Criteria classes are more constrained.
With auto-complete, the OO is more discoverable (and thus easier to use, for me at least). You don't necessarily need to remember which parts of the query go where; the IDE can help you
You also don't need to remember the particulars of the syntax (like which symbols go where). All you need to know is how to call methods and create objects.
Since HQL is very much like SQL (which most devs know very well already) then these "don't have to remember" arguments don't carry as much weight. If HQL was more different, then this would be more importatnt.
I usually use Criteria when I don't know what the inputs will be used on which pieces of data. Like on a search form where the user can enter any of 1 to 50 items and I don't know what they will be searching for. It is very easy to just append more to the criteria as I go through checking for what the user is searching for. I think it would be a little more troublesome to put an HQL query in that circumstance. HQL is great though when I know exactly what I want.
HQL is much easier to read, easier to debug using tools like the Eclipse Hibernate plugin, and easier to log. Criteria queries are better for building dynamic queries where a lot of the behavior is determined at runtime. If you don't know SQL, I could understand using Criteria queries, but overall I prefer HQL if I know what I want upfront.
Criteria API
Criteria API is better suited for dynamically generated queries. So, if you want to add WHERE clause filters, JOIN clauses, or vary the ORDER BY clause or the projection columns, then the Criteria API can help you generate the query dynamically in a way that also prevents SQL Injection attacks.
On the other hand, Criteria queries are less expressive and can even lead to very complicated and inefficient SQL queries.
JPQL and HQL
JPQL is the JPA standard entity query language while HQL extends JPQL and adds some Hibernate-specific features.
JPQL and HQL are very expressive and resemble SQL. Unlike Criteria API, JPQL and HQL make it easy to predict the underlying SQL query that's generated by the JPA provider. It's also much easier to review one's HQL queries than Criteria ones.
It's worth noting that selecting entities with JPQL or Criteria API makes sense if you need to modify them. Otherwise, a DTO projection is a much better choice.
Conclusion
If you don't need to vary the entity query structure, then use JPQL or HQL. If you need to change the filtering or sorting criteria or change the projection, then use Criteria API.
However, just because you are using JPA or Hibernate, it doesn't mean you should not use native SQL. SQL queries are very useful and JPQL and Criteria API are not a replacement for SQL.
Criteria are the only way to specify natural key lookups that take advantage of the special optimization in the second level query cache. HQL does not have any way to specify the necessary hint.
You can find some more info here:
http://tech.puredanger.com/2009/07/10/hibernate-query-cache/
Criteria Api is one of the good concept of Hibernate. according to my view these are the few point by which we can make difference between HQL and Criteria Api
HQL is to perform both select and non-select operations on the data, but Criteria is only for selecting the data, we cannot perform non-select operations using criteria.
HQL is suitable for executing Static Queries, where as Criteria is suitable for executing Dynamic Queries
HQL doesn’t support pagination concept, but we can achieve pagination with Criteria.
Criteria used to take more time to execute than HQL.
With Criteria we are safe with SQL Injection because of its dynamic query generation but in HQL as your queries are either fixed or parametrized, there is no safe from SQL Injection
To use the best of both worlds, the expressivity and conciseness of HQL and the dynamic nature of Criteria consider using Querydsl.
Querydsl supports JPA/Hibernate, JDO, SQL and Collections.
I am the maintainer of Querydsl, so this answer is biased.
For me Criteria is a quite easy to Understand and making Dynamic queries. But the flaw i say so far is that It loads all many-one etc relations because we have only three types of FetchModes i.e Select, Proxy and Default and in all these cases it loads many-one (may be i am wrong if so help me out :))
2nd issue with Criteria is that it loads complete object i.e if i want to just load EmpName of an employee it wont come up with this insted it come up with complete Employee object and i can get EmpName from it due to this it really work bad in reporting. where as HQL just load(did't load association/relations) what u want so increase performance many times.
One feature of Criteria is that it will safe u from SQL Injection because of its dynamic query generation where as in HQL as ur queries are either fixed or parameterised so are not safe from SQL Injection.
Also if you write HQL in ur aspx.cs files, then you are tightly coupled with ur DAL.
Overall my conclusion is that there are places where u can't live without HQL like reports so use them else Criteria is more easy to manage.
For me the biggest win on Criteria is the Example API, where you can pass an object and hibernate will build a query based on those object properties.
Besides that, the criteria API has its quirks (I believe the hibernate team is reworking the api), like:
a criteria.createAlias("obj") forces a inner join instead of a possible outer join
you can't create the same alias two times
some sql clauses have no simple criteria counterpart (like a subselect)
etc.
I tend to use HQL when I want queries similar to sql (delete from Users where status='blocked'), and I tend to use criteria when I don't want to use string appending.
Another advantage of HQL is that you can define all your queries before hand, and even externalise them to a file or so.
Criteria api provide one distinct feature that Neither SQL or HQL provides. ie. it allows compile time checking of a query.
We used mainly Criteria in our application in the beginning but after it was replaced with HQL due to the performance issues.
Mainly we are using very complex queries with several joins which leads to multiple queries in Criteria but is very optimized in HQL.
The case is that we use just several propeties on specific object and not complete objects. With Criteria the problem was also string concatenation.
Let say if you need to display name and surname of the user in HQL it is quite easy (name || ' ' || surname) but in Crteria this is not possible.
To overcome this we used ResultTransormers, where there were methods where such concatenation was implemented for needed result.
Today we mainly use HQL like this:
String hql = "select " +
"c.uuid as uuid," +
"c.name as name," +
"c.objective as objective," +
"c.startDate as startDate," +
"c.endDate as endDate," +
"c.description as description," +
"s.status as status," +
"t.type as type " +
"from " + Campaign.class.getName() + " c " +
"left join c.type t " +
"left join c.status s";
Query query = hibernateTemplate.getSessionFactory().getCurrentSession().getSession(EntityMode.MAP).createQuery(hql);
query.setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);
return query.list();
so in our case the returned records are maps of needed properties.
Criteria query for dynamically we can construct query based on our inputs..In case of Hql query is the static query once we construct we can't change the structure of the query.
HQL is to perform both select and non-select operations on the data, but Criteria is only for selecting the data, we cannot perform non-select operations using criteria
HQL is suitable for executing Static Queries, where as Criteria is suitable for executing Dynamic Queries
HQL doesn’t support pagination concept, but we can achieve pagination with Criteria
Criteria used to take more time to execute then HQL
With Criteria we are safe with SQL Injection because of its dynamic query generation but in HQL as your queries are either fixed or parametrized, there is no safe from SQL Injection.
source
I don't want to kick a dead horse here, but it is important to mention that Criteria queries are now deprecated. Use HQL.
I also prefer Criteria Queries for dynamic queries. But I prefer hql for delete queries, for example if delete all records from child table for parent id 'xyz', It is easily achieved by HQL, but for criteria API first we must fire n number of delete query where n is number of child table records.
Most the answers here are misleading and mention that Criteria Queries are slower than HQL, which is actually not the case.
If you delve deep and perform some tests you will see Criteria Queries perform much better that regular HQL.
And also with Criteria Query you get Object Oriented control which is not there with HQL.
For more information read this answer here.
There is another way. I ended up with creating a HQL parser based on hibernate original syntax so it first parse the HQL then it could dynamically inject dynamic parameters or automatically adding some common filters for the HQL queries. It works great!
This post is quite old. Most answers talk about Hibernate criteria, not JPA criteria. JPA 2.1 added CriteriaDelete/CriteriaUpdate, and EntityGraph that controls what exactly to fetch. Criteria API is better since Java is OO. That is why JPA is created. When JPQL is compiled, it will be translated to AST tree(OO model) before translated to SQL.
Another point is that, I see Criteria is more suited for building on top of it and not to be used ditectly in the end-code.
It is more suited to build liberaries using it more than using jpql or hql.
For example I've build spring-data-jpa-mongodb-expressions using Criteria API (the same way spring data QBE do).
I think spring data query generations are using jpaql rather criteria which I don't understand why.
HQL can cause security concerns like SQL injection.

Fetch efficiently huge number of entities from DB meeting criteria

Let's say I have about half of million records in table A. Of course table A is joined with table B, C, etc. Now I have to fetch entities from table A which meet my criteria. My criteria consists of about 20-30 rules e.g. name in table A has to be like 'something' or date from table B for joined record from table A with ID=1 should be earlier than today. I see three solutions:
Write native query and put parameters from my criteria. But in this case I will join so many tables that it doesn't seem to me as a good example.
Fetch all records from table A and then check in Java every rule for each record. But this seems for me as the worst possible solution.
Use JPA Criteria. But to be honest I do not know how efficient they are while there are so many records and so many joined tables. What's more it seems to me like working with Criteria can be a little irritating when I have so many rules to match.
Maybe there is another (better) solution to my problem but I cannot see it now. I need to add that I need these fetched entities stored in Java collection because when they are all fetched then I have to work with them (e.g. generate report or create some updates in DB basing on these information).
I hope I described my problem clear and I will be thankful for every tip how to optimize such query.

Hibernate performance issue: query execution extremely slow

The following hibernate query is being used to fetch a list of ProductCatalogue records, by passing in catId and inventoryId
select prodcat from ProductCatalogue prodcat where prodcat.prodSec.prodId=:catId and prodcat.prodPlacedOrder.inventoryId=:inventoryId
The tables ProductCatalogue and ProdPlacedOrder are tables with 3 lakh + records. inventoryId is a column in prodOrder table, and prodPlacedOrder extends prodOrder table.
This query on execution takes a lot of time, and the single hibernate query shoots off many complex sql queries.
Any suggestions on what might be the issue and how to modify it such that the query is executed faster?
Difficult to say without more info but try making ProdPlacedOrder as LAZY fetch if you dont need any data from that table.
Also as phatmanace, mentioned - check your indices.

Hibernate - Perform HQL query on Criteria Result

I am concerned with a problem regarding criteria and hql.
In first step a complex criteria construction limits a big amount of datasets.
In second step a hql calculation should be performed on these preselected datasets of step one.
The problem is that the code of both has been developed seperately and i am wondering if it is possible to just perform the hql query as a subquery on the result of the first limitation of data. Both calculations should be performed on database level because it is a big amount of data.
I would be very happy for any help or advice.
Thanks in advance
I am wondering if it is possible to just perform the hql query as a subquery on the result of the first limitation of data.
To my knowledge, there is nothing like that in the API.

Categories