Optimize Spring-Data JPA queries - java

I am looking for possible optimizations for framework-generated queries.
As far as I understand, the process is the following:
1) You declare your domain objects as POJOs and add annotations such as @Entity, @Table, @ManyToOne, etc.
2) You declare your repositories, e.g. as interfaces.
3) With (2) you have several options to describe your query, e.g. via method names or @Query.
If I write a query like:
@Query("select t from Order t LEFT join fetch t.orderPositions where t.id = ?1")
Page<Order> findById(Pageable pageable, String id);
an SQL query is generated in which every column of the order is selected, and subsequently every column of the order positions and of the dependent objects/tables.
As if I wrote:
select * from order
So if I need some information from several joined objects, a query can be quite expensive and, more interestingly, quite inefficient. I stumbled upon a slow query, and MySQL's EXPLAIN told me that the optimizer could not make use of indexes in the generated query, which is bad.
Of course, I am aware that I have to deal with a tradeoff: generated SQL isn't as optimal as manually written SQL, but I get the advantage of writing less boilerplate code.
My question is: what are good strategies to improve queries and query execution?
I have thought of some options myself:
1) Is it possible to define several "Entities" for different purposes, like Order for access to the full characteristics of an order and something like FilteredOrder with fewer columns and no resolution of Join-columns? Both would reference the same tables, but one would use all of the columns and the other only some.
2) Use @Query(..., nativeQuery = true) with a selection of only the columns I want to use. The advantage would be that I would not duplicate my domain objects and litter my codebase with hundreds of Filtered-Objects.
What about paging? Is using Pageable in combination with @Query(..., nativeQuery = true) still possible (I am afraid not)?
3) Last but in my eyes "worst"/boilerplate solution: Use JDBCTemplates and do stuff at a lower level.
Are there other options, of which I haven't thought?
Thank you for any inspiration on that topic :]
Update:
Our current strategy is the following
1) Where possible, I work with select new
As I have seen, this works for every Object (be it an Entity or POJO)
2) In combination with database views it is possible to take the best of SQL and ORM. For some usecases it might be of interest to have an aggregated resultset at hand. Defining this resultset as a view makes it easy from the db-perspective to watch the result with a simple select-statement.
For the ORM-side this means, you could easily define an entity matching this view and you get the whole ORM-goodness on top: Paging incl.

One solution is to use DTOs:
@Query("select new FilteredOrder(o.name, o.size, o.cost) from Order o where o.id = ?1")
Page<FilteredOrder> findFilteredOrderById(Pageable pageable, String id);
If you need entities only for report generation, maybe you should think about using a NoSQL datastore?

Take a look at JPA's lazy fetching strategy. It will allow you to select objects without their relations, but will fetch the relations when you reference them.
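To make the "fetch the relation only when you reference it" idea concrete, here is a minimal plain-Java sketch. It is not how Hibernate implements its proxies; LazyRelation and the loader callback are hypothetical names used only for illustration. The loader stands in for the extra SELECT that a JPA provider would issue on first access.

```java
import java.util.List;
import java.util.function.Supplier;

/** Plain-Java stand-in for a lazily loaded relation; not Hibernate's actual proxy. */
final class LazyRelation<T> {
    private final Supplier<T> loader; // hypothetical "go to the database" callback
    private T value;
    private boolean loaded;
    private int loadCount;

    LazyRelation(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Runs the loader on first access only; later calls reuse the cached value. */
    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
            loadCount++;
        }
        return value;
    }

    /** How many times the loader actually ran. */
    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyRelation<List<String>> positions =
                new LazyRelation<>(() -> List.of("position-1", "position-2"));
        System.out.println("loads before access: " + positions.loadCount());
        System.out.println("positions: " + positions.get()); // first access triggers the load
        positions.get();                                      // second access does not reload
        System.out.println("loads after two accesses: " + positions.loadCount());
    }
}
```

The tradeoff is the one discussed in this thread: nothing is paid if the relation is never touched, but every relation that is touched costs one more round trip.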

Related

Hibernate: initialization of complex object

I have problems with full loading of very complex object from DB in a reasonable time and with reasonable count of queries.
My object has a lot of embedded entities, each entity has references to another entities, another entities references yet another and so on (So, the nesting level is 6)
So, I've created example to demonstrate what I want:
https://github.com/gladorange/hibernate-lazy-loading
I have User.
User has @OneToMany collections of favorite Oranges, Apples, Grapevines and Peaches. Each Grapevine has a @OneToMany collection of Grapes. Each fruit is another entity with just one String field.
I'm creating a user with 30 favorite fruits of each type, and each grapevine has 10 grapes. So in total I have 421 entities in the DB: 30*4 fruits, 10*30 grapes and one user.
And what I want: I want to load them using no more than 6 SQL queries.
And each query shouldn't produce a big result set (for this example, big means a result set with more than 200 records).
My ideal solution will be the following:
6 requests. The first request returns information about the user, and the size of the result set is 1.
The second request returns information about the Apples for this user, and the size of the result set is 30.
The third, fourth and fifth requests return the same as the second (with result set size = 30), but for Grapevines, Oranges and Peaches.
The sixth request returns the Grapes for ALL grapevines.
This is very simple in SQL world, but I can't achieve such with JPA (Hibernate).
I tried following approaches:
Use fetch join, like from User u join fetch u.oranges .... This is awful. The result set is 30*30*30*30 rows and the execution time is 10 seconds. Number of requests = 3. I tried it without grapes; with grapes the result set would be 10 times larger.
Just use lazy loading. This gives the best result in this example (with @Fetch(FetchMode.SUBSELECT) for grapes). But in that case I need to manually iterate over each collection of elements. Also, subselect fetch is too global a setting, so I would like to have something that works at query level. Result set and time are near ideal: 6 queries and 43 ms.
Loading with an entity graph. The same as fetch join, but it also makes a request for every grape to get its grapevine. The resulting time is better (6 seconds), but still awful. Number of requests > 30.
I tried to cheat JPA with "manual" loading of entities in separate query. Like:
SELECT u FROM User u WHERE u.id = 1;
SELECT a FROM Apple a WHERE a.user_id = 1;
This is a little bit worse than lazy loading, since it requires two queries for each collection: the first query to manually load the entities (I have full control over this query, including loading associated entities), and the second query to lazy-load the same entities by Hibernate itself (this is executed automatically by Hibernate).
Execution time is 52, number of queries = 10 (1 for user, 1 for grape, 4*2 for each fruit collection)
Actually, "manual" solution in combination with SUBSELECT fetch allows me to use "simple" fetch joins to load necessary entities in one query (like #OneToOne entities) So I'm going to use it. But I don't like that I have to perform two queries to load collection.
Any suggestions?
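For reference, the row counts behind the numbers in the question can be checked with a few lines of arithmetic. This is only a sketch of the counting, assuming the sizes stated above (30 fruits of each type, 10 grapes per grapevine); FetchRowCounts is a hypothetical helper, not part of the linked repository.

```java
/** Row-count arithmetic for the example in the question. */
final class FetchRowCounts {

    /** One query that join-fetches several collections returns their Cartesian product. */
    static long joinFetchRows(int... collectionSizes) {
        long rows = 1;
        for (int size : collectionSizes) {
            rows *= size;
        }
        return rows;
    }

    /** One query per collection returns one row per element, so the sizes add up. */
    static long separateQueryRows(int... resultSetSizes) {
        long rows = 0;
        for (int size : resultSetSizes) {
            rows += size;
        }
        return rows;
    }

    public static void main(String[] args) {
        // Oranges x apples x grapevines x peaches for a single user.
        System.out.println("join fetch, no grapes: " + joinFetchRows(30, 30, 30, 30));
        // Each grapevine row is repeated once per grape.
        System.out.println("join fetch, with grapes: " + joinFetchRows(30, 30, 30, 30, 10));
        // User, four fruit collections, then all grapes in one query.
        System.out.println("six separate queries: " + separateQueryRows(1, 30, 30, 30, 30, 300));
    }
}
```

The product grows with every additional collection, while the sum grows only with the number of entities, which is why the six-query plan stays at 421 rows in total.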
I usually cover 99% of such use cases by using batch fetching for both entities and collections. If you process the fetched entities in the same transaction/session in which you read them, then there is nothing additionally that you need to do, just navigate to the associations needed by the processing logic and the generated queries will be very optimal. If you want to return the fetched entities as detached, then you initialize the associations manually:
User user = entityManager.find(User.class, userId);
Hibernate.initialize(user.getOranges());
Hibernate.initialize(user.getApples());
Hibernate.initialize(user.getGrapevines());
Hibernate.initialize(user.getPeaches());
user.getGrapevines().forEach(grapevine -> Hibernate.initialize(grapevine.getGrapes()));
Note that the last command will not actually execute a query for each grapevine, as multiple grapes collections (up to the specified @BatchSize) are initialized when you initialize the first one. You simply iterate all of them to make sure all are initialized.
This technique resembles your manual approach but is more efficient (queries are not repeated for each collection), and is more readable and maintainable in my opinion (you just call Hibernate.initialize instead of manually writing the same query that Hibernate generates automatically).
I'm going to suggest yet another option on how to lazily fetch collections of Grapes in Grapevine:
@OneToMany
@BatchSize(size = 30)
private List<Grape> grapes = new ArrayList<>();
Instead of doing a sub-select, this one uses in (?, ?, etc) to fetch many collections of Grapes at once. Grapevine IDs are passed in place of each ?. This is as opposed to querying one List<Grape> collection at a time.
That's just yet another technique to your arsenal.
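Roughly, the effect of batch fetching can be pictured like this. The sketch below is plain Java and a simplification, not Hibernate's actual batching algorithm; the table name grape and column grapevine_id are assumptions made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified picture of batch fetching: one IN query per chunk of owner IDs. */
final class BatchFetchSketch {

    /** Splits owner IDs into chunks of at most batchSize; each chunk becomes one query. */
    static List<List<Long>> chunk(List<Long> ownerIds, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int from = 0; from < ownerIds.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ownerIds.size());
            chunks.add(ownerIds.subList(from, to));
        }
        return chunks;
    }

    /** Builds SQL with one placeholder per ID; table and column names are hypothetical. */
    static String sqlFor(int idCount) {
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "select * from grape where grapevine_id in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<Long> grapevineIds = new ArrayList<>();
        for (long id = 1; id <= 30; id++) {
            grapevineIds.add(id);
        }
        // With a batch size of 10, the 30 grapevines need three queries instead of 30.
        for (List<Long> batch : chunk(grapevineIds, 10)) {
            System.out.println(sqlFor(batch.size()) + " <- " + batch);
        }
    }
}
```

With size = 30 as in the mapping above, all 30 grapevines of the example fit into a single chunk, which matches the "sixth request returns the Grapes for ALL grapevines" goal from the question.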
I do not quite understand your demands here. It seems to me you want Hibernate to do something that it's not designed to do, and when it can't, you want a hack-solution that is far from optimal. Why not loosen the restrictions and get something that works? Why do you even have these restrictions in the first place?
Some general pointers:
When using Hibernate/JPA, you do not control the queries. You are not supposed to either (with a few exceptions). How many queries, the order they are executed in, etc, is pretty much beyond your control. If you want complete control of your queries, just skip JPA and use JDBC instead (Spring JDBC for instance.)
Understanding lazy-loading is key to making decisions in these type of situation. Lazy-loaded relations are not fetched when getting the owning entity, instead Hibernate goes back to the database and gets them when they are actually used. Which means that lazy-loading pays off if you don't use the attribute every time, but has a penalty the times you actually use it. (Fetch join is used for eager-fetching a lazy relation. Not really meant for use with regular load from the database.)
Query optimization using Hibernate should not be your first line of action. Always start with your database. Is it modelled correctly, with primary keys and foreign keys, normal forms, etc? Do you have search indexes in the proper places (typically on foreign keys)?
Testing for performance on a very limited dataset probably won't give the best results. There will probably be overhead with connections, etc, that is larger than the time spent actually running the queries. Also, there might be random hiccups that cost a few milliseconds, which will give a result that might be misleading.
Small tip from looking at your code: Never provide setters for collections in entities. If actually invoked within a transaction, Hibernate will throw an exception.
tryManualLoading probably does more than you think. First, it fetches the user (with lazy loading), then it fetches each of the fruits, then it fetches the fruits again through lazy-loading. (Unless Hibernate understands that the queries will be the same as when lazy loading.)
You don't actually have to loop through the entire collection in order to initiate lazy-loading. You can do this user.getOranges().size(), or Hibernate.initialize(user.getOranges()). For the grapevine you would have to iterate to initialize all the grapes though.
With proper database design, and lazy-loading in the correct places, there shouldn't be a need for anything other than:
em.find(User.class, userId);
And then maybe a join fetch query if a lazy load takes a lot of time.
In my experience, the most important factor for speeding up Hibernate is search indexes in the database.

Is there a need to use hibernate Criteria? [duplicate]

What are the pros and cons of using Criteria or HQL? The Criteria API is a nice object-oriented way to express queries in Hibernate, but sometimes Criteria Queries are more difficult to understand/build than HQL.
When do you use Criteria and when HQL? What do you prefer in which use cases? Or is it just a matter of taste?
I mostly prefer Criteria Queries for dynamic queries. For example it is much easier to add some ordering dynamically or leave some parts (e.g. restrictions) out depending on some parameter.
On the other hand I'm using HQL for static and complex queries, because it's much easier to understand/read HQL. Also, HQL is a bit more powerful, I think, e.g. for different join types.
There is a difference in terms of performance between HQL and criteriaQuery, everytime you fire a query using criteriaQuery, it creates a new alias for the table name which does not reflect in the last queried cache for any DB. This leads to an overhead of compiling the generated SQL, taking more time to execute.
Regarding fetching strategies [http://www.hibernate.org/315.html]
Criteria respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one Criteria query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use setFetchMode() to enable or disable outer join fetching for a particular collection or association. Criteria queries also completely respect the fetching strategy (join vs select vs subselect).
HQL respects the laziness settings in your mappings and guarantees that what you want loaded is loaded. This means one HQL query might result in several SQL immediate SELECT statements to fetch the subgraph with all non-lazy mapped associations and collections. If you want to change the "how" and even the "what", use LEFT JOIN FETCH to enable outer-join fetching for a particular collection or nullable many-to-one or one-to-one association, or JOIN FETCH to enable inner join fetching for a non-nullable many-to-one or one-to-one association. HQL queries do not respect any fetch="join" defined in the mapping document.
Criteria is an object-oriented API, while HQL means string concatenation. That means all of the benefits of object-orientedness apply:
All else being equal, the OO version is somewhat less prone to error. Any old string could get appended into the HQL query, whereas only valid Criteria objects can make it into a Criteria tree. Effectively, the Criteria classes are more constrained.
With auto-complete, the OO is more discoverable (and thus easier to use, for me at least). You don't necessarily need to remember which parts of the query go where; the IDE can help you
You also don't need to remember the particulars of the syntax (like which symbols go where). All you need to know is how to call methods and create objects.
Since HQL is very much like SQL (which most devs already know very well), these "don't have to remember" arguments don't carry as much weight. If HQL were more different, this would be more important.
I usually use Criteria when I don't know what the inputs will be used on which pieces of data. Like on a search form where the user can enter any of 1 to 50 items and I don't know what they will be searching for. It is very easy to just append more to the criteria as I go through checking for what the user is searching for. I think it would be a little more troublesome to put an HQL query in that circumstance. HQL is great though when I know exactly what I want.
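The "append as I go" pattern for such a search form can be sketched in plain Java as follows. This is not the Criteria API itself, only an illustration of the same idea on top of a query string with named parameters; DynamicSearchQuery, the entity name Item and the field names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Collects optional filters and renders them as a query string plus named parameters. */
final class DynamicSearchQuery {
    private final List<String> conditions = new ArrayList<>();
    private final Map<String, Object> parameters = new LinkedHashMap<>();

    /**
     * Adds "i.field = :field" only when the user supplied a value.
     * The field name must come from application code, never from user input;
     * only the value is user-controlled, and it is bound as a parameter.
     */
    DynamicSearchQuery filter(String field, Object value) {
        if (value != null) {
            conditions.add("i." + field + " = :" + field);
            parameters.put(field, value);
        }
        return this;
    }

    /** Renders the query; Item is a hypothetical entity name. */
    String toQueryString() {
        String base = "select i from Item i";
        return conditions.isEmpty() ? base : base + " where " + String.join(" and ", conditions);
    }

    /** Values to bind, keyed by parameter name, in insertion order. */
    Map<String, Object> parameters() {
        return parameters;
    }

    public static void main(String[] args) {
        DynamicSearchQuery query = new DynamicSearchQuery()
                .filter("name", "bolt")
                .filter("color", null)   // left empty on the form, so it is skipped
                .filter("weight", 5);
        System.out.println(query.toQueryString());
        System.out.println(query.parameters());
    }
}
```

Each entry of parameters() would then be passed to the query's setParameter call before executing it.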
HQL is much easier to read, easier to debug using tools like the Eclipse Hibernate plugin, and easier to log. Criteria queries are better for building dynamic queries where a lot of the behavior is determined at runtime. If you don't know SQL, I could understand using Criteria queries, but overall I prefer HQL if I know what I want upfront.
Criteria API
Criteria API is better suited for dynamically generated queries. So, if you want to add WHERE clause filters, JOIN clauses, or vary the ORDER BY clause or the projection columns, then the Criteria API can help you generate the query dynamically in a way that also prevents SQL Injection attacks.
On the other hand, Criteria queries are less expressive and can even lead to very complicated and inefficient SQL queries.
JPQL and HQL
JPQL is the JPA standard entity query language while HQL extends JPQL and adds some Hibernate-specific features.
JPQL and HQL are very expressive and resemble SQL. Unlike Criteria API, JPQL and HQL make it easy to predict the underlying SQL query that's generated by the JPA provider. It's also much easier to review one's HQL queries than Criteria ones.
It's worth noting that selecting entities with JPQL or Criteria API makes sense if you need to modify them. Otherwise, a DTO projection is a much better choice.
Conclusion
If you don't need to vary the entity query structure, then use JPQL or HQL. If you need to change the filtering or sorting criteria or change the projection, then use Criteria API.
However, just because you are using JPA or Hibernate, it doesn't mean you should not use native SQL. SQL queries are very useful and JPQL and Criteria API are not a replacement for SQL.
Criteria are the only way to specify natural key lookups that take advantage of the special optimization in the second level query cache. HQL does not have any way to specify the necessary hint.
You can find some more info here:
http://tech.puredanger.com/2009/07/10/hibernate-query-cache/
The Criteria API is one of the good concepts of Hibernate. In my view, these are the few points that distinguish HQL from the Criteria API:
HQL is to perform both select and non-select operations on the data, but Criteria is only for selecting the data, we cannot perform non-select operations using criteria.
HQL is suitable for executing Static Queries, where as Criteria is suitable for executing Dynamic Queries
HQL doesn’t support pagination concept, but we can achieve pagination with Criteria.
Criteria used to take more time to execute than HQL.
With Criteria we are safe with SQL Injection because of its dynamic query generation but in HQL as your queries are either fixed or parametrized, there is no safe from SQL Injection
To use the best of both worlds, the expressivity and conciseness of HQL and the dynamic nature of Criteria consider using Querydsl.
Querydsl supports JPA/Hibernate, JDO, SQL and Collections.
I am the maintainer of Querydsl, so this answer is biased.
For me, Criteria is quite easy to understand and makes dynamic queries easy. But the flaw I see so far is that it loads all many-to-one etc. relations, because we have only three types of FetchMode, i.e. Select, Proxy and Default, and in all these cases it loads many-to-one relations (maybe I am wrong; if so, help me out :))
The 2nd issue with Criteria is that it loads the complete object, i.e. if I want to load just the EmpName of an employee, it won't return only that; instead it returns the complete Employee object and I can get EmpName from it. Because of this it works really badly in reporting, whereas HQL loads just what you want (it doesn't load associations/relations), so it increases performance many times.
One feature of Criteria is that it will save you from SQL injection because of its dynamic query generation, whereas in HQL, as your queries are either fixed or parameterised, they are not safe from SQL injection.
Also, if you write HQL in your aspx.cs files, then you are tightly coupled with your DAL.
Overall my conclusion is that there are places where you can't live without HQL, like reports, so use it there; otherwise Criteria is easier to manage.
For me the biggest win on Criteria is the Example API, where you can pass an object and hibernate will build a query based on those object properties.
Besides that, the criteria API has its quirks (I believe the hibernate team is reworking the api), like:
a criteria.createAlias("obj") forces a inner join instead of a possible outer join
you can't create the same alias two times
some sql clauses have no simple criteria counterpart (like a subselect)
etc.
I tend to use HQL when I want queries similar to sql (delete from Users where status='blocked'), and I tend to use criteria when I don't want to use string appending.
Another advantage of HQL is that you can define all your queries before hand, and even externalise them to a file or so.
The Criteria API provides one distinct feature that neither SQL nor HQL provides, i.e. it allows compile-time checking of a query.
We mainly used Criteria in our application in the beginning, but later it was replaced with HQL due to performance issues.
Mainly we are using very complex queries with several joins, which leads to multiple queries in Criteria but is very optimized in HQL.
The case is that we use just several properties of a specific object and not complete objects. With Criteria, the problem was also string concatenation.
Let's say you need to display the name and surname of the user: in HQL it is quite easy (name || ' ' || surname), but in Criteria this is not possible.
To overcome this we used ResultTransformers, with methods that implemented such concatenation for the needed result.
Today we mainly use HQL like this:
String hql = "select " +
"c.uuid as uuid," +
"c.name as name," +
"c.objective as objective," +
"c.startDate as startDate," +
"c.endDate as endDate," +
"c.description as description," +
"s.status as status," +
"t.type as type " +
"from " + Campaign.class.getName() + " c " +
"left join c.type t " +
"left join c.status s";
Query query = hibernateTemplate.getSessionFactory().getCurrentSession().getSession(EntityMode.MAP).createQuery(hql);
query.setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);
return query.list();
so in our case the returned records are maps of needed properties.
With a Criteria query we can construct the query dynamically based on our inputs. An HQL query is static: once we construct it, we can't change the structure of the query.
HQL is to perform both select and non-select operations on the data, but Criteria is only for selecting the data, we cannot perform non-select operations using criteria
HQL is suitable for executing Static Queries, where as Criteria is suitable for executing Dynamic Queries
HQL doesn’t support pagination concept, but we can achieve pagination with Criteria
Criteria used to take more time to execute than HQL
With Criteria we are safe with SQL Injection because of its dynamic query generation but in HQL as your queries are either fixed or parametrized, there is no safe from SQL Injection.
source
I don't want to kick a dead horse here, but it is important to mention that Criteria queries are now deprecated. Use HQL.
I also prefer Criteria Queries for dynamic queries. But I prefer hql for delete queries, for example if delete all records from child table for parent id 'xyz', It is easily achieved by HQL, but for criteria API first we must fire n number of delete query where n is number of child table records.
Most the answers here are misleading and mention that Criteria Queries are slower than HQL, which is actually not the case.
If you delve deep and perform some tests you will see Criteria Queries perform much better than regular HQL.
And also with Criteria Query you get Object Oriented control which is not there with HQL.
For more information read this answer here.
There is another way. I ended up creating an HQL parser based on Hibernate's original syntax, so it first parses the HQL and can then dynamically inject parameters or automatically add some common filters to the HQL queries. It works great!
This post is quite old. Most answers talk about Hibernate criteria, not JPA criteria. JPA 2.1 added CriteriaDelete/CriteriaUpdate, and EntityGraph that controls what exactly to fetch. Criteria API is better since Java is OO. That is why JPA is created. When JPQL is compiled, it will be translated to AST tree(OO model) before translated to SQL.
Another point is that I see Criteria as more suited for building on top of it than for being used directly in end code.
It is better suited for building libraries than JPQL or HQL.
For example, I've built spring-data-jpa-mongodb-expressions using the Criteria API (the same way Spring Data QBE does).
I think Spring Data query generation uses JPQL rather than Criteria, and I don't understand why.
HQL can cause security concerns like SQL injection.

Hibernate pagination with ____ToMany mapping

I'm writing this on the fly on my phone, so forgive the crappy code samples.
I have entities with a manytomany relationship:
@JoinTable(name = "foo", joinColumns = @JoinColumn(name = "..."), inverseJoinColumns = @JoinColumn(name = "..."))
@ManyToMany
List list = new ArrayList();
I want their data to be retrieved in a paginated way.
I know about setFirstResult and setMaxResults. Is there a way to use this with the mapping? As in, I retrieve the object and get the list filled with contents equal to the amount of records for a single page, with the appropriate offset.
I guess I'm just unclear on the best way to do this. I could just manually use Hibernate Criteria to get the effect, but I feel that's missing the point of the API. I have this mapping, and I want to see if there's a way to use it in a paginated way.
PS. If this is impractical, just say so. Also, if it is, can I still use the mapping to add new entries to the join table? As in, if the entity is a persisted entity in the DB, but I haven't fetched the many-to-many list, can I add something new to it so that, when it's persisted with cascade all, it'll be added to the join table without clearing the other entries?
The type of the relationship between entities that are part of your query isn't that important. There are a couple of ways to tackle this.
If your database supports the LIMIT keyword in its queries, you can use it to get data sets, assuming you sort your data. Note that if your data changes while your user is navigating between pages, you might see some duplication or miss some records. You'll be stuck having to rewrite if your database changes to one that doesn't have the LIMIT keyword.
If you need to freeze the data at the point of the original query, you need to use a 3rd party framework or write your own code to fetch a list of IDs for your query, then split up that list and fetch by ID in a subset for pagination. This is more reliable and can be made to work for any database.
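The second approach (freeze a list of IDs, then page through it) could look roughly like this in plain Java. IdSnapshotPager is a hypothetical helper written for this illustration; actually loading the rows for each page of IDs is left to whatever data access layer you use.

```java
import java.util.List;

/** Pages through a fixed snapshot of IDs taken when the original query ran. */
final class IdSnapshotPager {
    private final List<Long> ids;
    private final int pageSize;

    IdSnapshotPager(List<Long> ids, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        this.ids = List.copyOf(ids); // defensive copy: later changes do not affect paging
        this.pageSize = pageSize;
    }

    /** Number of pages needed to show every ID in the snapshot. */
    int pageCount() {
        return (ids.size() + pageSize - 1) / pageSize;
    }

    /** IDs for a zero-based page; a page past the end yields an empty list. */
    List<Long> idsForPage(int page) {
        if (page < 0) {
            throw new IllegalArgumentException("page must be >= 0");
        }
        long from = (long) page * pageSize; // long avoids int overflow for large page numbers
        if (from >= ids.size()) {
            return List.of();
        }
        int to = (int) Math.min(from + pageSize, ids.size());
        return ids.subList((int) from, to);
    }

    public static void main(String[] args) {
        IdSnapshotPager pager = new IdSnapshotPager(List.of(1L, 2L, 3L, 4L, 5L, 6L, 7L), 3);
        for (int page = 0; page < pager.pageCount(); page++) {
            // Each list would be used in a "where id in (...)" lookup for that page.
            System.out.println("page " + page + " -> " + pager.idsForPage(page));
        }
    }
}
```

Because the ID list is fixed, rows inserted later never shift the pages; rows deleted in the meantime simply come back missing from the by-ID lookup.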
Displaytag is a data paging framework I've used and that I therefore can tell you works well for large datasets. It's also one of the older solutions for this problem and is not part of an extended framework.
http://displaytag.sourceforge.net/11/tut_externalSortAndPage.html
Table sorter is another one I came across. This one uses JQuery and fetches the entire data set in one query, so strictly speaking it doesn't meet your "fetches the data in a paginated way" criteria. (This might not be appropriate for large sets).
http://tablesorter.com/docs/
This tutorial might be helpful:
http://theopentutorials.com/examples/java-ee/jsp/pagination-in-servlet-and-jsp/
If you're already using a framework take a look at whether that framework has tackled pagination:
Spring MVC provides a data pager
http://blog.fawnanddoug.com/2012/05/pagination-with-spring-mvc-spring-data.html
GWT provides a data pager:
http://www.gwtproject.org/javadoc/latest/com/google/gwt/user/cellview/client/SimplePager.html
The following references might be helpful too:
JDBC Pagination
which also points to:
http://java.avdiel.com/Tutorials/JDBCPaging.html

Avoiding N+One selects and Invalid results from eclipselink with batch read

I'm trying to cut down the number of n+1 selects incurred by my application, the application uses EclipseLink as an ORM and in as many places as possible I've tried to add the batch read hint to queries. In a large number of places in the app I don't always know exactly what relationships I'll be traversing (My view displays fields based on user preferences). At that point I'd like to run one query to populate all of those relationships for my objects.
My dream is to call something like ReadAllRelationshipsQuery(Collection,RelationshipName) and populate all of these items so that later calls to:
Collection.get(0).getMyStuff will already be populated and not cause a db query. How can I accomplish this? I'm willing to write any code I need to but I can't find a way that work with the eclipselink framework?
Why don't I just batch read all of the possible fields and let them load lazily? What I've found is that the batch value holders that implement batch reads don't behave well with the eclipselink cache. If a batch read value holder isn't "evaluated" and ends up in the eclipse link cache it can become stale and return incorrect data (This behavior was logged as an eclipselink bug but rejected...)
edit: I found the link to the bug here: https://bugs.eclipse.org/bugs/show_bug.cgi?id=326197
How do I avoid N+1 selects for objects I already have a reference to?
You have three basic ways to load data into objects from a JPA-based solution. These are:
Load dynamically by object traversal (e.g. myObject.getMyCollection().get()).
Load graphs of objects by prefetching dynamically using JPA QL (e.g. FETCH JOINs as described at the Oracle JPA tutorial )
Load by setting the fetch mode ( Is there a way to change the JPA fetch type on a method? )
Each of these has pros and cons.
Loading dynamically by object traversal will generate more (highly targeted) queries. These queries are usually small (not large SQL statements, but they may load lots of data) and tend to play nicely with a second level cache, but you can get lots and lots of little queries.
Prefetching with JPA QL will give you exactly what you want, but that assumes that you know what you want.
Setting the fetch mode to EAGER will load lots and lots of data for you automatically, but depending on the configuration and usage this may not actually help much (or may make things a lot worse) as you may wind up dragging a LOT of data from the DB into your app that you didn't expect.
Regardless, I highly recommend using p6spy ( http://sourceforge.net/projects/p6spy/ ) in conjunction with any JPA-based application to understand the effects of your tuning.
Unfortunately, JPA makes some things easy and some things hard - mainly, side-effects of your usage. For example, you might fix one problem by setting the fetch mode to eager, and then create another problem where the eager fetch pulls in too much data. EclipseLink does provide tooling to help sort this out ( EclipseLink Performance Tools )
In theory, if you wanted to you could write a generic JavaBean property walker by using something like Apache BeanUtils. Usually just calling a method like size() on a collection is enough to force it to load (although using a collection batch fetch size might complicate things a bit).
One thing to pay particular attention to is the scope of your session and your use of caches (EclipseLink cache).
Something not clear from your post is the scope of a session. Is a session a one shot affair (e.g. like a web page request) or is it a long running thing (e.g. like a classic client/server GUI app)?
It is very difficult to optimize the retrieval of relationships if you do not know what relationships you require.
If your application is requesting the relationships it wants, then you must know at some level which relationships you require, and should be able to optimize these in your query for the objects.
For an overview of relationship optimization techniques see,
http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html
For Batch Fetching, there are three types: JOIN, EXISTS, and IN. The problem you outlined, where changes to data affect the original query for cached batched relationships, only applies to JOIN and EXISTS, and only when you have selection criteria based on updatable fields (if the query you are optimizing is on id, or on all instances, you are ok). IN batch fetching does not have this issue, so you can use IN batch fetching for all the relationships.
ReadAllRelationshipsQuery(Collection,RelationshipName)
How about,
Query query = em.createQuery("Select o from MyObject o where o.id in :ids");
query.setParameter("ids", ids);
query.setHint("eclipselink.batch", relationship);
If you know all possible relations and the user preferences, why don't you just dynamically build the JPQL string (or Criteria) before executing it?
Like:
String sql = "SELECT u FROM User u"; // use a StringBuilder; this is just for simplicity's sake
if (loadAddress)
{
    sql += " LEFT OUTER JOIN u.address as a"; // fetch join and left outer join have the same result in many cases, except that with left outer join you could load associations of address as well
}
...
Edit: Since the result would be a cross product, you should then iterate over the entities and remove duplicates.
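Removing the duplicates afterwards can be as simple as passing the result list through a LinkedHashSet, which also keeps the original order. A minimal sketch follows; it relies on equals/hashCode, so it assumes that repeated rows for the same entity come back as the same instance (or as objects that compare equal). JoinResultDeduplicator is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Drops repeated elements from a joined result while keeping first-seen order. */
final class JoinResultDeduplicator {

    static <T> List<T> distinctInOrder(List<T> rows) {
        // LinkedHashSet ignores elements it already contains and preserves insertion order.
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        // Strings stand in for entity instances repeated once per joined row.
        List<String> joined = List.of("alice", "alice", "bob", "alice", "carol", "bob");
        System.out.println(distinctInOrder(joined)); // prints [alice, bob, carol]
    }
}
```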
In the query, use FETCH JOIN to prefetch relationships.
Keep in mind that the resulting rows will be the cross product of all rows selected, which can easily be more work than the N+1 queries.
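A rough way to estimate that trade-off is sketched below. It is a back-of-the-envelope calculation with hypothetical names, and it ignores row width, network latency and caching; it only counts rows for one big outer join against queries for lazy loading.

```java
// Back-of-the-envelope comparison; ignores row width, network latency and caching.
final class FetchCostSketch {

    // Rows returned when every collection of every parent is left-join fetched in one query.
    // collectionSizes[p][r] is the size of relationship r for parent p.
    static long joinedRowCount(int[][] collectionSizes) {
        long rows = 0;
        for (int[] sizesOfOneParent : collectionSizes) {
            long product = 1;
            for (int size : sizesOfOneParent) {
                // An empty collection still yields one row because of the outer join.
                product *= Math.max(1, size);
            }
            rows += product;
        }
        return rows;
    }

    // Queries issued with lazy loading: one for the parents, one per parent and relationship.
    static long lazyQueryCount(int parentCount, int relationshipCount) {
        return 1 + (long) parentCount * relationshipCount;
    }
}
```

For two parents, one with collections of 10 and 5 elements and one with 0 and 3, joinedRowCount(new int[][] {{10, 5}, {0, 3}}) gives 53 rows for only 18 distinct related objects, while lazyQueryCount(2, 2) gives 5 queries.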

Is HibernateCallback best for executing SQL/procedures?

I'm working on a web based application that belongs to an automobile manufacturer, developed in Spring-Hibernate with an MS SQL Server 2005 database.
There are three kind of use cases:
1) Through this application, end users can request the creation of a Car, Bus, Truck etc. through web based interfaces. When a user logs in, an HTML form is displayed for capturing the technical specification of the vehicle. For example, if someone wants to request a Car, he can specify the Engine Make/Model, Tire, Chassis details etc. and submit the form. I'm using Hibernate here for persistence, i.e. I have a Car entity that gets saved in the DB for each such request.
2) This part of the application deals with generation of reports. These reports mainly deal with the number of requests received in a day and the summary. Some of the reports calculate turnaround time for individual Create vehicle requests.
I'm using plain JDBC calls with PreparedStatement (if the report can be generated with SQL), CallableStatement (if the report is complex enough to need a DB procedure/function to fetch all details) and HibernateCallback to execute the SQL/procedures and display the information on screen.
3) Search: This part of the application allows end users to search for various request data, i.e. how many vehicles have been requested in a year etc. I'm using DB procedures with CallableStatement, once again executing these procedures within HibernateCallback, then populating and returning the search result to the GUI in a POJO.
I'm using native SQL in (2) and (3) above because, for reporting/search purposes, the data structure to display on screen does not match any of my entities. For example, the Car entity has more than 100 attributes, but for reporting I don't need more than 10 of them. So I just thought loading all 100 attributes does not make any sense; why not use plain SQL and retrieve just the data needed for display on screen?
Similarly for Search, I had to write procedures/functions because the search algorithm is not straightforward and Hibernate has no way to write a stored-procedure kind of thing.
This is working fine for the prototype, however I would like to know:
a. If my approach of using native SQL and DB procedures is fine for cases 2 and 3, based on my judgement.
b. Also whether executing SQL in HibernateCallback is the correct approach?
Need expert's help.
I would like to know (...) if my approach for using native SQLs and DB procedures are fine for case 2 and 3 based on my judgment
Nothing forces you to use a stored procedure for case 2; you could use HQL and projections, as already pointed out:
select f.id, f.firstName from Foo f where ...
Which would return an Object[] or a List<Object[]> depending on the where condition.
And if you want type safe results, you could use a SELECT NEW expression (assuming you're providing the appropriate constructor):
select new Foo(f.id, f.firstName) from Foo f
And you can even return non entities
select new com.acme.LigthFoo(f.id, f.firstName) from Foo f
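To show what the constructor expression saves you, here is a minimal sketch of the manual alternative: turning the Object[] rows of a plain projection into a small value class. `FooSummary` and `fromRows` are hypothetical names, and the sketch assumes the first column is numeric and the second a string, as in the queries above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical non-entity class holding just the two projected columns.
final class FooSummary {
    private final long id;
    private final String firstName;

    FooSummary(long id, String firstName) {
        this.id = id;
        this.firstName = firstName;
    }

    long getId() {
        return id;
    }

    String getFirstName() {
        return firstName;
    }

    // Manual mapping of projection rows (id, firstName); SELECT NEW does this step for you.
    static List<FooSummary> fromRows(List<Object[]> rows) {
        List<FooSummary> result = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            // Number is used so that Integer and Long id columns both work.
            result.add(new FooSummary(((Number) row[0]).longValue(), (String) row[1]));
        }
        return result;
    }
}
```

With SELECT NEW the casts and the loop disappear, at the price of a fully qualified class name in the query string.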
For case 3, the situation seems different. Just in case, note that the Criteria API is more appropriate than HQL to build dynamic queries. But it looks like this won't help here.
I would like to know (...) whether executing SQLs in HibernateCallback is correct approach?
First of all, there are several restrictions when using stored procedures and I prefer to avoid them when possible. Secondly, if you want to return entities, it is neither the only way nor the simplest solution, as we saw. So for case 2, I would consider using HQL.
For case 3, since you aren't returning entities at all, I would consider not using Hibernate API but the JDBC support from Spring which offers IMHO a cleaner API than Session#connection() and the HibernateCallback.
More interesting readings:
References
Hibernate Core reference guide
14.6. The select clause (about the select new)
16.1.5. Returning non-managed entities (about ResultTransformer)
16.2.2. Using stored procedures for querying
Resources
Hibernate 3.2: Transformers for HQL and SQL
Related questions
hibernate SQLquery extract variable
hibernate query language or using criteria
You should strive to use HQL as much as possible, unless you have a good argument against it (like performance, but do a benchmark first). If the use of native queries becomes excessive, you should consider whether Hibernate has been a good choice.
Note a few things:
You can have native queries and stored procedures that result in Hibernate entities. You just have to map the query / stored procedure call to a class as a named query and call it via session.getNamedQuery(queryName) (session.createSQLQuery(..) takes the SQL string itself rather than a query name).
If you really need to construct native queries at runtime, newer versions of Hibernate have a Session.doWork(..) method, through which you can do JDBC work.
You say
For ex: Car entity has got more than 100 attributes in itself, but for reporting purpose I don't need more than 10 of them.. so I just thought loading all 100 attributes does not make any sense
but HQL in Hibernate allows you to do a projection (select only a subset of the columns). You don't have to pull the entire entity if you don't want to.
Then you get all the benefits of HQL (typing of results, HQL join syntax) but you can pretty much write SQLish code.
See the HQL documentation, in particular the section on the select clause. If you're used to SQL it's pretty easy.
So to answer you directly
a - No, I think you should be using HQL
b - Becomes irrelevant if you go with my suggestion for a.
