Is HibernateCallback best for executing SQL/procedures? - java

I'm working on a web-based application for an automobile manufacturer, developed with Spring and Hibernate on an MS SQL Server 2005 database.
There are three kinds of use cases:
1) Through this application, end users can request the creation of a Car, Bus, Truck etc. through web-based interfaces. When a user logs in, an HTML form is displayed for capturing the technical specification of the vehicle. For example, someone requesting a Car can specify the Engine Make/Model, Tire and Chassis details etc. and submit the form. I'm using Hibernate here for persistence, i.e. I have a Car entity that gets saved in the DB for each such request.
2) This part of the application deals with the generation of reports. These reports mainly deal with the number of requests received in a day and their summary. Some of the reports calculate the turnaround time for individual "create vehicle" requests.
I'm using plain JDBC calls with PreparedStatement (if the report can be generated with plain SQL) or CallableStatement (if the report is complex enough to need a DB procedure/function to fetch all the details), and HibernateCallback to execute the SQL/procedures and display the information on screen.
3) Search: This part of the application allows end users to search the request data, e.g. how many vehicles have been requested in a year. I'm using DB procedures with CallableStatement, once again executing them within HibernateCallback, then populating a POJO with the search result and returning it to the GUI.
I'm using native SQL in (2) and (3) above because, for reporting and search, the data structure displayed on screen does not match any of my entities. For example, the Car entity has more than 100 attributes, but for reporting I don't need more than 10 of them. Loading all 100 attributes did not seem to make sense, so I thought: why not use plain SQL and retrieve just the data needed for display?
Similarly, for Search I had to write procedures/functions because the search algorithm is not straightforward and Hibernate has no way to write a stored-procedure kind of thing.
This is working fine for the prototype; however, I would like to know:
a. Whether my approach of using native SQL and DB procedures is fine for cases 2 and 3, based on my judgement.
b. Also whether executing SQLs in HibernateCallback is correct approach?
Need expert's help.

I would like to know (...) if my approach for using native SQLs and DB procedures are fine for case 2 and 3 based on my judgment
Nothing forces you to use a stored procedure for case 2; you could use HQL and projections, as already pointed out:
select f.id, f.firstName from Foo f where ...
Which would return an Object[] or a List<Object[]> depending on the where condition.
And if you want type safe results, you could use a SELECT NEW expression (assuming you're providing the appropriate constructor):
select new Foo(f.id, f.firstName) from Foo f
And you can even return non-entities:
select new com.acme.LigthFoo(f.id, f.firstName) from Foo f
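To make the difference between the two result shapes concrete, here is a minimal plain-Java sketch that needs no Hibernate on the classpath. The rows are hand-built stand-ins for what a projection query like the ones above would return, and LightFoo, ProjectionRows and toLightFoos are hypothetical names used only for illustration. It shows the casting that a select new expression would otherwise do for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical lightweight DTO; not an entity and not mapped by Hibernate.
final class LightFoo {
    final Long id;
    final String firstName;

    LightFoo(Long id, String firstName) {
        this.id = id;
        this.firstName = firstName;
    }
}

class ProjectionRows {
    // Converts untyped projection rows (one Object[] per row, columns in
    // select-clause order) into typed DTOs.
    static List<LightFoo> toLightFoos(List<Object[]> rows) {
        List<LightFoo> result = new ArrayList<>();
        for (Object[] row : rows) {
            result.add(new LightFoo((Long) row[0], (String) row[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the List<Object[]> a projection query would return.
        List<Object[]> rows = Arrays.asList(
                new Object[] {1L, "Ada"},
                new Object[] {2L, "Grace"});
        for (LightFoo foo : toLightFoos(rows)) {
            System.out.println(foo.id + " " + foo.firstName);
        }
    }
}
```

With select new, the query itself calls the DTO constructor, so the casts and the column-order bookkeeping in toLightFoos disappear from application code.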
For case 3, the situation seems different. Just in case, note that the Criteria API is more appropriate than HQL to build dynamic queries. But it looks like this won't help here.
I would like to know (...) whether executing SQLs in HibernateCallback is correct approach?
First of all, there are several restrictions when using stored procedures, and I prefer to avoid them when possible. Secondly, if you want to return entities, a stored procedure is neither the only way nor the simplest solution, as we saw. So for case 2, I would consider using HQL.
For case 3, since you aren't returning entities at all, I would consider not using Hibernate API but the JDBC support from Spring which offers IMHO a cleaner API than Session#connection() and the HibernateCallback.
More interesting readings:
References
Hibernate Core reference guide
14.6. The select clause (about the select new)
16.1.5. Returning non-managed entities (about ResultTransformer)
16.2.2. Using stored procedures for querying
Resources
Hibernate 3.2: Transformers for HQL and SQL
Related questions
hibernate SQLquery extract variable
hibernate query language or using criteria

You should strive to use HQL as much as possible, unless you have a good argument against it (like performance, but do a benchmark first). If the use of native queries becomes excessive, you should consider whether Hibernate has been a good choice.
Note a few things:
You can have native queries and stored procedures that result in Hibernate entities. You just have to map the query / stored procedure call to a class as a named query and call it by name with session.getNamedQuery(queryName) (session.createSQLQuery(..) takes the SQL text itself rather than a query name).
If you really need to construct native queries at runtime, newer versions of Hibernate have a doWork(..) method on the Session, through which you can do JDBC work.
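As a rough sketch of what such "JDBC work" could look like for the search case, the method below calls a stored procedure through a plain java.sql.Connection and copies the rows into a POJO. The procedure name search_requests, its single parameter, the column names and the RequestSummary class are all assumptions made for illustration; none of them come from the question. Because the method only needs a Connection, it could be invoked from whichever callback hands you one.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical search-result POJO; not a Hibernate entity.
final class RequestSummary {
    final long requestId;
    final String vehicleType;

    RequestSummary(long requestId, String vehicleType) {
        this.requestId = requestId;
        this.vehicleType = vehicleType;
    }
}

class RequestSearchDao {
    // Calls the (assumed) search_requests procedure and maps each row of its
    // result set to a RequestSummary.
    static List<RequestSummary> searchByYear(Connection connection, int year)
            throws SQLException {
        List<RequestSummary> results = new ArrayList<>();
        // JDBC escape syntax for a procedure call; try-with-resources closes
        // the statement and result set even if mapping fails.
        try (CallableStatement call = connection.prepareCall("{call search_requests(?)}")) {
            call.setInt(1, year);
            try (ResultSet rs = call.executeQuery()) {
                while (rs.next()) {
                    results.add(new RequestSummary(
                            rs.getLong("request_id"),
                            rs.getString("vehicle_type")));
                }
            }
        }
        return results;
    }
}
```

Keeping the JDBC code in a method that takes the Connection as a parameter leaves the choice of how the connection is obtained to the caller.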

You say
For ex: Car entity has got more than 100 attributes in itself, but for reporting purpose I don't need more than 10 of them.. so i just thought loading all 100 attributes does not make any sense
but HQL in Hibernate allows you to do a projection (select only a subset of the columns). You don't have to pull the entire entity if you don't want to.
Then you get all the benefits of HQL (typing of results, HQL join syntax) while writing pretty much SQL-like code.
See the HQL documentation, in particular the section on the select clause. If you're used to SQL, it's pretty easy.
So to answer you directly
a - No, I think you should be using HQL
b - Becomes irrelevant if you go with my suggestion for a.

Related

JPA Hibernate query vs Native query

I use Spring Data JPA (Hibernate) generated queries to fetch data from my SQL Server database. I am now running into performance issues in my system.
Load findByLoadId(Integer loadId);
This is the query I am using to get the data. It returns 25 data cells, but I only use 5 of them.
Can I use a direct native query like
select id,date,createdBy,createdOn,loadName from Load where loadId=:loadId
If a native query is advisable, then my question is: does an ORM framework reduce performance by fetching unneeded data from the database?
By "data cell" I assume that you are referring to database table columns, and not to records. The answer to your question is yes: ORM frameworks tend to select every mapped column under the hood (effectively a SELECT *), which can result in unwanted information being sent across the network to your application. If the JPA repository interface is behaving this way, you may switch to either an explicit JPA query (e.g. using the @Query annotation), or even a native query. Then, just select the columns you want. The issue here is that ORM frameworks map object templates (e.g. classes) to entire database tables, so the concept of an entity implicitly includes every database column. If you go with the option of selecting only certain columns, you may need to do some juggling on the Java side. Note that if you use a JPA query, your code would still, in theory, be database independent.

Single line select using string builder or Stored Procedure

I have a lot of single-line select queries in my application, with multiple joins spanning 5-6 tables. These queries are generated with StringBuilders based on many conditions, such as input from a form. However, my team lead, who happens to be a SQL developer, has asked me to convert those single-line queries to stored procedures.
Is there any advantage in moving the single-line select queries to the backend and performing all the if/else logic there in stored procedures?
One advantage of having all your SQL in stored procedures is that you keep your queries in one place, the database, so it would be a lot easier to change or modify them without making a lot of changes in the application layer or front-end layer.
Besides, DBAs or SQL developers could fine-tune the SQL if it is stored in database procedures. You could keep all your functions/stored procedures in a package, which would be better in terms of performance and of organizing your objects (similar to creating packages in Java). And of course, with packages you could restrict direct access to their objects.
This is more a matter of team or department policy on where to keep the SQL, in the front end or in the database itself, and of course, as @Gimby mentioned, many people could have different views.
Update 1
If you have a select statement which returns something use a function, if you have INSERT/UPDATE/DELETE or similar stuff like sending emails or other business rules then use a procedure and call these from front end by passing parameters.
I'm afraid that is a question that will result in many different answers based on many different personal opinions.
It's business logic you are talking about here in any case, and in -my- opinion that belongs in the application layer. But I know a whole club of Oracle devs who wholeheartedly disagree with me.
If you use PreparedStatement in Java, then there is no big difference in performance between Java queries and stored procedures. (If you use Statement in Java, then you have a problem.)
But stored procedures are a good way to organize and reuse your SQL code. You can group them in packages, you can change them without recompiling the Java code, and your DBA or SQL specialist can tune them.

Better option to fetch results from database tables

Is there any performance improvement in calling a procedure which returns a SYS_REFCURSOR compared with running a query directly?
For example
CREATE OR REPLACE PROCEDURE my_proc
(
  p_id number,
  emp_cursor IN OUT SYS_REFCURSOR
)
AS
BEGIN
  OPEN emp_cursor for
    select * from emp where emp_number=p_id;
end;
/
and call the above from Java by registering OUT parameter,pass IN parameter and fetch the results.
Or
From Java get the results from emp table by
preparedStatement = prepareStatement(connection, "select * from emp where emp_number=?", values);
resultSet = preparedStatement.executeQuery();
Which one of the above is a better option to call from Java?
There is no performance difference assuming your prepareStatement method is using the appropriate type for all bind variables. That is, you would need to ensure that you are calling setLong, setDate, setString, etc. depending on the data type of the parameter. If you bind the data incorrectly (i.e. calling setString to bind a numeric value), you may force Oracle to do data type conversion which may prevent the optimizer from using an index that would improve performance.
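A minimal sketch of what "using the appropriate type for all bind variables" could look like on the Java side. TypedBinder and bind are hypothetical names, not part of JDBC, and this is not the prepareStatement helper from the question (whose body is not shown); the setter methods themselves are standard java.sql.PreparedStatement calls.

```java
import java.math.BigDecimal;
import java.sql.Date;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of JDBC: binds each value with the setter
// that matches its Java type instead of falling back to setString.
class TypedBinder {
    static void bind(PreparedStatement ps, Object... values) throws SQLException {
        for (int i = 0; i < values.length; i++) {
            int index = i + 1; // JDBC parameter indexes start at 1
            Object value = values[i];
            if (value instanceof Long) {
                ps.setLong(index, (Long) value);
            } else if (value instanceof Integer) {
                ps.setInt(index, (Integer) value);
            } else if (value instanceof BigDecimal) {
                ps.setBigDecimal(index, (BigDecimal) value);
            } else if (value instanceof Date) {
                ps.setDate(index, (Date) value);
            } else if (value instanceof String) {
                ps.setString(index, (String) value);
            } else {
                // Anything not listed above (including null) is left to the driver.
                ps.setObject(index, value);
            }
        }
    }
}
```

For the emp query above, a call such as TypedBinder.bind(ps, 7369L) would bind the employee number with setLong, so a numeric column is compared with a numeric value and no conversion is forced on the database.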
From a code organization and maintenance standpoint, however, I would rather have the queries in the database rather than in the Java application. If you find that a query is using a poor plan, for example, it's likely to be much easier for a DBA to address the problem if the query is in a stored procedure than if the query is embedded in a Java application. If the query is stored in the database, you can also use the database's dependency tracking functions to more easily do an impact analysis if you need to do something like determine what would be impacted if the emp table needs to change.
Well, I don't think there is a significant difference from the Java invocation standpoint.
Some differences I can think of are:
You will now have to maintain two different code bases: your Java code and your stored procedures. In case of errors, you will have to debug in two different places, and fix problems in two different places.
Once production-ready, making changes to the database is probably going to require some additional formalisms besides those required to change the Java code deployed.
Another important matter to take into account is database-independence, if you are building a product to work with different kinds of databases, you would be forced to write different versions of your stored procedures and you will have more code to maintain (debug, bugfix, change, etc).
This is very important if you're building a product that you intend to deploy in the different environments of different (possibly yet unknown) clients, where you cannot predict which RDBMS will be used.
If you want to use an ORM framework (e.g. Hibernate, EclipseLink), it will generate pretty optimized queries for you. Plus, it would be more difficult to integrate one later on if you use stored procedures.
With a proper amount of logging it is easy to analyze your queries for optimization purposes. You could use JDBC logging or the logging provided by your ORM provider and actually see how the query is being used by the application, how many times, how often, etc., and optimize where it matters.

Hibernate produce different SQL for every query

I've just tested my application under the profiler and found out that sql strings use about 30% of my memory! This is bizarre.
There are a lot of strings like this stored in app memory. These are SQL queries generated by Hibernate; note the different numbers and trailing underscores:
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=? for update
select avatardata0_.Id as Id4347_0_,...... where avatardata0_.Id=? for update
Here is the part I can't understand. Why does hibernate have to generate different sql strings with different identifiers like "Id4305_0_" for each query? Why can't it use one query string for all identical queries? Is this some kind of trick to bypass query caching?
I would greatly appreciate it if someone could explain why this is happening and how to avoid wasting resources like this.
UPDATE
Ok. I found it. I was wrong to assume a memory leak; it was my fault. Hibernate is working as intended.
My app created 121(!) SessionFactories in 10 threads, which produced about 2300 instances of SingleTableEntityPersister. Each SingleTableEntityPersister generates about 15 SQL queries with different identifiers, so Hibernate was forced to generate about 34,500 (2300 × 15) different SQL queries. Everything is fine, nothing weird :)
There is a logic behind the query string that hibernate generates. Its primary aim is to get unique aliases for tables and columns names.
From your query,
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=?
avatardata0_ ==> avatardata is the alias of the table and 0_ is appended to indicate it is the first table in the query. So if it were the second table(or Entity) in the query it should have been shown as avatardata1_. It uses the same logic for the column aliases.
So, this way all the possible conflicts are avoided.
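The naming scheme described above can be sketched in a few lines. This is a simplified illustration only, not Hibernate's actual implementation; the class and method names are made up, and the 10-character limit on the root is an assumption.

```java
import java.util.Locale;

// Simplified illustration of the alias scheme described above; this is not
// Hibernate's actual code, and MAX_ROOT_LENGTH is an assumed value.
class AliasSketch {
    static final int MAX_ROOT_LENGTH = 10;

    // Builds "<lower-cased, truncated name><position>_" for a table or entity.
    static String tableAlias(String entityName, int position) {
        String root = entityName.toLowerCase(Locale.ROOT);
        if (root.length() > MAX_ROOT_LENGTH) {
            root = root.substring(0, MAX_ROOT_LENGTH);
        }
        return root + position + "_";
    }

    public static void main(String[] args) {
        System.out.println(tableAlias("AvatarData", 0)); // first table in the query
        System.out.println(tableAlias("AvatarData", 1)); // same entity joined a second time
    }
}
```

Because the position is part of the alias, the same table can appear twice in one statement without its columns colliding.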
You are seeing these queries because you have turned on the show_sql flag in the configuration. It is intended for debugging queries; once your application is working, you are supposed to turn it off.
Read more on the API docs here.
I am not very familiar with the memory consumption part, but you could repeat your tests with the above flag turned off and see whether there is any improvement.
Assuming you are using SQL Server, you might want to check the parameter type declaration for '?', making sure the declaration results in the same, fixed-length declaration every time.
Dynamic-length parameters would result in a separate execution plan for each query. This could consume a lot of resources. What we see as the same procedure gets interpreted by SQL Server as a different query, producing a separate execution plan.
Thus,
exec myprocedure @p1 varchar(3)='foo'
and
exec myprocedure @p1 varchar(6)='foobar'
would result in different plans, simply because the declarations of @p1 differ in size.
There is a lot to know about this behaviour. If the above applies to you, I would recommend you read up on 'parameter sniffing'.
No. You can write a common query yourself in Hibernate. The logic behind it is to map to the table and fetch the record from there, and the same query is used for every database. Create a common query like this:
Example :
select t.Id as Id4305_0_,...... from t where t.Id=?

Avoiding N+One selects and Invalid results from eclipselink with batch read

I'm trying to cut down the number of N+1 selects incurred by my application. The application uses EclipseLink as its ORM, and in as many places as possible I've tried to add the batch read hint to queries. In a large number of places in the app I don't know in advance exactly which relationships I'll be traversing (my view displays fields based on user preferences). At that point I'd like to run one query to populate all of those relationships for my objects.
My dream is to call something like ReadAllRelationshipsQuery(Collection,RelationshipName) and populate all of these items so that later calls to:
Collection.get(0).getMyStuff will already be populated and not cause a DB query. How can I accomplish this? I'm willing to write any code I need to, but I can't find a way that works with the EclipseLink framework.
Why don't I just batch read all of the possible fields and let them load lazily? What I've found is that the batch value holders that implement batch reads don't behave well with the EclipseLink cache. If a batch read value holder isn't "evaluated" and ends up in the EclipseLink cache, it can become stale and return incorrect data. (This behavior was logged as an EclipseLink bug but rejected...)
edit: I found the link to the bug here: https://bugs.eclipse.org/bugs/show_bug.cgi?id=326197
How do I avoid N+1 selects for objects I already have a reference to?
You have three basic ways to load data into objects from a JPA-based solution:
1) Load dynamically by object traversal (e.g. myObject.getMyCollection().get()).
2) Load graphs of objects by prefetching dynamically using JPA QL (e.g. FETCH JOINs as described at the Oracle JPA tutorial).
3) Load by setting the fetch mode (Is there a way to change the JPA fetch type on a method?).
Each of these has pros and cons.
Loading dynamically by object traversal will generate more (highly targeted) queries. These queries are usually small (not large SQL statements, though they may load lots of data) and tend to play nicely with a second-level cache, but you can get lots and lots of little queries.
Prefetching with JPA QL will give you exactly what you want, but that assumes that you know what you want.
Setting the fetch mode to EAGER will load lots and lots of data for you automatically, but depending on the configuration and usage this may not actually help much (or may make things a lot worse) as you may wind up dragging a LOT of data from the DB into your app that you didn't expect.
Regardless, I highly recommend using p6spy ( http://sourceforge.net/projects/p6spy/ ) in conjunction with any JPA-based application to understand the effects of your tuning.
Unfortunately, JPA makes some things easy and some things hard - mainly, side-effects of your usage. For example, you might fix one problem by setting the fetch mode to eager, and then create another problem where the eager fetch pulls in too much data. EclipseLink does provide tooling to help sort this out ( EclipseLink Performance Tools )
In theory, if you wanted to you could write a generic JavaBean property walker by using something like Apache BeanUtils. Usually just calling a method like size() on a collection is enough to force it to load (although using a collection batch fetch size might complicate things a bit).
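A minimal sketch of such a property walker, using plain java.lang.reflect rather than Apache BeanUtils so that it runs on a bare JDK. DemoOrder is a hypothetical stand-in for an entity and RelationshipToucher is a made-up name; with a real JPA provider the collection returned by the getter would be the lazy one, and calling size() on it is what would trigger the load.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for an entity; in a real application the list would
// be a lazily loaded relationship.
class DemoOrder {
    private final List<String> lines = Arrays.asList("wheel", "engine");

    public List<String> getLines() { return lines; }
    public String getName() { return "demo"; }
}

class RelationshipToucher {
    // Invokes every public no-argument getter that returns a Collection and
    // calls size() on the result. Returns how many collections were touched.
    static int touchCollections(Object bean) {
        int touched = 0;
        for (Method m : bean.getClass().getMethods()) {
            boolean collectionGetter = m.getName().startsWith("get")
                    && m.getParameterCount() == 0
                    && Collection.class.isAssignableFrom(m.getReturnType());
            if (!collectionGetter) {
                continue;
            }
            try {
                Collection<?> value = (Collection<?>) m.invoke(bean);
                if (value != null) {
                    value.size(); // forces a lazy collection to initialize
                    touched++;
                }
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + m.getName(), e);
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        System.out.println(touchCollections(new DemoOrder()));
    }
}
```

Note that walking each object this way still issues one load per untouched relationship, so it only pays off when combined with batch fetching or a warm cache.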
One thing to pay particular attention to is the scope of your session and your use of caches (EclipseLink cache).
Something not clear from your post is the scope of a session. Is a session a one shot affair (e.g. like a web page request) or is it a long running thing (e.g. like a classic client/server GUI app)?
It is very difficult to optimize the retrieval of relationships if you do not know what relationships you require.
If your application is requesting the relationships it wants, then at some level you must know which relationships you require, and you should be able to optimize these in your query for the objects.
For an overview of relationship optimization techniques see,
http://java-persistence-performance.blogspot.com/2010/08/batch-fetching-optimizing-object-graph.html
For batch fetching there are three types: JOIN, EXISTS, and IN. The problem you outlined, of changes to data affecting the original query for cached batched relationships, only applies to JOIN and EXISTS, and only when you have selection criteria based on updateable fields (if the query you are optimizing is on id, or on all instances, you are OK). IN batch fetching does not have this issue, so you can use it for all the relationships.
ReadAllRelationshipsQuery(Collection,RelationshipName)
How about,
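Query query = em.createQuery("Select o from MyObject o where o.id in :ids");
query.setParameter("ids", ids); // the parameter name must be passed as a string
query.setHint("eclipselink.batch", relationship); // e.g. "o.myStuff"
query.setHint("eclipselink.batch.type", "IN"); // use IN batch fetching, as suggested above
List results = query.getResultList();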
If you know all possible relations and the user preferences, why don't you just dynamically build the JPQL string (or Criteria) before executing it?
Like:
String sql = "SELECT u FROM User u"; // use a StringBuilder; this is just for simplicity's sake
if (loadAddress)
{
sql += " LEFT OUTER JOIN u.address as a"; // fetch join and left outer join have the same result in many cases, except that with left outer join you could load associations of address as well
}
...
Edit: Since the result would be a cross product, you should then iterate over the entities and remove duplicates.
In the query, use FETCH JOIN to prefetch relationships.
Keep in mind that the resulting rows will be the cross product of all rows selected, which can easily be more work than the N+1 queries.
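The two suggestions above (build the JPQL dynamically, then remove the duplicates that the joins produce) can be sketched without a JPA provider. DynamicFetchQuery, buildJpql and distinct are hypothetical names, and the sketch uses LEFT JOIN FETCH for every relationship, which is one possible choice rather than the plain LEFT OUTER JOIN shown earlier.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

class DynamicFetchQuery {
    // Builds a JPQL string that fetch-joins only the requested relationships.
    // Relationship names should come from a fixed list of known attributes,
    // never from raw user input, because they are concatenated into the query.
    static String buildJpql(String entity, String alias, List<String> relationships) {
        StringBuilder jpql = new StringBuilder("SELECT ").append(alias)
                .append(" FROM ").append(entity).append(' ').append(alias);
        for (String relationship : relationships) {
            jpql.append(" LEFT JOIN FETCH ").append(alias).append('.').append(relationship);
        }
        return jpql.toString();
    }

    // Removes repeated root objects produced by the joins while keeping the
    // original order; relies on equals()/hashCode() of the result objects.
    static <T> List<T> distinct(List<T> results) {
        return new ArrayList<>(new LinkedHashSet<>(results));
    }
}
```

For example, buildJpql("User", "u", Arrays.asList("address")) yields a query that could be passed to em.createQuery(..), and the list it returns could then be passed through distinct(..). As noted above, each additional fetch join multiplies the number of rows, so it is worth measuring against the N+1 version.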
