We are using Hibernate as the ORM in our project, along with the Spring framework. Today I ran into an issue where a named query (a simple select) returns inconsistent results when called multiple times. The query is not executed in a loop; it is triggered from the front end. When I perform the same operation several times from the front end, the query sometimes fetches the correct data from the DB and sometimes does not.
Example:(Sample code)
// Named query, roughly: select debitid from ABCMstrEntity where entitynum = :entitynum and isopen = :Y
Query query = session.getNamedQuery(........);
query.setParameter(........);
..
List<Object[]> list = (List<Object[]>) query.list();
In the first 4 attempts, I got the correct data in the list.
On my 5th attempt, I got an empty list even though the data was present in the DB for the given inputs. When I checked the query logs on the DB side, I found that there was no hit against the DB at all for my 5th attempt. So something seems to have gone wrong here.
I also checked the cache-related settings for Hibernate in our project, but we are not caching query results in any of the cache regions. There were also no exceptions in the application logs.
Could someone please help me analyze and fix this issue?
For my website, I'm creating a book database. I have a catalog with a root node; each node has subnodes, each subnode has documents, each document has versions, and each version is made of several paragraphs.
In order to create this database as fast as possible, I first build the entire tree model in memory and then call session.save(rootNode).
This single save populates my entire database (at the end, when I do a mysqldump of the database, it weighs 1 GB).
The save costs a lot (more than an hour), and since the database grows with new books and new versions of existing books, it costs more and more over time. I would like to optimize this save.
I've tried increasing batch_size, but it changes nothing since it's a single save. When I take a mysqldump and insert the script back into MySQL, the operation takes 2 minutes or less.
And when I run "htop" on the Ubuntu machine, I can see that MySQL is only using 2 or 3 % CPU, which means it's Hibernate that is slow.
If someone could suggest possible techniques I could try, or possible leads, that would be great. I already know some of the reasons why it takes so long; if someone wants to discuss them with me, thanks in advance for the help.
Here are some of my problems (I think): For example, I have self-assigned IDs for most of my entities. Because of that, Hibernate checks each time whether the row already exists before it saves it. I don't need this, because the batch I'm executing runs only once, when I create the database from scratch. The best thing would be to tell Hibernate to ignore the primary key checks (like mysqldump does) and re-enable key checking once the database has been created. It's just a one-shot batch to initialize my database.
The second problem would again be about the foreign keys. Hibernate inserts rows with null values and then issues an update to make the foreign keys work.
About using another technology: I would like to make this batch work with Hibernate, because the rest of my website works very well with Hibernate, and if Hibernate creates the database I can be sure the naming rules and all the foreign keys are created correctly.
Finally, it's a read-only database. (I have a user database, which uses InnoDB, where I do updates and inserts while the website is running, but the document database is read-only and MyISAM.)
Here is an example of what I'm doing:
TreeNode rootNode = new TreeNode();
recursiveLoadSubNodes(rootNode); // This method creates my big tree, in memory only.
hibernateSession.beginTransaction();
hibernateSession.save(rootNode); // takes more than an hour to save 1 GB of data: hundreds of sub tree nodes, thousands of documents, tens of thousands of paragraphs
hibernateSession.getTransaction().commit();
It's a little hard to guess what the problem could be here, but I can think of 3 things:
Increasing batch_size alone might not help because - depending on your model - inserts might be interleaved (i.e. A B A B ...). You can allow Hibernate to reorder inserts and updates so that they can be batched (i.e. A A ... B B ...). Depending on your model this might not work because the inserts might not be batchable. The necessary properties are hibernate.order_inserts and hibernate.order_updates, and a blog post that describes the situation can be found here: https://vladmihalcea.com/how-to-batch-insert-and-update-statements-with-hibernate/
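For illustration, one way to set those properties programmatically might look like this (the batch size of 50 and the bootstrap style are just assumptions; the same keys can also go into hibernate.cfg.xml or persistence.xml):
Configuration cfg = new Configuration().configure();
cfg.setProperty("hibernate.jdbc.batch_size", "50");   // example value, tune for your model
cfg.setProperty("hibernate.order_inserts", "true");   // group inserts of the same entity type so they batch
cfg.setProperty("hibernate.order_updates", "true");   // same for updates
SessionFactory sessionFactory = cfg.buildSessionFactory();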
If the entities don't already exist (which seems to be the case), then the problem might be the first-level cache. This cache will make Hibernate slower and slower, because each time it wants to flush changes it checks all entries in the cache by iterating over them and calling equals() (or something similar). As you can see, that takes longer with each new entity that's created. To fix that, you could either try to disable the first-level cache (I'd have to look up whether that's possible for write operations and how it's done - or you can look that up :) ) or try to keep the cache small, e.g. by inserting the books yourself and evicting each book from the first-level cache after the insert (you could also go deeper and do that at the document or paragraph level).
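A minimal sketch of that flush-and-evict pattern (Book/books are hypothetical names; the interval of 50 would ideally match hibernate.jdbc.batch_size):
int count = 0;
for (Book book : books) {
    session.save(book);
    if (++count % 50 == 0) {
        session.flush();  // push the pending inserts to the database
        session.clear();  // evict everything from the first-level cache
    }
}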
It might not actually be Hibernate (or at least not alone) but your DB as well. Note that restoring dumps often removes/disables constraint checks and indices along with other optimizations so comparing that with Hibernate isn't that useful. What you'd need to do is create a bunch of insert statements and then just execute those - ideally via a JDBC batch - on an empty database but with all constraints and indices enabled. That would provide a more accurate benchmark.
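A rough sketch of such a benchmark using a plain JDBC batch (the table, columns and Paragraph accessors are made up for illustration):
try (Connection con = DriverManager.getConnection(jdbcUrl, user, password);
     PreparedStatement ps = con.prepareStatement(
             "insert into paragraph (id, document_id, content) values (?, ?, ?)")) {
    con.setAutoCommit(false);
    for (Paragraph p : paragraphs) {
        ps.setLong(1, p.getId());
        ps.setLong(2, p.getDocumentId());
        ps.setString(3, p.getContent());
        ps.addBatch();
    }
    ps.executeBatch();   // send all inserts in one batch (driver permitting)
    con.commit();
}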
Assuming that comparison shows that the plain SQL insert isn't that much faster, you could decide to either keep what you have so far or refactor your batch insert to temporarily disable (or remove and re-create) constraints and indices.
Alternatively you could try not to use Hibernate at all or change your model - if that's possible given your requirements which I don't know. That means you could try to generate and execute the SQL queries yourself, use a NoSQL database or NoSQL storage in a SQL database that supports it - like Postgres.
We're doing something similar, i.e. we have Hibernate entities that contain some complex data which is stored in a JSONB column. Hibernate can read and write that column via a custom UserType, but it can't filter on it (Postgres would support that, but we didn't manage to enable the necessary syntax in Hibernate).
I have a Java web application where I use Hibernate and MySQL.
On one page I have a search form; on submit I create an HQL query to fetch the results.
I used .setFirstResult() and .setMaxResults() for paging. This works fine, but not when my result set contains a lot of records...
I thought .setMaxResults() would use LIMIT in the actual SQL query, but it appears it doesn't.
So it loads the whole result into memory and only then applies the value I set in .setMaxResults(), which results in memory problems.
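For reference, this is roughly the pattern I'm using (the entity and parameter names here are just placeholders):
Query query = session.createQuery("from SearchResult r where r.title like :term order by r.id");
query.setParameter("term", "%" + term + "%");
query.setFirstResult(page * pageSize);   // offset
query.setMaxResults(pageSize);           // I expected this to turn into a LIMIT clause
List<?> results = query.list();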
Is this how it works, or am I just doing something wrong?
If so, this is just not usable for me.
ps:
With the Criteria API it does work, but there I had some other limitations when creating the criteria. That's why I tried to use HQL instead, and now I'm facing this problem.
So am I right in my observations? Is there another solution?
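For completeness, the Criteria version that does page correctly looks roughly like this (same placeholder names as above):
Criteria criteria = session.createCriteria(SearchResult.class);
criteria.add(Restrictions.like("title", "%" + term + "%"));
criteria.setFirstResult(page * pageSize);
criteria.setMaxResults(pageSize);
List<?> results = criteria.list();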
I've just tested my application under a profiler and found out that SQL strings use about 30% of my memory! This is bizarre.
There are a lot of strings like this stored in app memory. These are SQL queries generated by Hibernate; note the different numbers and trailing underscores:
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=? for update
select avatardata0_.Id as Id4347_0_,...... where avatardata0_.Id=? for update
Here is the part I can't understand: why does Hibernate have to generate different SQL strings with different identifiers like "Id4305_0_" for each query? Why can't it use one query string for all identical queries? Is this some kind of trick to bypass query caching?
I would greatly appreciate it if someone could explain why this is happening and how to avoid wasting resources like this.
UPDATE
OK, I found it. I was wrong in assuming a memory leak; it was my fault. Hibernate is working as intended.
My app created 121(!) SessionFactories in 10 threads, and they produced about 2300 instances of SingleTableEntityPersister. Each SingleTableEntityPersister generates about 15 SQL queries with different identifiers, so Hibernate was forced to generate about 345.000 different SQL strings. Everything is fine, nothing weird :)
There is logic behind the query strings that Hibernate generates. The primary aim is to get unique aliases for table and column names.
From your query,
select avatardata0_.Id as Id4305_0_,...... where avatardata0_.Id=?
avatardata0_ ==> avatardata is the alias of the table, and 0_ is appended to indicate it is the first table in the query. So if it were the second table (or entity) in the query, it would be shown as avatardata1_. The same logic is used for the column aliases.
This way, all possible conflicts are avoided.
You are seeing these queries because you have turned on the show_sql flag in the configuration. This is intended for debugging queries. Once your application is working, you are supposed to turn it off.
Read more on the API docs here.
I am not that familiar with the memory consumption part, but you can repeat your tests with the above flag turned off and see if there is any improvement.
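For reference, a minimal sketch of switching the flag off programmatically (the same key can also be set in hibernate.cfg.xml or hibernate.properties):
Configuration cfg = new Configuration().configure();
cfg.setProperty("hibernate.show_sql", "false");   // stop echoing every generated statement
SessionFactory sessionFactory = cfg.buildSessionFactory();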
Assuming you are using SQL Server, you might want to check the parameter type declarations for '?', making sure the declaration results in the same, fixed-length declaration every time.
Dynamic-length parameters result in separate execution plans for each query, which can consume a lot of resources. What we see as the same procedure gets interpreted by SQL Server as a different query, rendering a separate execution plan.
Thus,
exec myprocedure @p1 varchar(3)='foo'
and
exec myprocedure @p1 varchar(6)='foobar'
would result in different plans, simply because the declarations of @p1 differ in size.
There is a lot to know about this behaviour. If the above applies to you, I would recommend you read up on 'parameter sniffing'.
No... you can get Hibernate to use a common query that you define yourself. The logic behind it is the mapping to the table, and the records are fetched from there; the same common query can then be used for all the databases. Create a common query like this:
Example :
select t.Id as Id4305_0_,...... from t where t.Id=?
I'm trying to find the root cause of a failure in an existing system. I don't know much about it, but it looks like the issue is in inserting a big row into PostgreSQL via Hibernate.
It fails to insert a record with a TEXT field of roughly 50-100 KB.
That should not be an issue for PostgreSQL itself, but I guess there might be some settings/parameters in Hibernate that can affect it. Any suggestions on where to look?
First, I would look at the exception, whether it's on your local machine or in a server log, to get more clues. Since you say it happens when inserting a row, maybe you already know where it occurs. Try inserting a row where the text field has only a few bytes to see if that works. Maybe the connection is slow and inserting more than 50k causes a timeout followed by a rollback.
Also check whether that insertion belongs to a much larger transaction or is executed in a smaller one.
Try doing that insertion in plain JDBC (just temporarily) to see if it works and to rule out connection issues.
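A throwaway check might look like this (the table and column names are assumptions, and the generated string just simulates the 50-100 KB text):
String bigText = new String(new char[100000]).replace('\0', 'x'); // ~100 KB payload
try (Connection con = DriverManager.getConnection(jdbcUrl, user, password);
     PreparedStatement ps = con.prepareStatement("insert into document (id, body) values (?, ?)")) {
    ps.setLong(1, 1L);
    ps.setString(2, bigText);
    ps.executeUpdate();
}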
If the problem is not in the connection, then you can start tweaking Hibernate parameters, maybe disabling the second-level cache. The exception stack trace or a debugging session will help you figure out which parameters to change.
I'm having trouble retrieving data from my database using Spring JDBC. Here's my issue:
I have a getData() method on my DAO which is supposed to return ONE row from the result of some select statement. When invoked again, getData() should return the second row, in a FIFO-like manner. I'm aiming to have only one result in memory at a time, since my table will potentially get huge in the future and bringing everything into memory would be a disaster.
If I were using regular JDBC code with a ResultSet, I could set its fetch size to 1 and everything would be fine. However, I recently found out that Spring JDBC operations via the JdbcTemplate object don't let me achieve such behaviour (as far as I know... I'm not really knowledgeable about the Spring framework's features). I've heard of the RowCallbackHandler interface, and a post on JavaRanch said I could somehow expose the result set to be used later (though with that method it stores the result set as many times over as there are rows, which is pretty dumb).
I have been playing with implementing the RowCallbackHandler interface for a day now and I still can't find a way to get it to retrieve one row from my select at a time. If anyone could enlighten me on this matter, I'd greatly appreciate it.
JdbcTemplate.setFetchSize(int fetchSize):
Set the fetch size for this JdbcTemplate. This is important for processing large result sets: Setting this higher than the default value will increase processing speed at the cost of memory consumption; setting this lower can avoid transferring row data that will never be read by the application.
Default is 0, indicating to use the JDBC driver's default.
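A small sketch of combining setFetchSize with a RowCallbackHandler so rows are processed as they stream in (the table, columns and dataSource are assumptions; note that with the MySQL Connector/J driver, true row-by-row streaming usually requires a fetch size of Integer.MIN_VALUE):
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
jdbcTemplate.setFetchSize(1); // hint to the driver to fetch rows in small chunks
jdbcTemplate.query("select id, payload from records where status = 'NEW'",
        (RowCallbackHandler) rs -> {
            long id = rs.getLong("id");
            String payload = rs.getString("payload");
            // process one row here; nothing else needs to stay in memory
        });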
After a lot of searching and consulting with the rest of my team, we came to the conclusion that this is not the best implementation path for our project. As Boris suggested, a different approach is the way to go. However, I'm doing something slightly different: I'm using SimpleJdbcTemplate instead and splitting my query so it fits into memory better. A "status" field in my records table is responsible for telling whether a record was successfully processed or read, so I know which records to fetch next.
The question of whether Spring JDBC is capable of the behaviour I mentioned in my OP is, however, still open. If anyone has an answer to that question, I'm sure it would help someone else out there.
Cheers!
You can take a different approach. Create a query which returns just the IDs of the rows you want to read, and keep this collection of IDs in memory - you would need a really huge data set for that alone to consume a lot of memory. Then iterate over it and load the rows one by one, each referenced by its ID.
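A minimal sketch of that approach (the table and column names are illustrative):
List<Long> ids = jdbcTemplate.queryForList("select id from records where status = 'NEW'", Long.class);
for (Long id : ids) {
    Map<String, Object> row = jdbcTemplate.queryForMap("select * from records where id = ?", id);
    // process a single row here, then let it go out of scope
}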
We have the same issue:
- Testing fetchSize with a raw JDBC PreparedStatement works as expected: when we stop the DB after one fetchSize worth of records has been fetched, a JDBC connection error is thrown as soon as resultSet.next() runs.
- Testing fetchSize with JdbcTemplate:
PreparedStatementSetter preparedStatementSetter = ps -> ps.setFetchSize(_exportParams.getFetchSize());
RowCallbackHandler rowCallbackHandler = _rs -> { /* process each row here */ };
this.jdbcTemplate.query(_exportParams.getSqlscript(), preparedStatementSetter, rowCallbackHandler);
After getting the first record, we stop Postgres. The row callback handler can still process the rest of the records without any error.