big insert query fails Hibernate/PostgreSQL - java

I'm trying to find the root cause of a failure in an existing system. I don't know much about it, but it looks like the issue is in inserting a big row into PostgreSQL via Hibernate.
It fails to insert a record with a TEXT field of about 50-100k in size.
That should not be an issue for PostgreSQL itself, but I guess there might be some settings/parameters in Hibernate which can affect it. Any suggestions for a search direction?

First, I'd look at the exception, whether it appears on your local machine or in a server log, to get more clues. Since you say it happens when inserting a row, maybe you know where it's occurring. Try inserting a row whose text field has only a few bytes to see if that works. Maybe the connection is slow, and inserting more than 50k causes a timeout followed by a rollback.
Also check whether that insertion belongs to a much larger transaction or executes in a smaller one of its own.
Try doing that insertion in plain JDBC (just temporarily) to see if that works and to rule out connection issues.
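A throwaway test along these lines might look like the following; it's only a sketch, and the connection URL, credentials, and table/column names are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BigTextInsertTest {
    public static void main(String[] args) throws Exception {
        // Build a ~100k payload to mimic the failing row.
        StringBuilder payload = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            payload.append('x');
        }

        // URL, credentials, and table/column names are placeholders.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/mydb", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO my_table (id, big_text) VALUES (?, ?)")) {
            ps.setLong(1, 1L);
            ps.setString(2, payload.toString());
            ps.executeUpdate(); // success here points the finger at the Hibernate layer
        }
    }
}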
If the problem is not in the connection, then you can start tweaking Hibernate parameters, maybe disabling the second-level cache. The exception stack trace or a debugging session will help you know which parameters to change.
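If you do end up suspecting the second-level cache, it can be switched off with standard Hibernate properties; a minimal sketch, assuming a programmatically built SessionFactory:

import java.util.Properties;
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Turn off second-level and query caching to take them out of the equation.
Properties props = new Properties();
props.setProperty("hibernate.cache.use_second_level_cache", "false");
props.setProperty("hibernate.cache.use_query_cache", "false");

SessionFactory sessionFactory = new Configuration()
        .configure()          // loads hibernate.cfg.xml
        .addProperties(props) // overrides the cache settings
        .buildSessionFactory();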

Related

ScrollableResults with Hibernate/Oracle pulling everything into memory

I want a page of filtered data from an Oracle database table, but I have a query that might return tens of millions of records, so it's not feasible to pull it all into memory. I need to filter records out in a way that cannot be done via SQL, and return back a page of records. In other words, the pagination part must be done after the filtering.
So, I attempted to use Hibernate's ScrollableResults, thinking it would be a way to pull in only chunks at a time and iterate through them. So, I created it:
ScrollableResults results = query.setReadOnly(true)
        .setFetchSize(500)
        .setCacheable(false)
        .scroll();
... and yet, it appears to pull everything into memory (2.5GB pulled in per query). I've seen another question and I've tried some of the suggestions, but most seem MySQL specific, and I'm using an Oracle 19 driver (e.g. Integer.MIN_VALUE is rejected outright as a fetch size in the Oracle driver).
There was a suggestion to use a stateless session (I'm using the EntityManager which has no stateless option), but my thought is that if we don't fetch many records (because we only want the first page of 200 filtered records), why would Hibernate have millions of records in memory anyway, even though we never scrolled over them?
It's clear to me that I don't understand how/why Hibernate pulls things into memory, or how to get it to stop doing so. Any suggestions on how to prevent it from doing so, given the constraints above?
Some things I'm going to try:
Different scroll modes. Maybe insensitive or forward-only prevents Hibernate's need to pull everything in?
Clearing the session after we have our page. I'm closing the session (both using close() in the ScrollableResults and the EntityManager), but maybe an explicit clear() will help?
We were scrolling through the entire ScrollableResults to get the total count. This caused two things:
The Hibernate session cached entities.
The ResultSet in the driver kept rows that it had scrolled past.
Fixing this is specific to my case, really, but I did two things:
As we scroll, periodically clear the Hibernate session. Since we use the EntityManager, I had to do entityManager.unwrap(Session.class).clear(). Not sure if entityManager.clear() would do the job or not.
Make the ScrollableResults forward-only so the Oracle driver doesn't have to keep records in memory as it scrolls. This was as simple as doing .scroll(ScrollMode.FORWARD_ONLY). Only possible since we're only moving forward, though.
This allowed us to maintain a smaller memory footprint, even while scrolling through literally every single record (tens of millions).
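Put together, the loop looks roughly like this (a sketch against the Hibernate 5-style API; entityManager and query come from the question, and the clear interval of 500 is arbitrary):

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

Session session = entityManager.unwrap(Session.class);
ScrollableResults results = query.setReadOnly(true)
        .setFetchSize(500)
        .setCacheable(false)
        .scroll(ScrollMode.FORWARD_ONLY); // forward-only, so the driver can discard passed rows

int count = 0;
while (results.next()) {
    Object entity = results.get(0); // first projection/entity in the row
    // ... apply the in-memory filter and collect the page here ...
    if (++count % 500 == 0) {
        session.clear(); // drop entities accumulated in the first-level cache
    }
}
results.close();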
Why would you scroll through all results just to get the count? Why not just execute a count query?
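For what it's worth, a count query along these lines avoids scrolling entirely (the entity name is a placeholder):

Long total = (Long) session.createQuery(
        "select count(e) from MyEntity e").uniqueResult();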

hibernate named query (select)

We are using Hibernate as an ORM in our project along with the Spring framework. Today I faced an issue where a named query (basically a select query) gives inconsistent results when called multiple times. The select query is not invoked in a loop; it is triggered from the front end. So when I perform the same operation multiple times from the front end, sometimes the query fetches the correct data from the DB and sometimes it does not.
Example:(Sample code)
Query query = session.getNamedQuery("select debitid from ABCMstrEntity where entitynum = :entitynum and isopen = :Y");
query.set(........);
..
List<Object[]> list = (List<Object[]>) query.list();
In the first 4 attempts, I got the correct data in the list object.
In my 5th attempt, I got an empty list although data was present in the DB for the provided inputs. When I checked the query logs at the DB end, I found that there was no hit on the DB for my 5th attempt. So it seems something has gone wrong here.
I also checked the cache-related Hibernate settings in my project, but we are not caching query results in any of the cache regions. Also, there were no exceptions in the application logs.
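One quick check, in case a cache region is being hit anyway: Hibernate lets a single query be marked non-cacheable, which rules out stale query-cache results for that call. A sketch, with a hypothetical query name:

Query query = session.getNamedQuery("findOpenDebits"); // hypothetical name
query.setCacheable(false); // bypass the query cache for this execution
List<Object[]> list = (List<Object[]>) query.list();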
Could someone please help me analyze and fix this issue?

Table rows seem to be disappearing

I have a ton of raw HTML files that I'm parsing and inserting into a MySQL database via a connection in Java.
I'm using "REPLACE INTO" statements and this method:
public void migrate(SomeThread thread) throws Exception {
    PreparedStatement threadStatement = SQL.prepareStatement(threadQuery);
    thread.prepareThreadStatement(threadStatement);
    threadStatement.executeUpdate();
    threadStatement.close();

    for (SomeThread.Post P : thread.threadPosts) {
        PreparedStatement postStatement = SQL.prepareStatement(postQuery);
        P.preparePostStatement(postStatement);
        postStatement.executeUpdate();
        postStatement.close();
    }
}
I am running 3 separate instances of my program each in its own command prompt, with their own separate directory of htmls to parse and commit.
I'm using HeidiSQL to monitor the database and a funny thing is happening where I'll see that I have 500,000 rows in a table at one point for example, then I'll close HeidiSQL and check back later to find that I now have 440,000 rows. The same thing occurs for the two tables that I'm using.
Both of my tables use a primary key called "id"; each table's IDs have their own domain, but could their values overlap and overwrite each other? I'm not sure if this could be an issue, because I'd think SQL would differentiate between each table's "local" id values.
Otherwise I was thinking it could be that, since I'm running 3 separate instances that each have their own connection to the DB, something is happening where, right as one row is being committed, execution swaps to another commit statement, disrupts the table, then swaps back to the first commit, and somehow that causes the database to roll back some of the rows.
I'm pretty new to SQL, so I'm not too sure where to start. If somebody has an idea about what the heck is going on and could point me in the right direction, I'd really appreciate it.
Thanks
You might want to use INSERT INTO instead of REPLACE INTO.
Data doesn't disappear.
Here are some tips:
Do you have another thread running that actually deletes entries?
Do other people have access to the database?
Not sure what HeidiSQL may do. To exclude that possibility maybe use MySQL Workbench instead.
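To take GUI estimates out of the picture entirely, an exact count can be run from Java over the importer's own connection; a minimal sketch, where SQL is the Connection from the question's code and the table name is a placeholder:

import java.sql.ResultSet;
import java.sql.Statement;

// Exact row count straight from the server, independent of any GUI estimate.
try (Statement stmt = SQL.createStatement();
     ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM threads")) {
    if (rs.next()) {
        System.out.println("threads rows: " + rs.getLong(1));
    }
}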
Yeah, now that I run a COUNT(*) query against my tables I see that all my rows are in fact there.
Most likely the HeidiSQL summary page is just a very rough estimate.
Thanks for the suggestion to use Workbench, Pete. I will try it and see if it is better than Heidi, as Heidi is freezing up on me on a regular basis.

Can an Oracle Sequence ever be null?

I'm running a Java application with Spring and I am getting an error on one of my insert statements. My error is:
nested exception is java.sql.SQLIntegrityConstraintViolationException:
ORA-01400: cannot insert NULL into ("MY_SCHEMA"."VALIDATION_RESULT"."RESULT_SEQ")
For all the database guys: is there ever a scenario in which Oracle would return null from a nextval call? What about if multiple threads are calling it simultaneously?
For any Spring developers, we're using
org.springframework.jdbc.support.incrementer.OracleSequenceMaxValueIncrementer
to handle the sequence. We use the nextLongValue method.
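For context, the incrementer is wired roughly like this (a sketch; the DataSource comes from our Spring context and the sequence name is illustrative):

import javax.sql.DataSource;
import org.springframework.jdbc.support.incrementer.OracleSequenceMaxValueIncrementer;

// The incrementer effectively runs SELECT <sequence>.NEXTVAL FROM DUAL.
OracleSequenceMaxValueIncrementer incrementer =
        new OracleSequenceMaxValueIncrementer(dataSource, "RESULT_SEQ");
long id = incrementer.nextLongValue();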
My gut here is telling me that Oracle isn't giving me a null nextval. From everything I've already searched for, that seems impossible. Can anyone confirm?
Confirmed. They do not return NULL. You get an error message.
Oracle sequences actually generate a block of nextval values so threads can access them quickly. You can alter a sequence to create a larger read-ahead number of values if it is a performance obstacle. The only possibility is if Oracle is seriously broken; get your DBA to look in the alert log. ORA-06nn errors are a DBA's nightmare and are the only thing I am aware of that actually breaks objects like sequences.
In this case the DB, and probably the DBA too, are close to DOA. This kind of thing happens once in a career.
I would suspect your code first. Or someone tinkering with the sequences, like doing something stupid with ALTER SEQUENCE, i.e., restarting the sequence from one and breaking table constraints. It is also easy to get things screwed up sequence-wise when you export only a table from database DEV and import it into database TEST, because the other metadata needs to be brought over as well.

spring jdbc RowCallbackHandler nightmare

I'm having trouble retrieving data from my database using Spring Jdbc. Here's my issue:
I have a getData() method on my DAO which is supposed to return ONE row from the result of some select statement. When invoked again, the getData() method should return the second row, in a FIFO-like manner. I'm aiming to have only one result in memory at a time, since my table will potentially get huge in the future and bringing everything into memory would be a disaster.
If I were using regular JDBC code with a result set, I could set its fetch size to 1 and everything would be fine. However, I recently found out that Spring JDBC operations via the JdbcTemplate object don't let me achieve such behaviour (as far as I know; I'm not really knowledgeable about the Spring framework's features). I've heard of the RowCallbackHandler interface, and this post on the Java Ranch said I could somehow expose the result set to be used later (though using this method it stores the result set as many times over as there are rows, which is pretty dumb).
I have been playing with implementing the RowCallbackHandler interface for a day now and I still can't find a way to get it to retrieve one row from my select at a time. If anyone could enlighten me on this matter, I'd greatly appreciate it.
JdbcTemplate.setFetchSize(int fetchSize):
Set the fetch size for this JdbcTemplate. This is important for processing large result sets: Setting this higher than the default value will increase processing speed at the cost of memory consumption; setting this lower can avoid transferring row data that will never be read by the application.
Default is 0, indicating to use the JDBC driver's default.
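A minimal sketch of combining this with a RowCallbackHandler (the table, columns, and process method are placeholders; whether rows are truly streamed one at a time still depends on the JDBC driver):

import org.springframework.jdbc.core.JdbcTemplate;

JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
jdbcTemplate.setFetchSize(1); // ask the driver to transfer one row at a time

// Rows are pushed to the callback as the driver fetches them, so only the
// current row needs to live in memory (driver support permitting).
jdbcTemplate.query("SELECT id, payload FROM records", rs ->
        process(rs.getLong("id"), rs.getString("payload")));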
After a lot of searching and consulting with the rest of my team, we have come to the conclusion that this is not the best implementation path for our project. As Boris suggested, a different approach is the way to go. However, I'm doing something slightly different: I'm using SimpleJdbcTemplate instead and splitting my query so it fits in memory better. A "status" field in my records table will be responsible for telling whether the record was successfully processed or read, so I know which records to fetch next.
The question of whether Spring JDBC is capable of the behaviour I mentioned in my OP is, however, still in the air. If anyone has an answer to that question, I'm sure it would help someone else out there.
Cheers!
You can take a different approach. Create a query which returns just the IDs of the rows that you want to read, and keep this collection of IDs in memory. You'd need a really huge data set for that to consume a lot of memory. Then iterate over it and load the rows one by one, each referenced by its ID.
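A sketch of that two-step approach (the records table and process method are placeholders, and the varargs queryForList/queryForMap signatures assume a reasonably recent Spring version):

import java.util.List;
import java.util.Map;

// Step 1: load only the IDs, which are cheap to keep in memory.
List<Long> ids = jdbcTemplate.queryForList("SELECT id FROM records", Long.class);

// Step 2: load and process one full row at a time.
for (Long id : ids) {
    Map<String, Object> row = jdbcTemplate.queryForMap(
            "SELECT * FROM records WHERE id = ?", id);
    process(row);
}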
We have the same issue:
- Testing fetching fetchSize records with a raw JDBC PreparedStatement works as expected: when we stop the DB after one fetchSize batch of records has been fetched, a JDBC connection error is thrown as soon as resultSet.next() runs.
- Testing fetchSize with JdbcTemplate:
PreparedStatementSetter preparedStatementSetter = ps -> ps.setFetchSize(_exportParams.getFetchSize());
RowCallbackHandler rowCallbackHandler = _rs -> { /* do something here */ };
this.jdbcTemplate.query(_exportParams.getSqlscript(), preparedStatementSetter, rowCallbackHandler);
After getting the first record, we stop Postgres. The row callback handler can still process the rest of the records without error, which suggests the whole result set had already been pulled into memory.
