Loading million rows in hibernate - java

If I want to fetch a million rows in Hibernate, how would it work? Will Hibernate crash? How can I optimize that?

Typically you wouldn't use Hibernate for this. If you need to do a batch operation, use SQL or the Hibernate wrappers for batch operations. There is no way loading millions of records into memory is going to end well for your application. Your app will thrash as the GC runs, or possibly crash. There has to be another option.

If you read one row and write one row at a time, it will probably work fine. Are you sure this is the way you want to read 1,000,000 rows? It will likely take a while.
If you want all the objects to be in memory at the same time, you might well be challenged.
You can optimize it best, probably, by finding a different way. For example, you can dump from the database using database tools much more quickly than reading with hibernate.
You can select sums, maxes, and counts in the database without returning a million rows over the network.
What are you trying to accomplish, exactly?

For this you would be better off using Spring's JDBC tools with a row handler. It will run the query and then perform some action one row at a time.
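For illustration, a row-at-a-time read with Spring's JdbcTemplate and a RowCallbackHandler might look roughly like this (the table and column names are made up):

import javax.sql.DataSource;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowCallbackHandler;

public class RowByRowProcessor {

    private final JdbcTemplate jdbcTemplate;

    public RowByRowProcessor(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
        // Stream results instead of buffering the whole result set in memory.
        this.jdbcTemplate.setFetchSize(500);
    }

    public void processAll() {
        // The callback is invoked once per row; rows are never collected into a list.
        jdbcTemplate.query("select id, name from big_table", new RowCallbackHandler() {
            public void processRow(ResultSet rs) throws SQLException {
                long id = rs.getLong("id");
                String name = rs.getString("name");
                // handle one row at a time here
            }
        });
    }
}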

Bring only the columns you need. Try it out in a test environment.

You should try looking at the StatelessSession interface, an example of which can be found here:
http://mrmcgeek.blogspot.com/2010/09/bulk-data-loading-with-hibernate.html
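As a rough sketch of that approach (the Customer entity and the HQL are just placeholders):

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;

public class BulkReader {

    // Customer is a placeholder entity; substitute your own mapped class.
    public void readAll(SessionFactory sessionFactory) {
        // A StatelessSession has no first-level cache and no dirty checking,
        // so entities are not accumulated in memory as you iterate.
        StatelessSession session = sessionFactory.openStatelessSession();
        try {
            ScrollableResults results = session.createQuery("from Customer")
                    .scroll(ScrollMode.FORWARD_ONLY);
            while (results.next()) {
                Customer c = (Customer) results.get(0);
                // process one entity, then let it be garbage-collected
            }
            results.close();
        } finally {
            session.close();
        }
    }
}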

Related

Any disadvantage to reading using a cursor in Java PostgreSQL?

I wish to read all rows of a large table from PostgreSQL in Java. I am processing the rows one by one in the Java software.
By default the JDBC PostgreSQL driver reads all rows into memory, meaning my program runs out of memory.
The documentation talks of "Getting results based on a cursor" using st.setFetchSize(50); I have implemented that and it works well.
Is there any disadvantage to this approach? If not, I would enable it for all our queries, big and small, or is that a bad idea?
Well, if you have a fetch size of 50 and you get 1000 results, it will take 20 round-trips to the database. So no, it's not a good idea to enable it blindly without thinking about the actual queries being run.
A bigger question is why your ResultSets are so big that you run out of memory. Are you only loading data you're going to use and you just don't have a lot of memory, or are there perhaps poorly designed queries that return excessive results?
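For reference, enabling the driver's cursor-based fetching looks roughly like this; the PostgreSQL driver only uses the cursor when autocommit is off and the statement is forward-only (connection details and the table name are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorRead {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "password"); // placeholder connection details
        // The driver only uses a cursor when autocommit is off and the
        // statement is TYPE_FORWARD_ONLY (the default).
        conn.setAutoCommit(false);
        Statement st = conn.createStatement();
        st.setFetchSize(50); // rows fetched per round-trip
        ResultSet rs = st.executeQuery("select * from big_table");
        while (rs.next()) {
            // process one row at a time
        }
        rs.close();
        st.close();
        conn.close();
    }
}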

Retrieving data faster from SQL database with hibernate

My application contains a lot of data in the database.
Every day we process around 60K records.
My problem is that, since the data is growing every day, the user-generated searches from my application are taking quite a bit of time to load records onto the UI. Is there a way to make them faster? I am using Java with Spring and Hibernate.
I am trying to improve the user experience as we are getting lots of complaints from the users about the searches being slow.
Appreciate any help.
There is no simple answer to this. It boils down to looking at your application, its schemas and the queries that are generated, and figuring out where the bottlenecks are. Depending on that, the solution might be:
to add indexes to certain tables,
to redesign parts of the data model or the queries,
to reduce the size of the result sets you are reading (e.g. to use paging, as sketched below),
to make user queries simpler, or
to do something else.
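As a sketch of the paging option mentioned above (the entity name, its fields and the HQL are placeholders), limiting each search to a single page with Hibernate might look like:

import java.util.List;
import org.hibernate.Query;
import org.hibernate.Session;

public class PagedSearch {

    // Returns one page of search results instead of the full result set.
    // "Order", its fields and the HQL are placeholders for the real search query.
    @SuppressWarnings("unchecked")
    public List<Order> findPage(Session session, int pageNumber, int pageSize) {
        Query query = session.createQuery(
                "from Order o where o.status = :status order by o.createdOn desc");
        query.setParameter("status", "OPEN");
        query.setFirstResult(pageNumber * pageSize); // offset of the first row in this page
        query.setMaxResults(pageSize);               // rows per page
        return query.list();
    }
}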

Performance of mass insert to Oracle database via OJDBC

I have a Java program that is used to insert a lot (750,000) of records into an Oracle database. I am using the OJDBC6 library, with the OCI client. The table to be written to contains 330 columns, of which 8 appear in one or more indexes.
After having tried two approaches, I'm still struggling with some performance issues.
Creating a prepared statement once, filling the parameters for each record and thereafter executing the statement takes 1h29.
Creating a prepared statement once, filling the parameters for each record, adding them to a batch and executing the batch every 500/1000/5000 (I tried several options) processed records takes 0h27.
However, when the same data is mapped to the same tables using an ETL tool like Informatica PowerCenter, it only takes a couple of minutes. I understand that it might be wishful thinking to reach those timings, but I doubt that no further performance can be gained.
Does anyone have an idea about reasonable timings for this action, and how they can be achieved? Any help is appreciated, many thanks in advance!
(A related question: I will have to update a lot of records, too. What would be the most efficient approach: either keeping track of the columns that were changed and creating a record-dependent prepared statement containing only these columns; or always updating all columns, thereby reusing the same prepared statement?)
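For reference, the batched prepared-statement approach described in the question typically looks something like this (the table, its columns and the Record class are placeholders, not the real 330-column schema):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchInserter {

    private static final int BATCH_SIZE = 1000;

    // "my_table", its two columns and the Record class are illustrative only.
    public void insertAll(Connection conn, List<Record> records) throws Exception {
        conn.setAutoCommit(false);
        PreparedStatement ps = conn.prepareStatement(
                "insert into my_table (id, payload) values (?, ?)");
        try {
            int count = 0;
            for (Record r : records) {
                ps.setLong(1, r.getId());
                ps.setString(2, r.getPayload());
                ps.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ps.executeBatch(); // send the accumulated rows in one round-trip
                }
            }
            ps.executeBatch();         // flush the final partial batch
            conn.commit();
        } finally {
            ps.close();
        }
    }
}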
Another thing to try would be dropping the indexes, inserting the data, then reloading the indexes. Not as easy from Java, but simple enough.
You can use a ThreadGroup in Java:
http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/ThreadGroup.html
;)

Hibernate Feasibility for Single table database

I have to design a web application to retrieve data from a huge single table with 40 columns and several thousand rows for select queries, and a few rows/columns for updates.
Can you please suggest whether, for faster performance, the use of Hibernate is feasible, given that I only have a single table and do not have any joins?
Or should I use a JDBC DAO?
Database: SQL Server 2008
Java 7
If you use Hibernate right, there's no problem in fetching an arbitrarily large result set. Just avoid plain from queries (use select ... from ... projection queries instead) and use ScrollableResults. If you use plain JDBC, you'll be able to get started quicker because Hibernate needs to be configured first, you need to write the mapping file, etc., but later on it might pay off since the code you write will be much simpler. Hibernate is very good at taking the boilerplate out of client code.
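A minimal sketch of that suggestion, assuming a hypothetical MyEntity mapping for the table, selecting only the needed columns and scrolling through the results:

import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;

public class TableScanner {

    // MyEntity and its fields are placeholders for the mapping of the single table.
    public void scanAll(Session session) {
        ScrollableResults results = session
                .createQuery("select e.id, e.name from MyEntity e") // projection, not a plain "from" query
                .scroll(ScrollMode.FORWARD_ONLY);
        while (results.next()) {
            Long id = (Long) results.get(0);
            String name = (String) results.get(1);
            // process one projection row at a time; nothing piles up in the persistence context
        }
        results.close();
    }
}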
If you want to retrieve several thousand records and pagination is not possible, it might be a performance issue, because Hibernate will create an object for every row and store it in its persistence context. If you create too many objects, it uses up a lot of memory. For this type of operation JDBC is better. For a similar discussion see Hibernate performance issues using huge databases.

Sql returning 3 million records and JVM outofmemory Exception

I am connecting to an Oracle DB through a Java program. The problem is that I am getting an OutOfMemory exception because the SQL is returning 3 million records. I cannot increase the JVM heap size for some reason.
What is the best solution to solve this?
Is the only option to run the SQL with LIMIT?
If your program needs to return 3 mil records at once, you're doing something wrong. What do you need to do that requires processing 3 mil records at once?
You can either split the query into smaller ones using LIMIT, or rethink what you need to do to reduce the amount of data you need to process.
In my opinion it is pointless to have queries that return 3 million records. What would you do with them? There is no point in presenting them to the user, and if you want to do some calculations it is better to run more than one query, each returning considerably fewer records.
Using LIMIT is one solution, but a better solution would be to restructure your database and application so that you can have "smarter" queries that do not return everything in one go. For example you could return records based on a date column. This way you could have the most recent ones.
Application scaling is always an issue. One solution here would be to do whatever you are trying to do in Java as a stored procedure in Oracle PL/SQL. Let Oracle process the data and use its internal query planner to limit the amount of data flowing in and out and possibly causing major latencies.
You can even write the stored procedure in Java.
A second solution would indeed be to make a limited query, process it from several Java nodes, and collate the results. Look up map-reduce.
If each record is around 1 kilobyte, that means 3 GB of data. Do you have that amount of memory available for your application?
Should be better if you explain the "real" problem, since OutOfMemory is not your actual problem.
Try this:
http://w3schools.com/sql/sql_where.asp
There could be three possible solutions:
1. If retrieving 3 million records at once is not necessary, use LIMIT.
2. Consider using a meaningful WHERE clause.
3. Export the database entries into TXT, CSV, or Excel format with the tool that Oracle provides and work from that file instead.
Cheers :-)
Reconsider your WHERE clause and see if you can make it more restrictive, and/or use LIMIT.
Just for reference: in Oracle queries the equivalent of LIMIT is ROWNUM, e.g. ... WHERE ROWNUM <= 1000
If you get that large a response then take care to process the result set row by row so the full result does not need to be in memory. If you do that properly you can process enormous data sets without problems.
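A rough sketch of that row-by-row style with plain JDBC (the table and column names are placeholders); only a running aggregate is kept in memory:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StreamingReader {

    // Table and column names are placeholders. Only a running total is kept in memory.
    public long sumAmounts(Connection conn) throws Exception {
        PreparedStatement ps = conn.prepareStatement("select amount from big_table");
        ps.setFetchSize(1000); // rows per network round-trip; tune as needed
        ResultSet rs = ps.executeQuery();
        long total = 0;
        try {
            while (rs.next()) {
                total += rs.getLong(1); // process each row, discard it, keep only the aggregate
            }
        } finally {
            rs.close();
            ps.close();
        }
        return total;
    }
}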
