I'm using displaytag to build tables with data from my DB. This works well as long as the requested list isn't that big, but once the list grows past 2500 entries, fetching the result list takes very long (more than 5 minutes). I was wondering whether this behavior is normal.
How do you handle big lists / queries that return large results?
This article links to an example app showing how to go about solving the problem. Displaytag expects to be passed the full dataset in order to create paging links and handle sorting, which rather defeats the idea of paging externally and fetching only the rows that are asked for (as the user pages to them). The project linked in the article describes how to set this kind of thing up.
If you're working with a large database, you could also have a problem executing your query. I assume you have ruled this out; if not, take the SQL mentioned earlier and run it through the DB2 query analyzer to see whether there are any DB bottlenecks. The next step up the chain is to run the Hibernate/DAO call in a unit test without displaytag in the mix. Again, from how you've worded things, it sounds like you've already done this.
Displaytag hauls and stores everything in memory (the session), and Hibernate does that as well. You don't want to have the entire DB table contents in memory at once (although if the slowdown already begins at 2500 rows, it looks more like a matter of a badly optimized SQL query or DB table; 2500 rows should be peanuts for a decent DB, but OK, that's another story).
Instead, create the HTML table yourself with a little help from JSTL's c:forEach and a shot of EL. Keep one or two request parameters in the background in input type="hidden": the first row to be displayed (firstrow) and optionally the number of rows to be displayed at once (rowcount).
Then, in your DAO class, just do a SELECT stuff FROM data LIMIT rowcount OFFSET firstrow, or something like that depending on the DB used. In MySQL and PostgreSQL you can use the LIMIT and OFFSET clauses exactly like that. In Oracle you'll need to fire a subquery. In MSSQL and DB2 you'll need to create a stored procedure. You can do the same with HQL.
Then, to page through the table, just have a bunch of buttons that instruct the server-side code to increment or decrement firstrow by rowcount each time. Just do the math.
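For illustration, here is a minimal plain-JDBC sketch of such a DAO method for MySQL/PostgreSQL (the data table, its name column, the id ordering column, and the DataSource wiring are assumptions, not something from the question); with Hibernate you would get the same effect via Query#setFirstResult() and Query#setMaxResults().

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

public class DataDao {

    private final DataSource dataSource; // assumed to be configured elsewhere

    public DataDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns one page of rows, starting at firstrow, at most rowcount rows. */
    public List<String> list(int firstrow, int rowcount) throws SQLException {
        // MySQL/PostgreSQL syntax; the "data" table and "name" column are placeholders.
        String sql = "SELECT name FROM data ORDER BY id LIMIT ? OFFSET ?";
        List<String> page = new ArrayList<>();
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setInt(1, rowcount);  // LIMIT: number of rows per page
            ps.setInt(2, firstrow);  // OFFSET: index of the first row (0-based)
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    page.add(rs.getString("name"));
                }
            }
        }
        return page;
    }
}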
Edit: you commented that you're using DB2. I've done a bit of research and it appears that you can use the UDB OLAP function ROW_NUMBER() for this:
SELECT id, colA, colB, colC
FROM (
    SELECT
        ROW_NUMBER() OVER (ORDER BY id) AS row, id, colA, colB, colC
    FROM
        data
) AS temp_data
WHERE
    row BETWEEN 1 AND 10;
This example should return the first 10 rows of the data table. You can parameterize this query so that you can reuse it for every page, which is far more efficient than hauling the entire table into Java memory. Also ensure that the table is properly indexed.
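A hedged sketch of how that query could be parameterized from JDBC (the table and column names come from the example above; the page/pageSize arithmetic and the column types are assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PagedQuery {

    private static final String SQL =
        "SELECT id, colA, colB, colC FROM ("
      + "  SELECT ROW_NUMBER() OVER (ORDER BY id) AS row, id, colA, colB, colC FROM data"
      + ") AS temp_data WHERE row BETWEEN ? AND ?";

    /** Prints one page; page is 1-based, pageSize is the number of rows per page. */
    public static void printPage(Connection con, int page, int pageSize) throws SQLException {
        int first = (page - 1) * pageSize + 1; // row number of the first row on this page
        int last = page * pageSize;            // row number of the last row on this page
        try (PreparedStatement ps = con.prepareStatement(SQL)) {
            ps.setInt(1, first);
            ps.setInt(2, last);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + ", " + rs.getString("colA"));
                }
            }
        }
    }
}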
Related
I have a table A with about 400k rows a day, and I want to write a Java program to back up the old data to another table. There are two ways:
use JDBC to fetch data from table A, say 500 rows at a time, then concatenate SQL like INSERT INTO tableB VALUES (value1, value2, ...), (value1, value2, ...), ... and execute it.
use INSERT INTO tableB SELECT * FROM A WHERE ..., which moves about 2-3 million rows in one statement.
Somebody said the second way is slower, but I'm not sure. Which way is better? It must not crash the database.
The first one is the realistic option, but you don't have to do it by concatenating SQL. JDBC already has support for sending inserts in batches: in a loop, keep adding inserts one row at a time, and when you reach your desired batch size call executeBatch() to post them as one big bulk insert.
The value you'll have to play around with is how many rows to insert at a time, i.e. the batch size. The right answer depends on your hardware, so just see what works; the larger the batch size, the more rows are held in memory at any one time. Doing this on really large databases I have still crashed the program, but never the database; only when I found the right batch size for my system did everything work out. For me, 5000 was fine, and even running it many, many times for over a million rows, the database was fine. But again, this depends on the database as well. You are on the correct path.
https://www.tutorialspoint.com/jdbc/jdbc-batch-processing.htm
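A minimal sketch of that batching approach (the table names, columns, and the WHERE condition for "old data" are placeholders; the batch size of 5000 is just the value mentioned above; separate read and write connections are used so that committing the inserts cannot close the read cursor):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ArchiveJob {

    private static final int BATCH_SIZE = 5000; // tune this for your hardware

    /** Copies old rows from tableA to tableB in batches of BATCH_SIZE. */
    public static void archive(Connection readCon, Connection writeCon) throws SQLException {
        writeCon.setAutoCommit(false); // commit once per batch instead of once per row
        String select = "SELECT value1, value2 FROM tableA WHERE created < CURRENT_DATE";
        String insert = "INSERT INTO tableB (value1, value2) VALUES (?, ?)";
        try (Statement sel = readCon.createStatement();
             ResultSet rs = sel.executeQuery(select);
             PreparedStatement ins = writeCon.prepareStatement(insert)) {
            int count = 0;
            while (rs.next()) {
                ins.setString(1, rs.getString("value1"));
                ins.setString(2, rs.getString("value2"));
                ins.addBatch();
                if (++count % BATCH_SIZE == 0) {
                    ins.executeBatch(); // one round trip for the whole batch
                    writeCon.commit();
                }
            }
            ins.executeBatch();         // flush the last, partially filled batch
            writeCon.commit();
        }
    }
}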
I have 6 reports in my web application. All of them use huge queries with around 15 table joins, and the query results are looped over again to do some calculations before the report is presented to the user.
A report therefore takes a long time to load. I am using MySQL with Java.
What is the best way to fix this issue?
If caching is good, what are the available options for it?
I'm planning to create a table, insert all the required data into it, and have the reports read from that table; if that's feasible, what is the best way to load the data into it?
Can MongoDB or another NoSQL DB fix this issue?
Or is there a standard way to do these kinds of things?
Put an EXPLAIN in front of your query and figure out whether your joins are using any indexes. Hopefully you usually join your tables in a similar way, so if you add the right indexes it should speed things up quite a bit.
For example, if you had,
SELECT carrots FROM veggies
JOIN fruits ON fruits.color = veggies.color
WHERE veggies.weight = 0.5
You would want to add an index to the color column in the fruits table, and an index to the weight column of the veggies table.
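If it helps, here is a hedged JDBC sketch that runs EXPLAIN on that example query and prints the plan so you can check whether the new indexes are actually used; the suggested indexes are shown as DDL in comments, and their names are made up:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

public class ExplainCheck {

    // Suggested indexes for the example query (run once; the names are arbitrary):
    //   CREATE INDEX idx_fruits_color   ON fruits (color);
    //   CREATE INDEX idx_veggies_weight ON veggies (weight);

    public static void explain(Connection con) throws SQLException {
        String sql = "EXPLAIN SELECT carrots FROM veggies "
                   + "JOIN fruits ON fruits.color = veggies.color "
                   + "WHERE veggies.weight = 0.5";
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                // Look at the "key" and "rows" columns of the plan to see whether the
                // indexes are picked up and how many rows each join step scans.
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    System.out.print(meta.getColumnLabel(i) + "=" + rs.getString(i) + "  ");
                }
                System.out.println();
            }
        }
    }
}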
I have a problem regarding the ResultSet of a large database (MySQL, Java 1.7).
The task is to perform a transformation of all the entries of one column into another database.
(e.g. divide every number by three and write them into another database)
As the database contains about 70 columns and a few million rows, my first approach was to do a SELECT * and process the ResultSet column by column.
Unfortunately I found no way to process it that way, as the designated way is to go through it row by row (while (rs.next()) { ... } etc.).
I don't like that approach, as it would create 70 large arrays; I'd rather have only one at a time to reduce memory usage.
So here are my main questions:
Is there a way?
Should I create a query for every column and process them one at a time (one array at a time, but 70 queries), or
should I just get the whole ResultSet and process it row by row, writing into 70 arrays?
Greetings and thanks in advance!
Why not just page your queries? Pull out n rows at a time, perform the transformation, and write them into the new database.
That way you don't pull everything up in one query/iteration and write the whole lot in one go, and you don't have the inefficiency of working row by row.
My other comment is that this is perhaps premature optimisation. Have you tried loading the whole dataset and seeing how much memory it takes? If it's on the order of tens or even hundreds of megabytes, I would expect the JVM to handle that easily.
I'm assuming your transformation needs to be done in Java. If you can possibly do it in SQL, then doing it entirely within the database is likely to be even more efficient.
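A rough sketch of that paged approach with plain JDBC and MySQL's LIMIT/OFFSET syntax (the table and column names, the numeric column type, and the page size are assumptions; the divide-by-three transformation is the example from the question):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ColumnTransformer {

    private static final int PAGE_SIZE = 10_000; // rows per chunk, tune as needed

    public static void transform(Connection source, Connection target) throws SQLException {
        String select = "SELECT id, someColumn FROM sourceTable ORDER BY id LIMIT ? OFFSET ?";
        String insert = "INSERT INTO targetTable (id, someColumn) VALUES (?, ?)";
        try (PreparedStatement sel = source.prepareStatement(select);
             PreparedStatement ins = target.prepareStatement(insert)) {
            int offset = 0;
            while (true) {
                sel.setInt(1, PAGE_SIZE);
                sel.setInt(2, offset);
                int rowsInPage = 0;
                try (ResultSet rs = sel.executeQuery()) {
                    while (rs.next()) {
                        ins.setLong(1, rs.getLong("id"));
                        ins.setDouble(2, rs.getDouble("someColumn") / 3.0); // the transformation
                        ins.addBatch();
                        rowsInPage++;
                    }
                }
                if (rowsInPage == 0) {
                    break; // no more rows to transform
                }
                ins.executeBatch(); // write the whole chunk in one round trip
                offset += rowsInPage;
            }
        }
    }
}

Note that large offsets get slower because MySQL still has to skip over the earlier rows, so if the table has a monotonically increasing key, paging with WHERE id > :lastId is usually faster; the idea is the same.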
Why don't you do it with MySQL only?
Use this query:
CREATE TABLE <new_table_name> AS SELECT <column_on_which_you_want_the_transformation>/3 FROM <source_table_name>;
There's a DB table that contains approximately 300-400 records. I can make a simple query to fetch 30 records, like:
SELECT * FROM table
WHERE isValidated = false
LIMIT 30
Some more words about the content of the DB table. There's a column named isValidated, which can (as you correctly guessed) take one of two values: true or false. After a query, some of the records get marked as validated (isValidated=true), approximately 5-6 out of each bunch of 30. Consequently, each subsequent query fetches again the records (still isValidated=false) left over from the previous query; in fact, I'll never reach the end of the table with this approach.
The validation process is made with Java + Hibernate. I'm new to Hibernate, so I use Criterion for making this simple query.
Are there any best practices for such a task? The variant with adding a flag field (to mark records that have already been fetched) is inappropriate (over-engineering for this DB).
Maybe there's a way to create some virtual table where records that have already been processed are stored, or something like that. BTW, after all the records have been processed, processing is planned to start over again (it's possible that some of them will need to be validated again).
Thank you for your help in advance.
I can imagine several solutions:
store everything in memory. You only have 400 records, and that could be a perfectly fine solution given such a small number
use an ORDER BY clause (which you should do anyway) on a unique column (the PK, for example), store the ID of the last loaded record, and make sure the next query uses WHERE id > :lastId, as sketched below
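A hedged sketch of that second option using Hibernate's classic Criteria API, since the question mentions using Criterion (the Record entity and its property names are assumptions mapping the question's table):

import java.util.List;
import org.hibernate.Session;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

public class RecordBatchLoader {

    /** Hypothetical mapped entity: a numeric PK plus the isValidated column from the question. */
    public static class Record {
        private Long id;
        private boolean validated; // maps the isValidated column
        public Long getId() { return id; }
        public boolean isValidated() { return validated; }
    }

    /** Loads the next bunch of 30 not-yet-validated records with an id greater than lastId. */
    @SuppressWarnings("unchecked")
    public static List<Record> nextBatch(Session session, long lastId) {
        return session.createCriteria(Record.class)
                .add(Restrictions.eq("validated", false)) // only unvalidated rows
                .add(Restrictions.gt("id", lastId))       // skip rows already handed out
                .addOrder(Order.asc("id"))                // deterministic order is essential
                .setMaxResults(30)                        // one bunch of 30 records
                .list();
    }
}

The caller keeps the id of the last record of each bunch and passes it back for the next call; when a batch comes back empty, reset lastId to 0 to start processing from the beginning again.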
If we use a LIMIT clause in a query that also has an ORDER BY clause and execute the query through JDBC, will there be any effect on performance? (using a MySQL database)
Example:
SELECT modelName from Cars ORDER BY manuDate DESC Limit 1
I read in one of the threads on this forum that, by default, a set number of rows is fetched at a time. How can I find out the default fetch size?
I want only one record. Originally, I was using the following:
SQL Query:
SELECT modelName from Cars ORDER BY manuDate DESC
In the JAVA code, I was extracting as follows:
if (resultSet.next()) {
    // do something here.
}
The LIMIT 1 will definitely have a positive effect on performance. Instead of the entire set of matches (well, depending on the default fetch size) being returned from the DB server to the Java code, only one row will be returned. This saves a lot of network bandwidth and Java memory.
Always delegate as many constraints as possible (LIMIT, ORDER BY, WHERE, etc.) to the SQL side instead of doing them on the Java side. The DB will do it much better than your Java code ever can (if the table is properly indexed, of course). You should try to write the SQL query so that it returns exactly the information you need, and nothing more.
The only disadvantage of writing DB-specific SQL queries is that SQL is not entirely portable among different DB servers, which would require you to change the queries every time you switch DB servers. But in the real world it's very rare to switch to a completely different DB make anyway. Externalizing the SQL strings to XML or properties files should help a lot in any case.
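For instance, a small sketch of that externalization idea, loading named SQL strings from a properties file on the classpath (the file name queries.properties and the key cars.latest are made up for illustration):

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class Queries {

    private static final Properties QUERIES = load();

    private static Properties load() {
        Properties props = new Properties();
        // queries.properties would contain lines such as:
        //   cars.latest=SELECT modelName FROM Cars ORDER BY manuDate DESC LIMIT 1
        try (InputStream in = Queries.class.getResourceAsStream("/queries.properties")) {
            if (in == null) {
                throw new IllegalStateException("queries.properties not found on the classpath");
            }
            props.load(in);
        } catch (IOException e) {
            throw new IllegalStateException("Could not load queries.properties", e);
        }
        return props;
    }

    /** Returns the SQL string registered under the given key, e.g. Queries.get("cars.latest"). */
    public static String get(String key) {
        return QUERIES.getProperty(key);
    }
}

Switching DB servers then means swapping the properties file rather than touching the Java code.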
There are two ways the LIMIT could speed things up:
by producing less data, which means less data gets sent over the wire and processed by the JDBC client
by potentially having MySQL itself look at fewer rows
The second one of those depends on how MySQL can produce the ordering. If you don't have an index on manuDate, MySQL will have to fetch all the rows from Cars, then order them, then give you the first one. But if there's an index on manuDate, MySQL can just look at the first entry in that index, fetch the appropriate row, and that's it. (If the index also contains modelName, MySQL doesn't even need to fetch the row after it looks at the index -- it's a covering index.)
With all that said, watch out! If manuDate isn't unique, the ordering is only partially deterministic (the order for all rows with the same manuDate is undefined), and your LIMIT 1 therefore doesn't have a single correct answer. For instance, if you switch storage engines, you might start getting different results.
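To make that tie-breaker point concrete, here is a small, hedged JDBC sketch that adds a unique column to the ORDER BY so the LIMIT 1 result is deterministic (it assumes Cars has a unique id column, which the question does not state):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LatestModel {

    public static String latestModelName(Connection con) throws SQLException {
        // The extra "id DESC" breaks ties between cars with the same manuDate,
        // so the single returned row is always the same one.
        String sql = "SELECT modelName FROM Cars ORDER BY manuDate DESC, id DESC LIMIT 1";
        try (PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getString("modelName") : null;
        }
    }
}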