I am generating reports in my system, but some reports return a huge number of results. To remedy this I hit the database with a count first; then, in my code, if the count is above a certain threshold (e.g. 2000), I don't generate the report.
This is fine in most cases, but some reports have over a million results, which means the count takes a good few seconds to return.
Ideally, what I would like to do is put my threshold (2000) into my SQL statement, stop the count once it reaches that value, and return some value (e.g. true or false, 0 or 1, anything) so that I know the limit has been exceeded. Is this possible in SQL? So far I cannot find a solution.
Pseudocode: select count(1) from table while count <= threshold
I am working with Java, Hibernate and SQL Server 2005.
Any help will be much appreciated.
Regards,
Eamon
I think something like this should work:
select count(1) from (
    select top 2000 *
    from table
    where ...
) as capped -- SQL Server needs an alias on the derived table
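If it helps, here is a rough sketch of running that capped count over plain JDBC (the report_rows table, report_id column and the exceedsThreshold method are made up for illustration, not from the question):
import java.sql.*;

// Sketch only: caps the scan at threshold + 1 rows so that "strictly more than
// threshold" can still be detected; table and column names are placeholders.
static boolean exceedsThreshold(Connection connection, long reportId, int threshold)
        throws SQLException {
    String sql =
        "select count(1) from ("
      + "  select top " + (threshold + 1) + " *"
      + "  from report_rows where report_id = ?"
      + ") as capped";
    try (PreparedStatement ps = connection.prepareStatement(sql)) {
        ps.setLong(1, reportId);
        try (ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getInt(1) > threshold;
        }
    }
}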
I don't like the idea of running the query twice, once to check the row count and then again to actually get the data. So if you want to limit the report to 2000 rows, just do something like this:
SELECT TOP 2001
...
from...
Then, in the application, check the row count: if it is greater than your limit (> 2000), return an error or skip displaying the report.
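A rough Hibernate-side sketch of the same idea (the ReportRow entity and its reportId property are made up for illustration; Hibernate's SQL Server dialect normally renders setMaxResults as TOP):
// Sketch only: fetch at most threshold + 1 rows and decide in the application.
int threshold = 2000;
List<?> rows = session
        .createQuery("from ReportRow r where r.reportId = :id")
        .setParameter("id", reportId)
        .setMaxResults(threshold + 1)
        .list();

if (rows.size() > threshold) {
    // more rows exist than the report may contain: skip generating it
} else {
    // generate the report from rows
}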
Try something like a LIMIT clause? This page suggests how that could be done in SQL Server: http://blogs.msdn.com/b/sqlserver/archive/2006/10/25/limit-in-sql-server.aspx
I'm not really aware of a Hibernate equivalent for the TOP instruction, but even if one exists, you would first have to retrieve the 2000 rows and then count them, which would probably be as slow as it is now. So you can fetch 2001 rows right away and then validate the count in the application.
But I think what you should really optimize is the query itself.
If the query does not contain an outer join, just specify an indexed column in the count:
SELECT count(*) from test1 inner join test2 on test1.id = test2.id;
888 ms
SELECT count(test1.id) from test1 inner join test2 on test1.id = test2.id;
88 ms
(The relation: 35 rows in test1 match 1 row in test2; 1,252,080 rows in total.)
In Oracle, what is the default ordering of rows for a select query if no "order by" clause is specified?
Is it:
1. the order in which the rows were inserted
2. there is no default ordering at all
3. none of the above
According to Tom Kyte: "Unless and until you add "order by" to a query, you cannot say ANYTHING about the order of the rows returned. Well, short of 'you cannot rely on the order of the rows being returned'."
See this question at asktom.com.
As for ROWNUM, it doesn't physically exist, so it can't be "freed". ROWNUM is assigned after a record is retrieved from a table, which is why "WHERE ROWNUM = 5" will always fail to select any records.
#ammoQ: you might want to read this AskTom article on GROUP BY ordering. In short:
"Does a GROUP BY clause in a query guarantee that the output data will be sorted on the GROUP BY columns in order, even if there is NO ORDER BY clause?"
and we said...
"ABSOLUTELY NOT. It never has, it never did, it never will."
There is no explicit default ordering. For obvious reasons, if you create a new table, insert a few rows and do a "select *" without a "where" clause, it will (very likely) return the rows in the order they were inserted.
But you should never ever rely on a default order happening. If you need a specific order, use an "order by" clause. For example, in Oracle versions up to 9i, doing a "group by" also caused the rows to be sorted by the group expression(*). In 10g, this behaviour no longer exists! Upgrading Oracle installations has caused me some work because of this.
(*) disclaimer: while this is the behaviour I observed, it was never guaranteed
It has already been said that Oracle is allowed to give you the rows in any order it wants, when you don't specify an ORDER BY clause. Speculating what the order will be when you don't specify the ORDER BY clause is pointless. And relying on it in your code, is a "career limiting move".
A simple example:
SQL> create table t as select level id from dual connect by level <= 10
2 /
Table created.
SQL> select id from t
2 /
ID
----------
1
2
3
4
5
6
7
8
9
10
10 rows selected.
SQL> delete t where id = 6
2 /
1 row deleted.
SQL> insert into t values (6)
2 /
1 row created.
SQL> select id from t
2 /
ID
----------
1
2
3
4
5
7
8
9
10
6
10 rows selected.
And this is only after a simple delete + insert. Numerous other situations are conceivable: parallel execution, partitions, index-organised tables, to name just a few.
Bottom line, as already very well said by ammoQ: if you need the rows sorted, use an ORDER BY clause.
You absolutely, positively cannot rely on any ordering unless you specify order by. For Oracle in particular, I've actually seen the exact same query (without joins), run twice within a few seconds of each other, on a table that didn't change in the interim, return a wildly different order. This seems to be more likely when the result set is large.
The parallel execution mentioned by Rob van Wijk probably explains this. See also Oracle's Using Parallel Execution doc.
It is affected by indexes: if there is an index, it will return rows in ascending (index) order; if there is no index, it will return them in the order they were inserted.
You can modify the order in which data is stored in the table by INSERT with the ORGANIZATION clause of the CREATE TABLE statement.
Although it should be rownum (your #2), it really isn't guaranteed and you shouldn't trust it 100%.
I believe it uses Oracle's hidden rownum attribute.
So your #1 is probably right, assuming there were no deletes that might have freed rownums for later use.
EDIT: As others have said, you really shouldn't rely on this, ever. Besides deletes, there are a lot of other conditions that can affect the default ordering behaviour.
I have the following prepared statement in Java:
with main_select as
(select request_id,rownum iden
from
(select request_id
from queue_requests
where request_status = 0 and
date_requested <= sysdate and
mod(request_id,?) = ?
order by request_priority desc, oper_id, date_requested)
where rownum < ?)
select *
from queue_requests qr, main_select ms
where qr.request_id in ms.request_id
order by ms.iden for update skip locked;
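For context, a sketch of how a statement like this would be bound and run from Java (the variable names below are illustrative placeholders):
// Sketch only: "sql" holds the statement text above; the int values are placeholders.
PreparedStatement ps = connection.prepareStatement(sql);
try {
    ps.setInt(1, workerCount);   // mod(request_id, ?)
    ps.setInt(2, workerIndex);   // ... = ?
    ps.setInt(3, maxRows);       // rownum < ?
    ResultSet rs = ps.executeQuery();
    try {
        while (rs.next()) {
            // process each locked queue_requests row
        }
    } finally {
        rs.close();
    }
} finally {
    ps.close();
}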
It doesn't execute:
ORA-02014: cannot select FOR UPDATE from view with DISTINCT, GROUP BY, etc.
I'll try to explain why I need all the select statements:
the first (inner) select obtains the data I need
the second one limits the number of rows to a value (I can't put the limit in the first select, because Oracle first limits the results and only then orders them, which is not what I want)
the third (outside the with) select preserves the order (I tried using 3 nested selects, so no with clause, but I couldn't find a way to preserve the order in that case). It should also lock the rows in the queue_requests table, but because I selected data from the with clause, it gives the above error.
So, I want to select data from queue_requests, keep the first x rows, preserve the order of the select and lock the rows.
Is there a way to do it?
The problem seems to be that you want to set a lock on the result of main_select. I would just guess that you can do the select for update in the select in the with clause, like:
with main_select as
(select request_id,rownum iden
from (subselect)
where rownum < ?
for update skip locked)
But, as I said, it's a lucky guess.
I have the native SQL query below. I am using an Oracle database.
select *
from (select row_.*,
             rownum rownumber
      from (select colmn1,
                   colmn2,
                   colmn3,
                   colmn4,
                   colmn5,
                   colmn6
            from Table5
            where colmn5 in ('19901', '10001')
            order by colmn1) row_)
where rownumber <= 50000
  and rownumber > 0
The above query returns 50000 records. If I execute it in SQL Developer it takes only 30 seconds, but in the Spring/Hibernate integrated application it takes 15 minutes. How can I improve the performance?
Thanks!
You have two inner selects. An inner select is always a possible source of bad performance, because it might hinder the database from finding the optimal search strategy.
As far as I can see, you use the inner selects only for handling the row number. If you use only the innermost select and handle the row number on the Java/Hibernate level, you'll get much better performance.
You only need this select:
select colmn1, colmn2, colmn3, colmn4, colmn5, colmn6
from Table5
where colmn5 in ('19901', '10001')
order by colmn1
which, as it no longer contains any database specialities, can easily be replaced by an HQL statement, making your program independent of the database used (the Java class and property names should be replaced by the real ones):
from Table5_Class
where colmn5_Prop in ('19901','10001')
order by colmn1_prop
Then you replace your where condition Where Rownumber <= 50000 and rownumber > 0 with the Hibernate methods Query.setMaxResults(50000) and Query.setFirstResult(0). (Remark: setFirstResult(0) is superfluous, as row 0 always is the first one, but I guess you also want to get the next 50000 rows, and then you can use setFirstResult(n).)
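Put together, roughly like this (a sketch only; Table5Entity and the property names stand in for your real mapped class and fields):
// Sketch only: entity and property names are placeholders for your mapping;
// assumes an open Session and a java.util.Arrays import.
Query query = session.createQuery(
        "from Table5Entity t "
      + "where t.colmn5Prop in (:codes) "
      + "order by t.colmn1Prop");
query.setParameterList("codes", Arrays.asList("19901", "10001"));
query.setFirstResult(0);        // or setFirstResult(n) for a later page
query.setMaxResults(50000);
List<?> rows = query.list();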
If you need the row number as a parameter, you can use the index of the resulting List for this.
P.S.: I can't tell you why your select is so much faster in SQL Developer than in Hibernate.
How do I implement paging in Hibernate? The Query object has methods called setMaxResults and setFirstResult, which are certainly helpful. But where can I get the total number of results, so that I can show a link to the last page of results, and print things such as "results 200 to 250 of xxx"?
You can use Query.setMaxResults(int results) and Query.setFirstResult(int offset).
Edit: there's no way to know in advance how many results you'll get. So first you must query with "select count(*)...". A little ugly, IMHO.
You must do a separate query to get the max results... and in the case where, between time A when the client issues the first paging request and time B when another request is issued, new records are added or some records now fit the criteria, you have to query the count again so that the total reflects this. I usually do this in HQL like this:
Integer count = (Integer) session.createQuery("select count(*) from ....").uniqueResult();
For Criteria queries I usually push my data into a DTO like this:
ScrollableResults scrollable = criteria.scroll(ScrollMode.SCROLL_INSENSITIVE);
if(scrollable.last()){//returns true if there is a resultset
genericDTO.setTotalCount(scrollable.getRowNumber() + 1);
criteria.setFirstResult(command.getStart())
.setMaxResults(command.getLimit());
genericDTO.setLineItems(Collections.unmodifiableList(criteria.list()));
}
scrollable.close();
return genericDTO;
You could perform two queries: a count(*) type query, which should be cheap if you are not joining too many tables together, and a second query that has the limits set. Then you know how many items exist but only grab the ones being viewed.
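A rough sketch of that two-query approach (the Item entity and its category property are made up for illustration):
// Sketch only: count() returns Long on Hibernate 3.2+, Integer on older versions.
Long total = (Long) session
        .createQuery("select count(*) from Item i where i.category = :cat")
        .setParameter("cat", category)
        .uniqueResult();

List<?> page = session
        .createQuery("from Item i where i.category = :cat order by i.id")
        .setParameter("cat", category)
        .setFirstResult(200)    // offset of the requested page
        .setMaxResults(50)      // page size
        .list();
// e.g. "results 201 to 250 of " + total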
You can do one thing: just prepare the Criteria query as per your business requirement, with all predicates, sorting, searching etc.,
and then do as below:
CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<Feedback> criteriaQuery = criteriaBuilder.createQuery(Feedback.class);
Root<Feedback> root = criteriaQuery.from(Feedback.class);

// Just prepare all your Predicates as per your business need, e.g.:
Predicate yourPredicateAsPerYourBusinessNeed =
        criteriaBuilder.equal(root.get("applicationName"), applicationName);

criteriaQuery.select(root).where(yourPredicateAsPerYourBusinessNeed).distinct(true);
TypedQuery<Feedback> criteriaQueryWithPredicate = em.createQuery(criteriaQuery);

// Getting the total count here
Long totalCount = criteriaQueryWithPredicate.getResultStream().distinct().count();
Now we have our actual data along with the total count, right?
So now we can apply pagination to the data we have in hand, as below:
List<Feedback> feedbackList = criteriaQueryWithPredicate.setFirstResult(offset).setMaxResults(pageSize).getResultList();
Now you can prepare a wrapper with the List returned by the DB, along with the totalCount, the startingPageNo (the offset in this case), the page size etc., and return it to your service / controller class.
I am 101% sure this will solve your problem, because I was facing the same problem and sorted it out the same way.
Thanks- Sunil Kumar Mali
You can just set setMaxResults to the maximum number of rows you want returned. There is no harm in setting this value greater than the number of rows actually available. The problem with the other solutions is that they assume the ordering of records remains the same on each repeat of the query, and that no changes happen between commands.
To avoid that if you really want to scroll through results, it is best to use the ScrollableResults. Don't throw this object away between paging, but use it to keep the records in the same order. To find out the number of records from the ScrollableResults, you can simply move to the last() position, and then get the row number. Remember to add 1 to this value, since row numbers start counting at 0.
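A rough sketch of that approach (the Item entity, page and pageSize values are placeholders):
// Sketch only: keep one ScrollableResults across paging requests instead of re-querying.
ScrollableResults results = session
        .createQuery("from Item i order by i.id")
        .scroll(ScrollMode.SCROLL_INSENSITIVE);

int total = results.last() ? results.getRowNumber() + 1 : 0;  // row numbers start at 0

int pageSize = 50;
int page = 4;                                  // rows 200 .. 249
if (results.setRowNumber(page * pageSize)) {
    int shown = 0;
    do {
        Object row = results.get(0);           // the entity for the current row
        // render row ...
        shown++;
    } while (shown < pageSize && results.next());
}
// keep "results" open between page requests; close it once the user is done paging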
I personally think you should handle the paging in the front end. I know this isn't that efficient, but at least it would be less error prone.
If you used the count(*) approach, what would happen if records got deleted from the table between requests for a certain page? Lots of things could go wrong this way.
I have basically the same problem outlined in this question, however I am using Microsoft Access as a database instead of MySQL. The result of which is that SQL_CALC_FOUND_ROWS doesn't seem to be available to me. Believe me, I want to switch, but for the moment it is out of the question.
I have a query that aggregates a number of rows, essentially looking for repeat rows based on certain keys, using a group by. It looks something like this:
Select key1, key2, key3, Count(id)
from table
group by key1, key2, key3
having Count(id) > 1
I need to determine the number of rows (or groupings) that query will return.
The database is being accessed through Java, so in theory I could simply run the query, and cycle through it twice, but I was hoping for something faster and preferably SQL based. Any ideas?
MS Access's record count should give you what you need, or am I missing something?
If you need distinct values from keys, try this
SELECT COUNT(*) AS Expr2
FROM (
SELECT DISTINCT [key1] & "-" & [key2] & "-" & [key3] AS Expr1
FROM Table1
) AS SUB;
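Along the same lines, a sketch that wraps the original GROUP BY in an outer count and runs it from Java (the generic table and column names mirror the question; derived-table support in your Access driver is assumed):
// Sketch only: counts how many groupings the aggregate query will return.
String sql =
    "SELECT COUNT(*) FROM ("
  + "  SELECT key1, key2, key3 FROM table"
  + "  GROUP BY key1, key2, key3"
  + "  HAVING COUNT(id) > 1"
  + ") AS grouped";
Statement st = connection.createStatement();
ResultSet rs = st.executeQuery(sql);
rs.next();
int groupCount = rs.getInt(1);   // number of repeat-key groupings
rs.close();
st.close();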
When you create the Statement object, you can declare it to be scrollable. Then the first thing you do is scroll to the end and get the record number. As you're looking at the last record, this will be the number of records in the result set. Then scroll back to the beginning and process normally. Something like:
Statement st=connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ResultSet rs=st.executeQuery(myQueryString);
boolean any = rs.last();              // jump to the last row, if there is one
int count = any ? rs.getRow() : 0;    // its row number is the total record count
... do whatever with the record count ...
rs.beforeFirst();                     // rewind so the loop below sees every row
while (rs.next())
{
    ... whatever processing you want to do ...
}
rs.close();
... etc ...
I have no idea what the performance implications of doing this with MS Access will be, whether it can jump directly to the end of the result set or if it will have to sequentially read all the records. Still, it should be faster than executing the query twice.