I'm trying to build something like the following query using the jooq api.
select x.*
from x
offset greatest(0, (select count(*) - 1 from x));
by writing:
select(x.fields()).from(x)
.offset(param(greatest(val(0), select(count().sub(1)).from(x).field(0, Integer.class))))
I'm pretty sure I'm using the offset(Param<Integer>) method incorrectly. It seems to be rendering null for the offset. Is building up offsets like this something that jooq can do? (It seems like the offset method is a bit restricted in what it can do, compared to the rest of the jooq api.)
(I know this query without context seems inefficient, but it's actually what I want to be doing.)
Thanks!
I don't think any database allows you to put a non-constant expression in their OFFSET and LIMIT clauses (it is possible in PostgreSQL, see dsmith's comments). In any case, jOOQ doesn't allow you to do it. You must provide either a constant int value, or a bind variable (a Param).
But you don't really need that feature in your case anyway. Your hypothetical syntax ...
select x.*
from x
offset greatest(0, (select count(*) - 1 from x));
Is equivalent to this:
select x.*
from x
order by <implicit ordering> desc
limit 1;
After all, your query seems to be looking for the last row (by some implicit ordering), so why not just make that explicit?
Given below is a gist of the query, which I'm able to run successfully in MySQL
SELECT a.*,
COALESCE(SUM(condition1 or condition2), 0) as countColumn
FROM table a
-- left joins with multiple tables
GROUP BY a.id;
Now, I'm trying to use it with JOOQ.
ctx.select(a.asterisk(),
coalesce(sum("How to get this ?")).as("columnCount"))
.from(a)
.leftJoin(b).on(someCondition)
.leftJoin(c).on(someCondition)
.leftJoin(d).on(someCondition)
.leftJoin(e).on(someCondition)
.groupBy(a.ID);
I'm having a hard time preparing the coalesce() part, and would really appreciate some help.
jOOQ's API is stricter about the distinction between Condition and Field<Boolean>, which means you cannot simply treat booleans as numbers as you can in MySQL. It's usually not a bad idea to be explicit about data types to prevent edge cases, so this strictness isn't necessarily a bad thing.
So, you can transform your booleans to integers as follows:
coalesce(
sum(
when(condition1.or(condition2), inline(1))
.else_(inline(0))
),
inline(0)
)
But even better than that, why not use a standard SQL FILTER clause, which can be emulated in MySQL using a COUNT(CASE ...) aggregate function:
count().filterWhere(condition1.or(condition2))
Ok, just to cut it short, I've done the actual JPQL without using any parameter first and it looks like this.
SELECT count(dt)
FROM transaction dt
WHERE dt.transactionType = 'TEST'
AND dt.date
BETWEEN FUNC('TO_DATE','01-2019','mm-yyyy')
AND FUNC('TO_DATE','02-2019','mm-yyyy')
This works! But now I need to pass the transactionType and the dates as parameters, and this is what it looks like:
SELECT count(dt)
FROM transaction dt
WHERE dt.transactionType = :transType
AND dt.date
BETWEEN FUNC('TO_DATE',:lastMonth,'mm-yyyy')
AND FUNC('TO_DATE',:nextMonth,'mm-yyyy')
So :transType is fine, but it seems I shouldn't put a parameter inside FUNC() just like that and need some workaround. I've been googling and can't find anything.
The error was like this
You have attempted to set a parameter value using a name of
lastMonth,'mm-yyyy') that does not exist in the query string
As you can see, the parameter name inside FUNC() swallows the argument that follows it, which was meant for FUNC(). What did I miss? Please enlighten me.
Make sure you're using setString for the parameter type.
I always had difficulty with named parameters within JPA, depending upon how the query was created - try using ordinal parameters, eg: ?1 and set them by index.
I'd avoid FUNC as it can carry some major overhead if you're not extremely careful.
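One way to avoid FUNC altogether, as a rough sketch: do the string-to-date conversion in Java and bind real dates, so the JPQL becomes dt.date BETWEEN :lastMonth AND :nextMonth. The class and method names below are made up for illustration, and the sketch assumes the input strings always look like '01-2019'.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Hypothetical helper (not part of JPA): converts a "MM-yyyy" string into the
// first day of that month, which is also what Oracle's TO_DATE('01-2019','mm-yyyy') yields.
final class MonthBounds {
    // "uuuu" is the plain proleptic year, so no era is needed when parsing.
    private static final DateTimeFormatter MONTH_YEAR = DateTimeFormatter.ofPattern("MM-uuuu");

    static LocalDate firstDayOf(String monthYear) {
        return YearMonth.parse(monthYear, MONTH_YEAR).atDay(1);
    }
}
```

The result could then be bound with something like query.setParameter("lastMonth", java.sql.Date.valueOf(MonthBounds.firstDayOf("01-2019"))), assuming dt.date is mapped as a date type that your JPA provider accepts java.sql.Date for.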
There's a workaround for this problem.
Initially, the simplified SQL was as follows:
SELECT count(*)
FROM table tb
WHERE tb.date between to_date('01-2020','mm-yyyy') and to_date('02-2020','mm-yyyy');
Converting the simplified SQL directly to JPQL gives:
SELECT count(tb)
FROM table tb
WHERE tb.date BETWEEN FUNC('TO_DATE','01-2020','mm-yyyy') AND FUNC('TO_DATE','02-2020','mm-yyyy')
But the JPQL needs to be dynamic, since the dates are not static. To make it usable for any date, my first instinct was to use JPQL parameters like this:
SELECT count(tb)
FROM table tb
WHERE tb.date BETWEEN FUNC('TO_DATE',:fromDate,'mm-yyyy') AND FUNC('TO_DATE',:toDate,'mm-yyyy')
But as described in my original question, this JPQL does not work. So how did I find a workaround? It was quite simple, actually.
Instead of using this to get the date range (in plain SQL I use to_date):
WHERE tb.date BETWEEN FUNC('TO_DATE','','') AND FUNC('TO_DATE','','')
I used this
WHERE FUNC('TO_CHAR','','') between (--fromDate) and (--toDate)
which resulted in the final working JPQL:
SELECT count(tb)
FROM table tb
WHERE FUNC('TO_CHAR',tb.date,'mm-yyyy') BETWEEN (:fromDate) AND (:toDate)
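For what it's worth, here is a small sketch of how the :fromDate and :toDate values could be produced in Java so that they match the 'mm-yyyy' text that TO_CHAR returns. The class and method names are hypothetical. The second method mimics a character-by-character BETWEEN, assuming that is how the database compares these strings; under that assumption the range only behaves as expected while both bounds fall in the same year.

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for preparing the string parameters of the workaround query.
final class MonthParam {
    private static final DateTimeFormatter MONTH_YEAR = DateTimeFormatter.ofPattern("MM-uuuu");

    // Formats a month as "MM-yyyy" text, e.g. January 2020 becomes "01-2020".
    static String format(YearMonth month) {
        return month.format(MONTH_YEAR);
    }

    // Mimics BETWEEN on plain strings: text ordering, not date ordering.
    static boolean textBetween(String value, String from, String to) {
        return value.compareTo(from) >= 0 && value.compareTo(to) <= 0;
    }
}
```

With these helpers, textBetween("02-2019", "01-2020", "02-2020") is true even though February 2019 lies outside the intended range, so it is worth checking the results when the data spans several years.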
I need to get an equivalent to this SQL that can be run using Hibernate. It doesn't work as is due to special characters like #.
SELECT place from (select #curRow := #curRow + 1 AS place, time, id FROM `testing`.`competitor` JOIN (SELECT #curRow := 0) r order by time) competitorList where competitorList.id=4;
My application manages the results of running competitions. The above query selects, for a specific competitor, its place based on his/her overall time.
For simplicity I'll only list the COMPETITOR table structure (only the relevant fields). My actual query involves a few joins, but they are not relevant for the question:
CREATE TABLE competitor (
id INT,
name VARCHAR,
time INT
)
Note that competitors are not already ordered by time, thus, the ID cannot be used as rank. As well, it is possible to have two competitors with the same overall time.
Any idea how I could make this work with Hibernate?
Hard to tell without a schema, but you may be able to use something like
SELECT COUNT(*) FROM testing ts
WHERE ts.score < $obj.score
where I am using the $ to stand for whatever Hibernate notation you need to refer to the live object.
I couldn't find any way to do this, so I had to change the way I'm calculating the position. I'm now taking the top results and am creating the ladder in Java, rather than in the SQL query.
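For anyone who wants to go the same route, building the ladder in Java could look roughly like the sketch below. It is not my actual code: the class name and the id-to-time map are made up for illustration, and it uses the same idea as the COUNT(*) suggestion above (place = 1 + number of competitors with a strictly smaller time).

```java
import java.util.Map;

// Hypothetical helper: computes a competitor's place from already-loaded results.
final class Ladder {
    // timesById maps competitor id -> overall time (the "time" column).
    // Competitors with equal times share the same place.
    static int placeOf(int competitorId, Map<Integer, Integer> timesById) {
        Integer ownTime = timesById.get(competitorId);
        if (ownTime == null) {
            throw new IllegalArgumentException("unknown competitor " + competitorId);
        }
        long faster = timesById.values().stream()
                .filter(time -> time < ownTime)
                .count();
        return (int) faster + 1;
    }
}
```

Note that this treats ties differently from the original row-counter query, which gives tied competitors distinct, arbitrarily ordered places.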
Let's say I have a basic query like:
SELECT a, b, c FROM x WHERE y=[Z]
In this query, [Z] is a "variable" with different values injected into the query.
Now consider a situation where we want to do the same query with 2 known different values of [Z], say Z1 and Z2. We can make two separate queries:
SELECT a, b, c FROM x WHERE y=Z1
SELECT a, b, c FROM x WHERE y=Z2
Or perhaps we can programmatically craft a different query like:
SELECT a, b, c FROM x WHERE y in (Z1, Z2)
Now we only have one query (1 < 2), but the query construction and result set deconstruction becomes slightly more complicated, since we're no longer doing straightforward simple queries.
Questions:
What is this kind of optimization called? (Is it worth doing?)
How can it be implemented cleanly from a Java application?
Do existing Java ORM technologies help?
What is this kind of optimization called?
I'm not sure if there is a "proper" term for it, but I've heard it called query batching or just plain batching.
(Is it worth doing?)
It depends on:
whether it is worth the effort optimizing the query at all,
the number of elements in the set; i.e. ... IN ( ... ),
the overheads of making a JDBC request versus the costs of query compilation, etc.
But in the right circumstances this is definitely a worthwhile optimization.
How can it be implemented cleanly from a Java application?
It depends on your definition of "clean" :-)
Do existing Java ORM technologies help?
It depends on the specific ORM technology you are talking about, but (for example) the Hibernate HQL language supports the constructs that would allow you to do this kind of thing.
An RDBMS can normally return the result of a query with IN in equal or less time than it takes to execute two queries.
If there is no index on column Y, then a full table scan is required. With two queries, two table scans will be performed instead of one.
If there is an index, then the single value in the WHERE clause, or the values in the IN list, are used one at a time to look up the index. When some rows are found for one of the values in the IN list, they are added to the returned result.
So it is better to use the IN predicate from the performance point of view.
When Y represents a column with unique values, then it is easy to decompose the result. Otherwise, there is slightly more work.
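As a rough illustration of that decomposition step, the sketch below groups the rows of the combined result by their Y value. It assumes Y has been added to the select list (the original query only selects a, b, c) and that rows arrive as Object[] arrays; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits one combined IN (...) result back into per-value groups.
final class ResultSplitter {
    // yIndex is the position of the Y column within each row.
    static Map<Object, List<Object[]>> groupByColumn(List<Object[]> rows, int yIndex) {
        Map<Object, List<Object[]>> grouped = new LinkedHashMap<>();
        for (Object[] row : rows) {
            // Create the bucket on first sight of a Y value, then append the row.
            grouped.computeIfAbsent(row[yIndex], key -> new ArrayList<>()).add(row);
        }
        return grouped;
    }
}
```

If Y is unique, every group holds exactly one row, which is the easy case mentioned above.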
I honestly can't say how much of a hit (if any) you will get if you run these two prepared queries (even using plain JDBC) rather than combining them with an IN clause.
If you have an array or List of values, you could manually build the prepared statement using JDBC:
// Assuming values is an int[] and conn is a java.sql.Connection
// Also uses Apache Commons StringUtils
StringBuilder query = new StringBuilder("SELECT a, b, c FROM x WHERE y IN (");
query.append(StringUtils.join(Collections.nCopies(values.length, "?"), ','));
query.append(")");
PreparedStatement stmt = conn.prepareStatement(query.toString());
for (int i = 0; i < values.length; i++) {
stmt.setInt(i + 1, values[i]);
}
stmt.execute();
// Get results after this
Note: I haven't actually tested this. In theory, if you used this a lot, you'd generalize this and make it a method.
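A generalized version of the placeholder-building part might look like the following sketch. It uses only the JDK (String.join instead of Commons StringUtils); the class and method names are made up, and the table and column names are simply the ones from the question.

```java
import java.util.Collections;

// Hypothetical helper: builds the SQL text with one "?" per value in the IN list.
final class InClause {
    static String selectWhereYIn(int valueCount) {
        if (valueCount < 1) {
            // An empty IN () list is not valid SQL, so reject it up front.
            throw new IllegalArgumentException("need at least one value");
        }
        return "SELECT a, b, c FROM x WHERE y IN ("
                + String.join(", ", Collections.nCopies(valueCount, "?"))
                + ")";
    }
}
```

The returned string can be passed to conn.prepareStatement(...) and the values bound in a loop exactly as in the snippet above.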
Note that an "in" (where blah in ( 1, 5, 10 ) ) is the same as writing "where blah = 1 OR blah = 5 OR blah = 10". This is important if you are using, say, Apache Torque which would create lovely prepared statements except in the case of an "in" clause. (That might be fixed by now.)
And the difference in performance that we found between the unprepared in clause and the prepared ORs was huge.
So a number of ORMs handle it, but not all of 'em handle it well. Be sure to examine the queries sent to the database.
And while deconstructing the combined result set from a single query might be more difficult than handling a single result, it's probably a lot easier than trying to combine two result sets from two queries. And probably significantly faster if a lot of duplicates are involved.
I have a table containing 15+ million records in Oracle. It's a sort of log table which has a created_ts column of type "date". I have a simple "non-unique" index on the created_ts column.
I have a simple range query:
select * from table1 where created_ts >= ? and created_ts <= ?;
When I run this query from SQL*Plus, SQL Developer, etc. like this:
select * from table1
where created_ts >= TO_DATE( '2009-11-10 00:00:00', 'YYYY-MM-DD HH24:MI:SS')
and created_ts <= TO_DATE( '2009-11-10 23:59:59', 'YYYY-MM-DD HH24:MI:SS');
the query returns within 1-2 seconds at most.
But when I run the exact same query in Java over JDBC and set the corresponding "?" params using java.sql.Timestamp objects, the query takes a long time. Analyzing the Oracle process shows that it goes for a full table scan and doesn't use the index.
The JDBC driver I am using is ojdbc5 11.1.0.7.0.
Can anyone please help? How do I create the index correctly so that the query uses it?
My problem was resolved when I used "oracle.sql.DATE" objects to set the bind variables instead of "java.sql.Timestamp". The query used the index and executed within 1-2 seconds.
Thanks to all who replied and helped.
But it's problematic for me, as this solution is DB dependent, and my app receives the DB connection and the query as parameters and loads and processes data in a generic way. The DB connection can be to any RDBMS, like Oracle, MySQL, etc.
This is classic behaviour for an implicit datatype conversion. Because the database is having to convert the datatype of the column it cannot use any index on that column.
In your case I suspect this is due to your use of java.sql.Timestamp. Would it be possible to use the equivalent type from the Oracle datatypes package, oracle.sql.TIMESTAMP? Obviously that may have some knock-on effects but I think you should at least test it, to see whether that solves your problem.
The difference may be because of bind variables vs. literal values. You are not comparing the same things.
Try this in SQL*Plus:-
explain plan for
select * from table1 where created_ts >= :1 and created_ts <= :2;
set markup html preformat on
set linesize 100
set pagesize 0
select plan_table_output
from table(dbms_xplan.display('plan_table',null,'serial'));
This will show you the plan Oracle will pick when using bind variables. In this scenario, Oracle has to make up a plan before you have provided values for your date range. It does not know if you are selecting only a small fraction of the data or all of it. If this has the same plan (full scan?) as your plan from Java, at least you know what is happening.
Then, you could consider:-
Enabling bind peeking (but only after testing this does not cause anything else to go bad)
Carefully binding literal values from java in a way that does not allow SQL injection
Putting a hint in the statement to indicate it should use the index you want it to.
You should try a hint of the form /*+ INDEX(table_name index_name) */
My guess is that the optimizer is choosing a full table scan because it sees that as the best option in absence of knowing the bind values.