How to use SUM inside COALESCE in JOOQ - java

Given below is a gist of the query, which I'm able to run successfully in MySQL
SELECT a.*,
COALESCE(SUM(condition1 or condition2), 0) as countColumn
FROM table a
-- left joins with multiple tables
GROUP BY a.id;
Now, I'm trying to use it with JOOQ.
ctx.select(a.asterisk(),
coalesce(sum("How to get this ?")).as("columnCount"))
.from(a)
.leftJoin(b).on(someCondition)
.leftJoin(c).on(someCondition))
.leftJoin(d).on(someCondition)
.leftJoin(e).on(someCondition)
.groupBy(a.ID);
I'm having a hard time preparing the coalesce() part, and would really appreciate some help.

jOOQ's API is more strict about the distinction between Condition and Field<Boolean>, which means you cannot simply treat booleans as numbers as you can in MySQL. It's usually not a bad idea to be explicit about data types to prevent edge cases, so this strictness isn't necessarly a bad thing.
So, you can transform your booleans to integers as follows:
coalesce(
sum(
when(condition1.or(condition2), inline(1))
.else_(inline(0))
),
inline(0)
)
But even better than that, why not use a standard SQL FILTER clause, which can be emulated in MySQL using a COUNT(CASE ...) aggregate function:
count().filterWhere(condition1.or(condition2))

Related

Add limit and offset to query that was created from a String

I have query as String like
select name from employee
and want to limit the number of rows with limit and offset.
Is this possible with jOOQ and how do I do that?
Something like:
dsl.fetch("select name from employee").limit(10).offset(10);
Yes you're close, but you cannot use fetch(sql), because that eagerly executes the query and it will be too late to append LIMIT and OFFSET. I generally don't recommend the approach offered by Sergei Petunin, because that way, you will tell the RDBMS less information about what you're going to do. The execution plan and resource allocations are likely going to be better if you actually use LIMIT and OFFSET.
There are two ways to do what you want to achieve:
Use the parser
You can use DSLContext.parser() to parse your SQL query and then modify the resulting SelectQuery, or create a derived table from that. Creating a derived table is probably a bit cleaner:
dsl.selectFrom(dsl.parser().parse("select name from employee"))
.limit(10)
.offset(10)
.fetch();
The drawback is that the parser will have to understand your SQL string. Some vendor specific features will no longer be available.
The advantage (starting from jOOQ 3.13) is that you will be able to provide your generated code with attached converters and data type bindings this way, as jOOQ will "know" what the columns are.
Use plain SQL
You were already using plain SQL, but the wrong way. Instead of fetching the data eagerly, just wrap your query in DSL.table() and then use the same approach as above.
When using plain SQL, you will have to make sure manually, that the resulting SQL is syntactically correct. This includes wrapping your query in parentheses, and possibly aliasing it, depending on the dialect you're using:
dsl.selectFrom(table("(select name from employee)").as("t"))
.limit(10)
.offset(10)
.fetch();
The best thing you can do with a string query is to create a ResultQuery from it. It allows you to limit the maximum amount of rows fetched by the underlying java.sql.Statement:
create.resultQuery("select name from employee").maxRows(10).fetch();
or to fetch lazily and then scroll through the cursor:
create.resultQuery("select name from employee").fetchLazy().fetch(10);
Adding an offset or a limit to a query is only possible using a SelectQuery, but I don't think there's any way to transform a string query to a SelectQuery in JOOQ.
Actually, if you store SQL queries as strings in the database, then you are already in a non-typesafe area, and might as well append OFFSET x LIMIT y directly to a string-based query. Depending on the complexity of your queries, it might work.

Can I build a jooq query into an offset?

I'm trying to build something like the following query using the jooq api.
select x.*
from x
offset greatest(0, (select count(*) - 1 from x));
by
select(x.fields()).from(x)
.offset(param(greatest(val(0), select(count().sub(1)).from(x).field(0, Integer.class))))
I'm pretty sure I'm using the offset(Param<Integer>) method incorrectly. It seems to be rendering null for the offset. Is building up offsets like this something that jooq can do? (It seems like the offset method is a bit restricted in what it can do, compared to the rest of the jooq api.)
(I know this query without context seems inefficient, but it's actually what I want to be doing.)
Thanks!
I don't think any database allows you to put a non-constant expression in their OFFSET and LIMIT clauses (it is possible in PostgreSQL, see dsmith's comments). In any case, jOOQ doesn't allow you to do it. You must provide either a constant int value, or a bind variable (a Param).
But you don't really need that feature in your case anyway. Your hypothetical syntax ...
select x.*
from x
offset greatest(0, (select count(*) - 1 from x));
Is equivalent to this:
select x.*
from x
order by <implicit ordering> desc
limit 1;
After all, your query seems to be looking for the last row (by some implicit ordering), so why not just make that explicit?

Better to query once, then organize objects based on returned column value, or query twice with different conditions?

I have a table which I need to query, then organize the returned objects into two different lists based on a column value. I can either query the table once, retrieving the column by which I would differentiate the objects and arrange them by looping through the result set, or I can query twice with two different conditions and avoid the sorting process. Which method is generally better practice?
MY_TABLE
NAME AGE TYPE
John 25 A
Sarah 30 B
Rick 22 A
Susan 43 B
Either SELECT * FROM MY_TABLE, then sort in code based on returned types, or
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'A' followed by
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'B'
Logically, a DB query from a Java code will be more expensive than a loop within the code because querying the DB involves several steps such as connecting to DB, creating the SQL query, firing the query and getting the results back.
Besides, something can go wrong between firing the first and second query.
With an optimized single query and looping with the code, you can save a lot of time than firing two queries.
In your case, you can sort in the query itself if it helps:
SELECT * FROM MY_TABLE ORDER BY TYPE
In future if there are more types added to your table, you need not fire an additional query to retrieve it.
It is heavily dependant on the context. If each list is really huge, I would let the database to the hard part of the job with 2 queries. At the opposite, in a web application using a farm of application servers and a central database I would use one single query.
For the general use case, IMHO, I will save database resource because it is a current point of congestion and use only only query.
The only objective argument I can find is that the splitting of the list occurs in memory with a hyper simple algorithm and in a single JVM, where each query requires a bit of initialization and may involve disk access or loading of index pages.
In general, one query performs better.
Also, with issuing two queries you can potentially get inconsistent results (which may be fixed with higher transaction isolation level though ).
In any case I believe you still need to iterate through resultset (either directly or by using framework's methods that return collections).
From the database point of view, you optimally have exactly one statement that fetches exactly everything you need and nothing else. Therefore, your first option is better. But don't generalize that answer in way that makes you query more data than needed. It's a common mistake for beginners to select all rows from a table (no where clause) and do the filtering in code instead of letting the database do its job.
It also depends on your dataset volume, for instance if you have a large data set, doing a select * without any condition might take some time, but if you have an index on your 'TYPE' column, then adding a where clause will reduce the time taken to execute the query. If you are dealing with a small data set, then doing a select * followed with your logic in the java code is a better approach
There are four main bottlenecks involved in querying a database.
The query itself - how long the query takes to execute on the server depends on indexes, table sizes etc.
The data volume of the results - there could be hundreds of columns or huge fields and all this data must be serialised and transported across the network to your client.
The processing of the data - java must walk the query results gathering the data it wants.
Maintaining the query - it takes manpower to maintain queries, simple ones cost little but complex ones can be a nightmare.
By careful consideration it should be possible to work out a balance between all four of these factors - it is unlikely that you will get the right answer without doing so.
You can query by two conditions:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B'
This will do both for you at once, and if you want them sorted, you could do the same, but just add an order by keyword:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B' ORDER BY TYPE ASC
This will sort the results by type, in ascending order.
EDIT:
I didn't notice that originally you wanted two different lists. In that case, you could just do this query, and then find the index where the type changes from 'A' to 'B' and copy the data into two arrays.

Hibernate getting position of a row in a result set

I need to get an equivalent to this SQL that can be run using Hibernate. It doesn't work as is due to special characters like #.
SELECT place from (select #curRow := #curRow + 1 AS place, time, id FROM `testing`.`competitor` JOIN (SELECT #curRow := 0) r order by time) competitorList where competitorList.id=4;
My application is managing results of running competitions. The above query is selecting for a specific competitor, it's place based on his/her overall time.
For simplicity I'll only list the COMPETITOR table structure (only the relevant fields). My actual query involves a few joins, but they are not relevant for the question:
CREATE TABLE competitor {
id INT,
name VARCHAR,
time INT
}
Note that competitors are not already ordered by time, thus, the ID cannot be used as rank. As well, it is possible to have two competitors with the same overall time.
Any idea how I could make this work with Hibernate?
Hard to tell without a schema, but you may be able to use something like
SELECT COUNT(*) FROM testing ts
WHERE ts.score < $obj.score
where I am using the $ to stand for whatever Hibernate notation you need to refer to the live object.
I couldn't find any way to do this, so I had to change the way I'm calculating the position. I'm now taking the top results and am creating the ladder in Java, rather than in the SQL query.

Is it possible to use GROUP BY with bind variables?

I want to issue a query like the following
select max(col1), f(:1, col2) from t group by f(:1, col2)
where :1 is a bind variable. Using PreparedStatement, if I say
connection.prepareStatement
("select max(col1), f(?, col2) from t group by f(?, col2)")
I get an error from the DBMS complaining that f(?, col2) is not a GROUP BY expression.
How does one normally solve this in JDBC?
I suggest re-writing the statement so that there is only one bind argument.
This approach is kind of ugly, but returns the result set:
select max(col1)
, f_col2
from (
select col1
, f(? ,col2) as f_col2
from t
)
group
by f_col2
This re-written statement has a reference to only a single bind argument, so now the DBMS sees the expressions in the GROUP BY clause and the SELECT list are identical.
HTH
[EDIT]
(I wish there were a prettier way, this is why I prefer the named bind argument approach that Oracle uses. With the Perl DBI driver, positional arguments are converted to named arguments in the statement actually sent to Oracle.)
I didn't see the problem at first, I didn't understand the original question. (Apparently, several other people missed it too.) But after running some test cases, it dawned on me what the problem was, what the question was working.
Let me see if I can state the problem: how to get two separate (positional) bind arguments to be treated (by the DBMS) as if it were two references to the same (named) bind argument.
The DBMS is expecting the expression in the GROUP BY to match the expression in the SELECT list. But the two expressions are considered DIFFERENT even when the expressions are identical, when the only difference is that each expression references a different bind variable. (We can demonstrate some test cases that at least some DBMS will allow, but there are more general cases that will raise an exception.)
At this point the short answer is, that's got me stumped. The suggestion I have (which may not be an actual answer to the original question) is to restructure the query.
[/EDIT]
I can provide more details if this approach doesn't work, or if you have some other problem figuring it out. Or if there's a problem with performance (I can see the optimizer choosing a different plan for the re-written query, even though it returns the specified result set. For further testing, we'd really need to know what DBMS, what driver, statistics, etc.)
EDIT (eight and a half years later)
Another attempt at a query rewrite. Again, the only solution I come up with is a query with one bind placeholder. This time, we stick it into an inline view that returns a single row, and join that to t. I can see what it's doing; I'm not sure how the Oracle optimizer will see this. We may want (or need) to do an explicit conversion e.g. TO_NUMBER(?) AS param, TO_DATE(?,'...') AS param, TO_CHAR(?) AS param, depending on the datatype of the bind parameter, and the datatype we want to be returned as from the view.)
This is how I would do it in MySQL. The original query in my answer does the join operation inside the inline view (MySQL derived table). And we want to avoid materializing a hughjass derived table if we can avoid it. Then again, MySQL would probably let the original query slide as long as sql_mode doesn't include ONLY_FULL_GROUP_BY. MySQL would also let us drop the FROM DUAL)
SELECT MAX(t.col1)
, f( v.param ,t.col2)
FROM t
CROSS
JOIN ( SELECT ? AS param FROM DUAL) v
GROUP
BY f( v.param ,t.col2)
According to the answer from MadusankaD, within the past eight years, Oracle has added support for reusing the same named bind parameters in the JDBC driver, and retaining equivalence. (I haven't tested that, but if that works now, then great.)
Even though you have issued a query through JDBC driver(using PreparedStatement) like this:
select max(col1), f(:1, col2) from t group by f(:1, col2)
At last JDBC driver replaces these like below query before parsing to the database , even though you have used the same binding variable name in the both places.
select max(col1), f(*:1*, col2) from t group by f(*:2*, col2)
But in oracle this will not be recognized as a valid group by clause.
And also normal JDBC driver doesn't support named bind variables.
For that you can use OraclePreparedStatement class for you connection. That means it is oracle JDBC. Then you can use named bind variables. It will solve your issue.
Starting from Oracle Database 10g JDBC drivers, bind by name is supported using the setXXXAtName methods.
http://docs.oracle.com/cd/E24693_01/java.11203/e16548/apxref.htm#autoId20
Did you try using ? rather than the named bind variables? As well, which driver are you using? I tried this trivial example using the thin driver, and it seemed to work fine:
PreparedStatement ps = con.prepareStatement("SELECT COUNT(*), TO_CHAR(SYSDATE, ?) FROM DUAL GROUP BY TO_CHAR(SYSDATE, ?)");
ps.setString(1, "YYYY");
ps.setString(2, "YYYY");
ps.executeQuery();
In the second case, there are actually two variables - you will need to send them both with the same value.

Categories