JOOQ slow code generation

Are there any parameters that can turn on/off execution of the following query during jOOQ code generation?
SELECT "SYS"."ALL_OBJECTS"."OWNER",
       "SYS"."ALL_OBJECTS"."OBJECT_NAME",
       "SYS"."ALL_OBJECTS"."OBJECT_ID",
       "SYS"."ALL_PROCEDURES"."AGGREGATE"
FROM "SYS"."ALL_OBJECTS"
LEFT OUTER JOIN "SYS"."ALL_PROCEDURES"
  ON ("SYS"."ALL_OBJECTS"."OWNER" = "SYS"."ALL_PROCEDURES"."OWNER"
  AND "SYS"."ALL_OBJECTS"."OBJECT_NAME" = "SYS"."ALL_PROCEDURES"."OBJECT_NAME")
WHERE (UPPER("SYS"."ALL_OBJECTS"."OWNER") IN ('MYSCHEMA')
  AND "SYS"."ALL_OBJECTS"."OBJECT_TYPE" IN ('FUNCTION', 'PROCEDURE'))
ORDER BY "SYS"."ALL_OBJECTS"."OWNER" ASC,
         "SYS"."ALL_OBJECTS"."OBJECT_NAME" ASC,
         "SYS"."ALL_OBJECTS"."OBJECT_ID" ASC
On a database with a large number of schemas and objects, it took about one hour to execute.

One major issue with the query run by jOOQ is the UPPER(OWNER) expression. This was introduced with jOOQ 2.4 (#1418) to prevent misconfigurations where users accidentally use lower case schema names. The feature was based on an erroneous assumption that case-sensitive users are impossible. They are certainly possible (even if rare), so #1418 was wrong. I've created two issues for this problem:
#5989: Fix the performance issue by avoiding functions on the OWNER column
#5990: Re-enact case-sensitive schema names
In the meantime, you have some possible workarounds:
Pre jOOQ 3.8
You can always override the JavaGenerator from jooq-codegen and re-implement some methods including generatePackages() and generateRoutines() to be empty. This way, the relevant code will not be executed at all.
Of course, this means you won't get any generated packages and routines.
Post jOOQ 3.8
There is a new configuration option that lets you do the same as above through configuration:
<configuration>
  <generator>
    <database>
      <includeRoutines>false</includeRoutines>
      <includePackages>false</includePackages>
      ...
See also:
https://www.jooq.org/doc/latest/manual/code-generation/codegen-advanced/codegen-config-include-object-types

Related

How to use SUM inside COALESCE in JOOQ

Given below is a gist of the query, which I'm able to run successfully in MySQL
SELECT a.*,
COALESCE(SUM(condition1 or condition2), 0) as countColumn
FROM table a
-- left joins with multiple tables
GROUP BY a.id;
Now, I'm trying to use it with JOOQ.
ctx.select(a.asterisk(),
           coalesce(sum("How to get this ?")).as("columnCount"))
   .from(a)
   .leftJoin(b).on(someCondition)
   .leftJoin(c).on(someCondition)
   .leftJoin(d).on(someCondition)
   .leftJoin(e).on(someCondition)
   .groupBy(a.ID);
I'm having a hard time preparing the coalesce() part, and would really appreciate some help.

jOOQ's API is stricter about the distinction between Condition and Field<Boolean>, which means you cannot simply treat booleans as numbers as you can in MySQL. It's usually not a bad idea to be explicit about data types to prevent edge cases, so this strictness isn't necessarily a bad thing.
So, you can transform your booleans to integers as follows:
coalesce(
    sum(
        when(condition1.or(condition2), inline(1))
            .else_(inline(0))
    ),
    inline(0)
)
But even better than that, why not use a standard SQL FILTER clause, which can be emulated in MySQL using a COUNT(CASE ...) aggregate function:
count().filterWhere(condition1.or(condition2))

The Performance Consequences of Parameterizing Constants in PreparedStatement Queries

When using JDBC's PreparedStatements to query Oracle, consider this:
String qry1 = "SELECT col1 FROM table1 WHERE rownum=? AND col2=?";
String qry2 = "SELECT col1 FROM table1 WHERE rownum=1 AND col2=?";
String qry3 = "SELECT col1 FROM table1 WHERE rownum=1 AND col2=" + someVariable ;
The logic dictates that the value of rownum is always a constant (1 in this example), while the value of col2 varies.
Question 1: Are there any Oracle server performance advantages (query compilation, caching, etc.) to using qry1 where rownum value is parameterized, over qry2 where rownum's constant value is hardcoded?
Question 2: Ignoring non-performance considerations (such as SQL Injections, readability, etc.), are there any Oracle server performance advantages (query compilation, caching, etc.) to using qry2 over qry3 (in which the value of col2 is explicitly appended, not parameterized).

Answer 1: There are no performance advantages to using qry1 (a softcoded query) over qry2 (a query with reasonable bind variables).
Bind variables improve performance by reducing query parsing; if the bind variable is a constant there is no extra parsing to avoid.
(There are probably some weird examples where adding extra bind variables improves the performance of one specific query. Like with any forecasting program, occasionally if you feed bad information to the Oracle optimizer the result will be better. But it's important to understand that those are exceptional cases.)
Answer 2: There are many performance advantages to using qry2 (a query with reasonable bind variables) over qry3 (a hardcoded query).
Bind variables allow Oracle to reuse a lot of the work that goes into query parsing (query compilation). For example, for each query Oracle needs to check that the user has access to view the relevant tables. With bind variables that work only needs to be done once for all executions of the query.
Bind variables also allow Oracle to use some extra optimization tricks that only occur after the Nth run. For example, Oracle can use cardinality feedback to improve the second execution of a query. When Oracle makes a mistake in a plan, for example if it estimates a join will produce 1 row when it really produces 1 million, it can sometimes record that mistake and use that information to improve the next run. Without bind variables the query text of the next run will be different, and Oracle won't be able to fix that mistake.
Bind variables also allow for many different plan management features. Sometimes a DBA needs to change an execution plan without changing the text of the query. Features like SQL plan baselines, profiles, outlines, and DBMS_ADVANCED_REWRITE will not work if the query text is constantly changing.
On the other hand, there are a few reasonable cases where it's better to hard-code the queries. Occasionally an Oracle feature like partition pruning cannot understand the expression and it helps to hardcode the value. For large data warehouse queries the extra time to parse a query may be worth it if the query is going to run for a long time anyway.
(Caching is unlikely to affect either scenario. Result caching of a statement is rare, it's much more likely that Oracle will cache only the blocks of the tables used in the statement. The buffer cache probably does not care if those blocks are accessed by one statement many times or by many statements one time)
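To make the parsing argument concrete, here is a minimal sketch in plain Java. It is a toy model, not Oracle's actual shared pool: it assumes only that parsed statements are looked up by their exact SQL text, and it counts how many distinct texts each style produces. The class and method names (ParseCacheSketch, runWithBind, runWithConcatenation) are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a statement cache keyed by SQL text (an assumption for
// illustration; it does not reproduce Oracle's real parsing logic).
class ParseCacheSketch {
    private final Map<String, Integer> executionsByText = new HashMap<>();
    private int hardParses = 0;

    // "Executes" a statement: a text seen for the first time costs one hard parse.
    void execute(String sqlText) {
        if (!executionsByText.containsKey(sqlText)) {
            hardParses++;
        }
        executionsByText.merge(sqlText, 1, Integer::sum);
    }

    int hardParses() {
        return hardParses;
    }

    // qry2 style: the text never changes, the value would travel as a bind.
    static int runWithBind(int executions) {
        ParseCacheSketch cache = new ParseCacheSketch();
        String qry2 = "SELECT col1 FROM table1 WHERE rownum=1 AND col2=?";
        for (int i = 0; i < executions; i++) {
            cache.execute(qry2); // the bind value i is not part of the text
        }
        return cache.hardParses();
    }

    // qry3 style: the value is concatenated, so every distinct value is a new text.
    static int runWithConcatenation(int executions) {
        ParseCacheSketch cache = new ParseCacheSketch();
        for (int i = 0; i < executions; i++) {
            cache.execute("SELECT col1 FROM table1 WHERE rownum=1 AND col2=" + i);
        }
        return cache.hardParses();
    }
}
```

In this model, 1000 executions in the qry2 style are parsed once, while 1000 executions in the qry3 style with distinct values are parsed 1000 times. Turning rownum=1 into a second bind (qry1) would not change the count for qry2, which matches Answer 1.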

Is it possible to disable jpa hints per particular query?

JPA, and in my particular case EclipseLink, generates /*+ FIRST_ROWS */ when query.setFirstResult()/query.setMaxResults() is used:
SELECT * FROM (
SELECT /*+ FIRST_ROWS */ a.*, ROWNUM rnum FROM (
SELECT * FROM TABLES INCLUDING JOINS, ORDERING, etc.) a
WHERE ROWNUM <= 10 )
WHERE rnum > 0;
That forces Oracle to use nested loops instead of hash joins. In general this makes sense, but in my particular case it dramatically decreases performance.
Is it possible to disable hint usage/generation for a particular query?

As @ibre5041 said, the FIRST_ROWS hint is deprecated; in Oracle, FIRST_ROWS(N) should be used instead. In my case neither FIRST_ROWS nor FIRST_ROWS(N) is actually needed, so to stop EclipseLink from generating the outdated hint, it's possible to specify the Oracle version in persistence.xml:
<property name="eclipselink.target-database" value="org.eclipse.persistence.platform.database.oracle.Oracle11Platform" />
After adding this, I got a strange error:
Could not initialize class org.eclipse.persistence.platform.database.oracle.Oracle11Platform
However, after I put ojdbcN.jar into domain/lib/ext, the error went away.
As a result, EclipseLink generates the query without the FIRST_ROWS hint, and Oracle uses a better plan.

QueryDSL Left Join with additional conditions in ON

Is it possible to do the following query in QueryDSL?
SELECT p.*
FROM parts_table p LEFT JOIN inventory_balance_table i ON
(p.part_no = i.part_no
AND i.month = MONTH(CURRENT_DATE)
AND i.year = YEAR(CURRENT_DATE));
Inventory balance stores inventory data for every part number/month/year; I need only the data for the current year and month.
I've gotten the basic left join down:
QPartsTable qParts = QPartsTable.partsTable;
QInventoryBalance qBalance = QInventoryBalance.inventoryBalance;
JPAQuery q = new JPAQuery(em);
q.from(qParts).leftJoin(qParts.inventoryBalance, qBalance);
q.where(...);
List<Part> list = q.list(qParts);
which makes the correct sql, but only joining on the part number.
The resulting parts are checked for stock availability (among other things). The left join is necessary, because I still need parts that don't have an inventory entry yet (new parts for instance). Left join will get those without a matching inventory balance, but adding month = MONTH(CURRENT_DATE) and so on to where clause of the query removes the rows without an inventory balance (because they don't have year/month data).
For the same reason, @Where and @Filter would remove those parts from the resulting parts list and are not applicable. Sadly, @Filter and @Where are the only other results I'm getting with a search in Google and here on SO. (Oddly, the filter doesn't even affect the query, even if filters are enabled in the session...)
The simplest solution would be my original question: How to turn the above SQL into QueryDSL? In general, is it possible to add more and/or custom conditions to the ON clause of the left join? What are the alternative solutions to this problem?
Thanks in advance!
Update - A follow-up question and an observation: (Perhaps this should be a new question entirely?)
After looking through the docs, it seems the older blogs demonstrating querydsl had the on() function for leftJoin's. Why is this no longer the case?
SQLQuery (or HibernateSQLQuery or some other variety) has the on() function, but leftJoin() accepts RelationalPath<T>, not an EntityPath<T> as JPAQuery does. It seems impossible to cast QClasses to a RelationalPath, so that's probably not the way to go...
Update 2 - We're using 2.9.0. Using on() gives an error, like it doesn't exist...

It is possible to use on() in QueryDSL, including the latest version. JPAQuery also supports the on() predicate.
So this can be achieved as follows:
QPartsTable qParts = QPartsTable.partsTable;
QInventoryBalance qBalance = QInventoryBalance.inventoryBalance;
JPAQuery q = new JPAQuery(em);
q.from(qParts)
 .leftJoin(qParts.inventoryBalance, qBalance)
 .on(qBalance.month.eq(yourMonth).and(qBalance.year.eq(yourYear)))
 .list(qParts);
JPAQuery implements the JPQLCommonQuery interface, so like the other query types it has all the necessary methods.
The documentation for the latest QueryDSL version includes a left join example that uses on().
Update:
on() was introduced in QueryDSL 3.0.0, so it is not available in earlier versions.
I'd suggest upgrading to at least 3.0.0, as the API is considerably stronger than in older versions. Better still, I'd strongly advise upgrading to the latest stable version (3.6.2); there shouldn't be any problems, as the new API supports everything the old one did, with additional features.
Update 2:
As @Cezille07 mentioned in the comments, older versions offer with() as an alternative to on(). As the issue shows, with() was later replaced by on().
So for older versions, with() does the trick. Here is a useful link with more details.
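To illustrate why the month/year conditions belong in on() (or with()) rather than in where(), here is a small in-memory model of left join semantics in plain Java (16+ for records). It is only a sketch: Part, Balance and both methods are hypothetical stand-ins, not QueryDSL or Hibernate API, and it returns part numbers instead of full rows.

```java
import java.util.ArrayList;
import java.util.List;

class LeftJoinSketch {
    // Hypothetical in-memory stand-ins for the two tables.
    record Part(String partNo) {}
    record Balance(String partNo, int month, int year) {}

    // Period filter inside the ON clause: a part without a matching
    // balance for that period is still emitted once (balance side NULL).
    static List<String> filterInOn(List<Part> parts, List<Balance> balances,
                                   int month, int year) {
        List<String> result = new ArrayList<>();
        for (Part p : parts) {
            boolean matched = false;
            for (Balance b : balances) {
                if (b.partNo().equals(p.partNo())
                        && b.month() == month && b.year() == year) {
                    result.add(p.partNo());
                    matched = true;
                }
            }
            if (!matched) {
                result.add(p.partNo()); // null-extended row is kept
            }
        }
        return result;
    }

    // Period filter in WHERE: join on part number only, then filter the
    // joined rows. A null-extended row has NULL month/year, so the
    // comparison is never true and the part disappears.
    static List<String> filterInWhere(List<Part> parts, List<Balance> balances,
                                      int month, int year) {
        List<String> result = new ArrayList<>();
        for (Part p : parts) {
            for (Balance b : balances) {
                if (b.partNo().equals(p.partNo())
                        && b.month() == month && b.year() == year) {
                    result.add(p.partNo());
                }
            }
        }
        return result;
    }
}
```

With parts P1, P2, P3, where only P1 has a balance for the requested period, filterInOn keeps all three parts while filterInWhere keeps only P1. That is the behaviour described in the question.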

Hibernate getting position of a row in a result set

I need to get an equivalent to this SQL that can be run using Hibernate. It doesn't work as is due to special characters like @.
SELECT place FROM (SELECT @curRow := @curRow + 1 AS place, time, id FROM `testing`.`competitor` JOIN (SELECT @curRow := 0) r ORDER BY time) competitorList WHERE competitorList.id=4;
My application manages the results of running competitions. The above query selects, for a specific competitor, their place based on overall time.
For simplicity I'll only list the COMPETITOR table structure (only the relevant fields). My actual query involves a few joins, but they are not relevant for the question:
CREATE TABLE competitor (
  id INT,
  name VARCHAR,
  time INT
);
Note that competitors are not already ordered by time, thus, the ID cannot be used as rank. As well, it is possible to have two competitors with the same overall time.
Any idea how I could make this work with Hibernate?

Hard to tell without a schema, but you may be able to use something like
SELECT COUNT(*) FROM testing ts
WHERE ts.score < $obj.score
where I am using the $ to stand for whatever Hibernate notation you need to refer to the live object.

I couldn't find any way to do this, so I had to change the way I'm calculating the position. I'm now taking the top results and am creating the ladder in Java, rather than in the SQL query.
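For reference, building the ladder in Java could look roughly like the sketch below. It assumes the competitors have already been loaded into a list (for example by a plain Hibernate query); the Competitor record and the placeOf method are hypothetical names for this illustration and need Java 16+. Unlike the row-counter SQL in the question, competitors with the same time share a place here, which is an assumption about the desired tie handling.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LadderSketch {
    // Hypothetical stand-in for the mapped entity; only id and time matter here.
    record Competitor(int id, String name, int time) {}

    // Returns the 1-based place of the competitor with the given id,
    // or -1 if no such competitor exists. Equal times share a place,
    // so the ladder can read 1, 2, 2, 4.
    static int placeOf(List<Competitor> competitors, int id) {
        List<Competitor> sorted = new ArrayList<>(competitors);
        sorted.sort(Comparator.comparingInt(Competitor::time));
        int place = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // Advance the place only when the time differs from the previous row.
            if (i == 0 || sorted.get(i).time() != sorted.get(i - 1).time()) {
                place = i + 1;
            }
            if (sorted.get(i).id() == id) {
                return place;
            }
        }
        return -1;
    }
}
```

The shared place equals the number of competitors with a strictly lower time plus one, which is the same idea as the COUNT(*) query suggested in the other answer.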
