Hibernate native query performs badly and I'm struggling to convert it to JPQL - java

I wrote this SQL query and am using it as a native query in Hibernate.
@Query(
value = "SELECT DISTINCT tp.* FROM TWITTER_POST AS tp " +
"JOIN TWITTER_LIST AS tl " +
"ON tl.owner_id = ?1 " +
"JOIN REL_TWITTER_LIST__ACCOUNTS_TRACKED_BY_LIST AS atbl " +
"ON tl.id = atbl.twitter_list_id " +
"JOIN TWITTER_ACCOUNT AS ta " +
"ON ta.id = atbl.accounts_tracked_by_list_id " +
"LEFT OUTER JOIN REL_TWITTER_POST__TWITTER_USERS_HIDING_POST uhp " +
"ON tp.id = uhp.twitter_post_id " +
"AND uhp.TWITTER_USERS_HIDING_POST_ID = ?1 " +
"WHERE uhp.twitter_post_id is NULL AND ta.id = tp.author_id",
countQuery = "SELECT DISTINCT count(tp.*) FROM TWITTER_POST AS tp " +
"JOIN TWITTER_LIST AS tl " +
"ON tl.owner_id = ?1 " +
"JOIN REL_TWITTER_LIST__ACCOUNTS_TRACKED_BY_LIST AS atbl " +
"ON tl.id = atbl.twitter_list_id " +
"JOIN TWITTER_ACCOUNT AS ta " +
"ON ta.id = atbl.accounts_tracked_by_list_id " +
"LEFT OUTER JOIN REL_TWITTER_POST__TWITTER_USERS_HIDING_POST uhp " +
"ON tp.id = uhp.twitter_post_id " +
"AND uhp.TWITTER_USERS_HIDING_POST_ID = ?1 " +
"WHERE uhp.twitter_post_id is NULL AND ta.id = tp.author_id",
nativeQuery = true
)
Page<TwitterPost> findAllNonHiddenForListsFromTwitterAccountId(Long twitterAccountId, Pageable pageable);
I noticed that the query executes very slowly when I run it through Hibernate as opposed to a SQL tool. I assumed this was because I am using a native query rather than JPQL, which (from what I read) handles caching and pagination out of the box without requiring a separate "count" query. My attempt to convert it to JPQL failed, because I cannot find a good tutorial on more complicated JPQL queries across join tables.
@Query(
value = "SELECT DISTINCT twitterPost " +
"FROM TwitterPost twitterPost " +
"JOIN TwitterList twitterList " +
"ON twitterList.owner.id = ?1 " +
"JOIN TwitterAccount tweetAuthorFromList " +
"ON tweetAuthorFromList IN twitterList.accountsTrackedByLists " +
"WHERE twitterPost.author = tweetAuthorFromList " +
"AND twitterList.owner NOT IN twitterPost.twitterUsersHidingPosts"
)
Page<TwitterPost> findAllNonHiddenPostsFromListsForTwitterAccountId(Long twitterAccountId, Pageable pageable);
Apparently my syntax is off:
org.hibernate.exception.SQLGrammarException: could not prepare statement
but the error only points at problems in the generated SQL, not in the JPQL, so I'm left in the dark.
I also checked for typical causes of bad performance, such as eager fetching of entities, which I have set to lazy everywhere.
Any help, either on whether my assumptions about the performance problem are correct or on converting the query, is highly appreciated. Thanks in advance!

There are many things that are wrong here:
Using SELECT DISTINCT with JOIN indicates that you should have used a Semi Join instead.
The ON tl.owner_id = ?1 is done for filtering, not for the projection, hence you are better off doing an EXISTS query.
Guessing why the query runs slowly instead of profiling it. The reason it runs faster in the DB tool is that DB tools usually truncate the result set, while Spring Data consumes the entire result set. Or, if you ran EXPLAIN, the output might have come from the optimizer without the query being executed at all.
So, here's what you can do:
Use Semi Joins instead of Joins for filtering.
Use Blaze Persistence to write better entity queries dynamically.
Configure Statement Caching at the JDBC Driver level.
Use the slow query log to log the execution plan when the query is slower than N seconds.
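As a rough sketch of the semi-join suggestion, the native query from the question could be rewritten with EXISTS and NOT EXISTS, which makes DISTINCT unnecessary. The class and constant names below are hypothetical; the table and column names are taken from the question. The rewrite assumes that every accounts_tracked_by_list_id references an existing TWITTER_ACCOUNT row, so the join to TWITTER_ACCOUNT can be dropped.

```java
/** Hypothetical holder for the rewritten queries; the names are illustrative only. */
public final class NonHiddenPostQueries {

    private NonHiddenPostQueries() {
    }

    // Shared filter: the post's author is tracked by one of the user's lists (semi join)
    // and the user has not hidden the post (anti join).
    private static final String FILTER =
            "WHERE EXISTS ("
            + " SELECT 1 FROM TWITTER_LIST tl"
            + " JOIN REL_TWITTER_LIST__ACCOUNTS_TRACKED_BY_LIST atbl"
            + " ON tl.id = atbl.twitter_list_id"
            + " WHERE tl.owner_id = ?1"
            + " AND atbl.accounts_tracked_by_list_id = tp.author_id) "
            + "AND NOT EXISTS ("
            + " SELECT 1 FROM REL_TWITTER_POST__TWITTER_USERS_HIDING_POST uhp"
            + " WHERE uhp.twitter_post_id = tp.id"
            + " AND uhp.twitter_users_hiding_post_id = ?1)";

    // Each post appears at most once, so neither query needs DISTINCT.
    public static final String SELECT_POSTS = "SELECT tp.* FROM TWITTER_POST tp " + FILTER;
    public static final String COUNT_POSTS = "SELECT count(*) FROM TWITTER_POST tp " + FILTER;

    public static void main(String[] args) {
        System.out.println(SELECT_POSTS);
        System.out.println(COUNT_POSTS);
    }
}
```

Because both strings are compile-time constants, they could be referenced from the annotation as value = NonHiddenPostQueries.SELECT_POSTS and countQuery = NonHiddenPostQueries.COUNT_POSTS with nativeQuery = true. Whether this is actually faster should still be confirmed with the execution plan.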


Spring JPA Query is not recognizing Spatial Types

I'm trying to run a native query through the Spring Data JPA @Query annotation using spatial data types.
The query works perfectly when executed through the console or directly in the database, but it fails when executed through Spring.
Does anyone know what I'm doing wrong? Or is there a better way to achieve the same result (keeping all the calculations on the DB side)?
Thank you in advance.
This is the query I'm trying to execute. Again, it works through the console but fails when executed through a Spring Boot request:
@Query(value = "SELECT TOP 1 * FROM Vehicles v " +
"JOIN Bikelots l ON l.BikeLotId = v.BikeLotId " +
"JOIN BikeTypes b ON b.BikeTypeId = l.BikeTypeId " +
"WHERE b.BikeTypeId = ?1 " +
"ORDER BY " +
"Geography::STGeomFromText(v.Point.MakeValid().STAsText(),4326) " +
".STDistance(Geography::STGeomFromText(Geometry::Point(?2,?3,4326).MakeValid().STAsText(),4326)) "
, nativeQuery = true)
com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near 'Geography:'.
I'm using this dialect in application.properties:
spring.jpa.database-platform=org.hibernate.spatial.dialect.sqlserver.SqlServer2008SpatialDialect
The error occurs because Hibernate treats the character ":" as the start of a named parameter.
Placing the query in a String constant, adding "\\" before each ":" and assigning that constant to the value of @Query solved the problem. See the code below for an example:
String query = "SELECT TOP 1 * FROM Vehicles v " +
"JOIN Bikelots l ON l.BikeLotId = v.BikeLotId " +
"JOIN BikeTypes b ON b.BikeTypeId = l.BikeTypeId " +
"WHERE b.BikeTypeId = ?1 " +
"ORDER BY " +
"Geography\\:\\:STGeomFromText(v.Point.MakeValid().STAsText(),4326) " +
".STDistance(Geography\\:\\:STGeomFromText(Geometry\\:\\:Point(?2,?3,4326).MakeValid().STAsText(),4326)) ";
@Query(value = query, nativeQuery = true)

Querying two tables in a database with Java

I have been defeated by the great SQL boss and am now requesting assistance.
I've removed spaces in table names to avoid confusion.
Anyway, I have two tables, Orders and Order Details. I need to query a few columns from both. So far, I can query Orders just fine, but when it comes to querying Order Details, or the two together, I get errors.
My question is this: how do I query two tables?
(note: semicolon is at the bottom, imagine it's there)
Here's what works so far on Orders:
String queryString = "select `Order Date`, Freight "
+ "from Orders "
+ "where Orders.`Order ID` = ? "
Here's my attempt to just grab one column from Order Details and the error to follow
String queryString = "select Product "
+ "from `Order Details` "
+ "where `Order Details`.`Order ID` = ? "
net.ucanaccess.jdbc.UcanaccessSQLException: UCAExc:::4.0.1 user lacks privilege or object not found: ORDER DETAILS.ORDER ID
at net.ucanaccess.jdbc.UcanaccessConnection.prepareStatement(UcanaccessConnection.java:528)
Here's my attempt to grab both at once and the error to follow
String queryString = "select `Order Date`, Freight, Product "
+ "from Orders, `Order Details` "
+ "where Orders.`Order ID` = ? "
net.ucanaccess.jdbc.UcanaccessSQLException: UCAExc:::4.0.1 user lacks privilege or object not found: PRODUCT
at net.ucanaccess.jdbc.UcanaccessConnection.prepareStatement(UcanaccessConnection.java:528)
Here's the above attempt with an extra line at the bottom combining them (I don't know what this does), but it alters the error.
String queryString = "select `Order Date`, Freight, Product "
+ "from Orders, `Order Details` "
+ "where Orders.`Order ID` = ? "
+ "and Orders.`Order ID` = `Order Details`.`Order ID`"
net.ucanaccess.jdbc.UcanaccessSQLException: UCAExc:::4.0.1 user lacks privilege or object not found: ORDER DETAILS.ORDER ID
at net.ucanaccess.jdbc.UcanaccessConnection.prepareStatement(UcanaccessConnection.java:528)
Putting the table name in quotes doesn't work in any SQL Server I know of without changing some configuration.
The correct way is to use []:
MSSQL example:
SELECT * FROM [Order Details]
Your query may look like this:
String queryString = "SELECT Product "
+ "FROM [Order Details] "
+ "WHERE [Order ID] = ? ";
But I would suggest avoiding whitespace in any identifier.
Read about JOIN statements; they allow you to work with two tables. Try something like "SELECT your_columns FROM orders o JOIN orderDetails od ON o.id = od.order_id".
An error like "object not found" means that the table or column you referenced does not exist under that name. Hope this helps.
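To make the JOIN suggestion concrete, here is a minimal sketch that builds the two-table query with bracket-quoted identifiers. The class and method names are hypothetical, and the column names are taken from the question. The sketch assumes that [Order Details] really contains columns named [Order ID] and Product, which the "object not found" errors put in doubt, so check the actual column names in the database first.

```java
/** Hypothetical helper that builds the Orders / Order Details query. */
public final class OrderQueries {

    private OrderQueries() {
    }

    /** Wraps an identifier in square brackets so that embedded spaces are accepted. */
    static String quote(String identifier) {
        if (identifier.contains("[") || identifier.contains("]")) {
            throw new IllegalArgumentException("Unexpected bracket in identifier: " + identifier);
        }
        return "[" + identifier + "]";
    }

    /** Builds a query that reads columns from both tables for one order id. */
    static String orderWithDetailsQuery() {
        String orderId = quote("Order ID");
        return "SELECT o." + quote("Order Date") + ", o.Freight, od.Product"
                + " FROM " + quote("Orders") + " o"
                + " INNER JOIN " + quote("Order Details") + " od"
                + " ON o." + orderId + " = od." + orderId   // ties each detail row to its order
                + " WHERE o." + orderId + " = ?";            // order id is bound as a parameter
    }

    public static void main(String[] args) {
        System.out.println(orderWithDetailsQuery());
    }
}
```

The resulting string can be passed to prepareStatement as in the question, with the order id bound through the matching setter for the column type. The explicit ON condition is what the extra "and Orders.`Order ID` = `Order Details`.`Order ID`" line in the last attempt was doing: without it, every order row is paired with every detail row.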

SELECT MAX of a group except one value

I am working on Spark SQL and I am trying to get the records using following queries:
/*Select all open tasks which are not unscheduled*/
Dataset<Row> scheduledOpenTasks = sqlContext.sql(
"SELECT * "
+ "FROM OpenTaskTable "
+ "WHERE due_date < cast('" + unscheduledDate + "' as timestamp)");
scheduledOpenTasks.createOrReplaceTempView("ScheduledOpenTaskTable");
/*Select scheduled tasks with max due_date for each csg_order_id*/
Dataset<Row> scheduledTasks = sqlContext.sql(
"SELECT TS1.* from ScheduledOpenTaskTable AS TS1 "
+ "INNER JOIN "
+ " (SELECT csg_order_id, MAX(due_date) AS MaxDD"
+ " FROM ScheduledOpenTaskTable"
+ " GROUP BY csg_order_id) AS TS2 "
+ "ON TS1.csg_order_id = TS2.csg_order_id AND TS1.due_date = TS2.MaxDD");
The unscheduled_date has the value 4444-12-30.
In the OpenTaskTable, each csg_order_id can have multiple due_dates including unscheduled_date. I need the csg_order_ids with corresponding highest due_dates except unscheduled_date.
Now, with first query, I am removing all the records which have due_date as unscheduled_date. In second query, I am retrieving all the records with max due_date for each csg_order_id.
Now comes the problem: is there any way to combine these queries as one?
Well, after struggling for a while, finally found a way to combine the above two queries like this:
sqlContext.sql("SELECT OT1.* from OpenTaskTable AS OT1 INNER JOIN "
+ "(SELECT OT2.csg_order_id, MAX(OT2.due_date) AS MaxDD FROM "
+ "(SELECT csg_order_id, due_date from OpenTaskTable WHERE due_date < cast('"+unscheduledDate+"' as timestamp)) AS OT2 "
+ "GROUP BY OT2.csg_order_id) AS OT3 "
+ "ON OT1.csg_order_id = OT3.csg_order_id AND OT1.due_date = OT3.MaxDD");
Explanation:
Previously, in the first query, I was retrieving data from OpenTaskTable and then feeding it to the second query. Logically, in the second query also, I am just applying more filters over the retrieved data. At the end, we are trying to get all the attributes from OpenTaskTable only.
So, for this solution I simply used the first query, as the innermost query, and then selected MAX over the records grouped by csg_order_id. And, for the outermost query, just performed an inner join to get all matching csg_order_id records from OpenTaskTable.
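The filter-then-group logic of the combined query can be checked on a small in-memory sample. The sketch below is plain Java, not Spark, and returns only the maximum date per order rather than the full rows. The Task class and the sample values are made up for illustration; only csg_order_id, due_date and the 4444-12-30 sentinel come from the question.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public final class MaxDueDatePerOrder {

    /** Hypothetical stand-in for one row of OpenTaskTable. */
    static final class Task {
        final String csgOrderId;
        final LocalDate dueDate;

        Task(String csgOrderId, LocalDate dueDate) {
            this.csgOrderId = csgOrderId;
            this.dueDate = dueDate;
        }
    }

    /** Returns the latest due date per order, ignoring dates at or after the sentinel. */
    static Map<String, LocalDate> latestScheduled(List<Task> tasks, LocalDate unscheduledDate) {
        return tasks.stream()
                .filter(t -> t.dueDate.isBefore(unscheduledDate))  // WHERE due_date < sentinel
                .collect(Collectors.toMap(
                        t -> t.csgOrderId,                         // GROUP BY csg_order_id
                        t -> t.dueDate,
                        (a, b) -> a.isAfter(b) ? a : b));          // MAX(due_date)
    }

    public static void main(String[] args) {
        LocalDate unscheduled = LocalDate.of(4444, 12, 30);
        List<Task> tasks = Arrays.asList(
                new Task("A", LocalDate.of(2020, 1, 10)),
                new Task("A", LocalDate.of(2020, 3, 5)),
                new Task("A", unscheduled),
                new Task("B", unscheduled));
        System.out.println(latestScheduled(tasks, unscheduled));
    }
}
```

Note that an order whose only due date is the sentinel (order "B" in the sample) drops out of the result entirely, which is also what the inner join in the SQL version does.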

JPQL Create "Dynamic" Query to execute in repository

Edit-
I'll add the use case to clear up the function of this.
The user will select two dates - a start date and an end date - these are then passed on and used to select the tables (each year has its own table). In one use case where the two given dates lie in the same year it's a simple query on that table alone.
However, if the two dates fall in different years I will need to combine all the tables in between (so 2011-2013 means three tables to search through), and so I want a dynamic solution. I know building up a query like the one below is insecure; I just thought something similar would work. As the system gets a new table each year, I also don't want to manually add a new query for each case (2011-2016, 2014-2018, 2011-2019, etc.).
I have a question about whether it is possible to create a dynamic query as a String like below and then pass that through to service -> repository, and use that as a query?
for (int i = 0; i < yearCondition; i++) {
if (i == 0) {
query += "SELECT md.Device_ID, l.locationRef, " + reportRunForm.getStartDate() + " as 'From Date', "
+ reportRunForm.getEndDate() + " as 'To Date' "
+ "from mData.meterdata" + iDateStart.substring(0, 4)
+ " join MOL2.meters m on device_ID = m.meterUI "
+ "join MOL2.locations l on m.locationID = l.locationID "
+ "join MOL2.meterreg mr on m.meterID = mr.meterID "
+ "where mr.userID = ?1";
}
query += "UNION SELECT md.Device_ID, l.locationRef, " + reportRunForm.getStartDate() + " as 'From Date', "
+ reportRunForm.getEndDate() + " as 'To Date' "
+ "from mData.meterdata" + (Integer.parseInt(iDateStart.substring(0, 4))+i)
+ " join MOL2.meters m on device_ID = m.meterUI "
+ "join MOL2.locations l on m.locationID = l.locationID "
+ "join MOL2.meterreg mr on m.meterID = mr.meterID "
+ "where mr.userID = ?1";
}
I may have the wrong idea of how this works, and I know I could create and run a query through the EntityManager, but I wanted to know whether doing it through the repository would be possible.
My thought was to build up the query as above, pass it through the service to the repository, and bind it as the value of the @Query annotation, but this doesn't seem possible. I'm likely approaching this wrong, so any advice would be appreciated.
Thanks.
Edit -
Had a goof. I understand that doing it like that is a bad idea; what I'm looking for is a way to build up something similar that is still secure.
Maybe this annotation on your POJO can help:
@org.hibernate.annotations.Entity(dynamicInsert = true)
For example, take two tables, district and constituency.
Dynamic query:
query += "select * from constituency c where 1=1";
if (constituencyNumber != null)
// Note: concatenating constituencyNumber is open to SQL injection; prefer a bound parameter.
query += " and c.constituency_number like '" + constituencyNumber + "%'";
query += " group by c.district_id";
OR
select * from constituency c where (c.constituency_number is null or c.constituency_number like '1%') group by c.district_id;
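For the original use case (one table per year), string building can be kept reasonably safe because the only dynamic part of the SQL text is the year, which can be validated as an integer, while everything that comes from the user is bound as a parameter. The sketch below is hypothetical: the class name, method name, year bounds, parameter numbering and the fromDate/toDate column aliases are assumptions. The table and column names come from the question, and the md alias is added to the yearly table because the question's query uses it without defining it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical builder for a UNION over the yearly meterdata tables. */
public final class YearlyMeterQuery {

    // Assumed bounds for the yearly tables; adjust to the tables that really exist.
    private static final int FIRST_YEAR = 2000;
    private static final int LAST_YEAR = 2100;

    private YearlyMeterQuery() {
    }

    /** Builds one SELECT per year in [startYear, endYear] and joins them with UNION. */
    static String build(int startYear, int endYear) {
        if (startYear < FIRST_YEAR || endYear > LAST_YEAR || startYear > endYear) {
            throw new IllegalArgumentException("Invalid year range: " + startYear + "-" + endYear);
        }
        List<String> selects = new ArrayList<>();
        for (int year = startYear; year <= endYear; year++) {
            // Only the validated integer year is concatenated into the SQL text;
            // user id and dates stay as positional parameters ?1, ?2 and ?3.
            selects.add("SELECT md.Device_ID, l.locationRef, ?2 AS fromDate, ?3 AS toDate"
                    + " FROM mData.meterdata" + year + " md"
                    + " JOIN MOL2.meters m ON md.Device_ID = m.meterUI"
                    + " JOIN MOL2.locations l ON m.locationID = l.locationID"
                    + " JOIN MOL2.meterreg mr ON m.meterID = mr.meterID"
                    + " WHERE mr.userID = ?1");
        }
        return String.join(" UNION ", selects);
    }

    public static void main(String[] args) {
        System.out.println(build(2011, 2013));
    }
}
```

A string built at runtime cannot be used in the repository annotation, because annotation values must be compile-time constants. It would instead be run from a custom repository method through EntityManager.createNativeQuery(sql), with setParameter(1, userId), setParameter(2, startDate) and setParameter(3, endDate).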

SQL Query Will Not Execute SQLite, Java

Problem Synopsis:
When attempting to execute a SQL query in Java with a SQLite Database, the SQL statement fails to return from the execute() or executeQuery() method. In other words, the system "hangs" when executing this SQL statement.
Question:
What am I doing wrong to explain why the ResultSet never "returns?"
Troubleshooting
I tried to narrow the problem and the problem seems to be with the Java execute() or executeQuery(). A ResultSet never seems to return. For example, I tried executing exactly the same query directly in SQLite (that is, using a SQLite DB manager). The query (outside Java) executes in about 5ms and returns the valid result set.
NOTE: No exception is thrown. The system merely seems to "hang" and becomes unresponsive until a manual kill. (waiting more than 10 minutes.)
Code:
I heavily edited this code to make the problem simpler to see. (In production, this uses Prepared Statements. But, the error occurs in both methods--straight Statement and prepared Statement versions.)
Basically, the SELECT returns a single DB item so the user can review that item.
Statement st = conn.createStatement() ;
ResultSet rs = st.executeQuery("SELECT DISTINCT d1.id, d1.sourcefullfilepath, " +
"d1.sourcefilepath, d1.sourcefilename, d1.classificationid, d1.classid, " +
"d1.userid FROM MatterDataset, (SELECT MatterDataset.id, " +
"MatterDataset.sourcefullfilepath, MatterDataset.sourcefilepath, " +
"MatterDataset.sourcefilename, MatterDataset.matterid , " +
"DocumentClassification.classificationid, DocumentClassification.classid," +
" DocumentClassification.userid FROM MatterDataset " +
"LEFT JOIN DocumentClassification ON " +
"DocumentClassification.documentid = Matterdataset.id " +
"WHERE ( DocumentClassification.classid = 1 OR " +
"DocumentClassification.classid = 2 ) AND " +
"DocumentClassification.userid < 0 AND " +
"MatterDataset.matterid = \'100\' ) AS d1 " +
"LEFT JOIN PrivilegeLog ON " +
"d1.id = PrivilegeLog.documentparentid AND " +
"d1.matterid = PrivilegeLog.matterid " +
"WHERE PrivilegeLog.privilegelogitemid IS NULL " +
"AND MatterDataset.matterid = \'100\' " +
"ORDER BY d1.id LIMIT 1 ;") ;
Configuration:
Java 6,
JDBC Driver = Xerial sqlite-jdbc-3.7.2,
SQLite 3,
Windows
Update
Minor revision: as I continue to work with this, adding a MIN(d1.id) to the beginning of the SQL statement at least returns a ResultSet (rather than "hanging"). But, this is not really what I wanted as the MIN obviates the LIMIT function.
Statement st = conn.createStatement() ;
ResultSet rs = st.executeQuery("SELECT DISTINCT MIN(d1.id), d1.id, d1.sourcefullfilepath, " +
"d1.sourcefilepath, d1.sourcefilename, d1.classificationid, d1.classid, " +
"d1.userid FROM MatterDataset, (SELECT MatterDataset.id, " +
"MatterDataset.sourcefullfilepath, MatterDataset.sourcefilepath, " +
"MatterDataset.sourcefilename, MatterDataset.matterid , " +
"DocumentClassification.classificationid, DocumentClassification.classid," +
" DocumentClassification.userid FROM MatterDataset " +
"LEFT JOIN DocumentClassification ON " +
"DocumentClassification.documentid = Matterdataset.id " +
"WHERE ( DocumentClassification.classid = 1 OR " +
"DocumentClassification.classid = 2 ) AND " +
"DocumentClassification.userid < 0 AND " +
"MatterDataset.matterid = \'100\' ) AS d1 " +
"LEFT JOIN PrivilegeLog ON " +
"d1.id = PrivilegeLog.documentparentid AND " +
"d1.matterid = PrivilegeLog.matterid " +
"WHERE PrivilegeLog.privilegelogitemid IS NULL " +
"AND MatterDataset.matterid = \'100\' " +
"ORDER BY d1.id LIMIT 1 ;") ;
What a messy SQL statement (sorry)! I don't know SQLite, but why not simplify to:
SELECT DISTINCT md.id, md.sourcefullfilepath, md.sourcefilepath, md.sourcefilename,
dc.classificationid, dc.classid, dc.userid
FROM MatterDataset md
LEFT JOIN DocumentClassification dc
ON dc.documentid = md.id
AND (dc.classid = 1 OR dc.classid = 2 )
AND dc.userid < 0
LEFT JOIN PrivilegeLog pl
ON md.id = pl.documentparentid
AND md.matterid = pl.matterid
WHERE pl.privilegelogitemid IS NULL
AND md.matterid = '100'
ORDER BY md.id LIMIT 1 ;
I was uncertain whether you wanted a LEFT JOIN or an INNER JOIN to DocumentClassification (using LEFT JOIN and then putting requirements on classid and userid in the WHERE clause is, in my opinion, contradictory). If a DocumentClassification row has to exist, change to INNER JOIN and move the conditions on classid and userid into the WHERE clause; if it may or may not exist in your result set, keep the query as suggested above.
I went back and started over. The SQL syntax, while it worked outside Java, simply seemed too complex for the JDBC driver. This cleaned-up revision seems to work:
SELECT DISTINCT
MatterDataset.id, MatterDataset.sourcefullfilepath, MatterDataset.sourcefilepath,
MatterDataset.sourcefilename
FROM MatterDataset , DocumentClassification
ON DocumentClassification.documentid = MatterDataset.id
AND MatterDataset.matterid = DocumentClassification.matterid
LEFT JOIN PrivilegeLog ON MatterDataset.id = PrivilegeLog.documentparentid
AND MatterDataset.matterid = PrivilegeLog.matterid
WHERE PrivilegeLog.privilegelogitemid IS NULL
AND MatterDataset.matterid = '100'
AND (DocumentClassification.classid = 1 OR DocumentClassification.classid = 2)
AND DocumentClassification.userid = -99
ORDER BY MatterDataset.id LIMIT 1;
A nice lesson in: just because you can in SQL doesn't mean you should.
What this statement essentially does is locate items in the MatterDataset table that are NOT in the PrivilegeLog table. The LEFT JOIN and IS NULL syntax locates the items that are "missing". That is, I want to find items that are in MatterDataset but not yet in PrivilegeLog and return those items.
