How to use Querydsl to construct complex predicate that involves multiple tables? - java

I am trying to utilize Querydsl to fetch some results from a table. So far, this is what I have tried -
Assume there are 5 entities named T1..T5. And I am trying to do this SQL query in Querydsl -
SELECT T1.*
FROM T1,T2,T3,T4,T5
WHERE T1.A=T2.A
AND T2.B=T5.B
AND T4.C=T2.C
AND T1.B=1234;
I tried the following, but the Hibernate query keeps running, and does not seem to end.
booleanBuilder.and(JPAExpressions.select(qT1).from(qT1, qT2, qT3, qT4, qT5)
    .where(
        qT1.a.eq(qT2.a)
        .and(qT2.b.eq(qT5.b))
        ... // and so on
    )
    .exists());
I am using the Repository that extends QuerydslPredicateExecutor and using findAll to execute this. The problem is that the query takes forever to run. And I am interested only in the first result that may appear.
So, where am I going wrong that is making the query execute forever?
Edit:
I opted to use the JPAQuery instead. And of course, the Hibernate query generated is the same. Here is my JPAQuery.
JPQLQuery jpqlQuery = new JPAQuery(entityManager);
jpqlQuery.select(qT1).from(qT1, qT2, qT3, qT4, qT5).where(booleanBuilder);
return jpqlQuery.fetch();
How do I incorporate the limit in the above JPAQuery so that only the first result is fetched?

The complexity is not in the predicate or in QueryDSL, but in the fact that you're executing it in a subquery that has to be executed for every row in the result. Depending on the total result set size, this may become increasingly expensive to compute. It is, however, equally complex in QueryDSL, Hibernate's HQL, JPA's JPQL or your database's SQL. So the SQL you're trying to generate will be just as slow.
You might succeed at optimising the query using a limit clause. Adding a limit clause to a query in QueryDSL is quite trivial: .limit(1). So then your query becomes:
JPQLQuery jpqlQuery = new JPAQuery(entityManager);
jpqlQuery.select(qT1).from(qT1, qT2, qT3, qT4, qT5).where(booleanBuilder);
jpqlQuery.limit(1);
return jpqlQuery.fetch();

Related

Optimize JPA dynamic count query

We have the typical method that returns a paginated result, using CriteriaBuilder and performing 2 queries:
one that counts the total number of results
and another one that gives us the subset for the specified page
We have noticed that JPA does not optimize the first query at all, because the generated SQL uses EXISTS (on Oracle).
Java code:
Root<Foo> from = criteriaQuery.from(Foo.class);
//... predicates
CriteriaQuery<Long> countQuery = criteriaBuilder.createQuery(Long.class)
.select(criteriaBuilder.countDistinct(from))
.where(predicates.toArray(new Predicate[predicates.size()]));
Long numberResults = entityManager.createQuery(countQuery).getSingleResult();
SQL generated query:
SELECT COUNT(t0.REFERENCE)
FROM foo t0
WHERE EXISTS (
SELECT t1.REFERENCE
FROM foo t1
WHERE ((((t0.REFERENCE = t1.REFERENCE) AND (t0.VERSION_NUM = t1.VERSION_NUM)) AND (t0.ISSUER = t1.ISSUER)) AND (t1.REFERENCE LIKE ? AND (t1.VERSION_STATUS = ?)))
);
How do I avoid using the exists? Is there something wrong with the java code?
For different reasons (this issue and this related article enumerate some of them), EclipseLink uses EXISTS in its implementation of the countDistinct operation.
Although I can agree with you, be aware that the performance offered by EXISTS in Oracle is in fact very dependent on the use case, and it doesn't have to be poor. Please consider reviewing this classic entry on Tom Kyte's blog.
So my advice is, please, keep using the generated code and the corresponding SQL.
If you need or want to use a different approach, a possibly more performant way of counting the records is to fetch the ids of the entities that match the provided predicates and count the results in memory, with Java (the actual performance depends mostly on those predicates). I mean:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
// I assume reference is String here
CriteriaQuery<String> query = cb.createQuery(String.class);
Root<Foo> root = query.from(Foo.class);
query
.select(root.get("reference"))
.distinct(true)
.where(predicates.toArray(new Predicate[predicates.size()]))
;
List<String> references = entityManager.createQuery(query).getResultList();
int count = references.size();
Although I would generally advise against it, if the amount of data is not large, you could even fetch the results once from the database and do the paging in memory with Java; it is straightforward using subList, for instance.
As a final note, AFAIK other JPA providers such as Hibernate implement count in a different way: if switching the JPA provider is an option, you could try using it instead.
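A minimal sketch of the in-memory paging mentioned above, assuming the full result list has already been loaded. The class name InMemoryPager and the method page are hypothetical helpers, not part of JPA or EclipseLink:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: slices an already-loaded result list into pages.
final class InMemoryPager {
    private InMemoryPager() {}

    /** Returns a copy of the zero-based page; empty when the page lies past the end. */
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow
        if (from >= all.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(from + pageSize, all.size());
        // subList returns a view; copy it so the page does not depend on the source list
        return new ArrayList<>(all.subList((int) from, to));
    }
}
```

With the references list from the snippet above, InMemoryPager.page(references, 0, 20) would give the first 20 entries, and references.size() is still the total count.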
With or without EXISTS, the query plans are identical. The only optimisation would be to return COUNT() and the result in the same query, easy to do in SQL with "OVER()". But mapping the Foo.class on a view and adding a transient column to contain the count will complicate a lot of other parts of the application, and mapping the result of paginated queries on a new CountedFoo.class will also complicate the solution.

JOOQ multiple select count in one connection with PostgreSQL

I have a table SUBSCRIPTION, and I want to run multiple selectCount queries written with JOOQ in one connection to the database, each with different predicates.
To do so, I have created a list of queries:
List<Query> countQueries = channels.stream().map(c ->
selectCount().from(SUBSCRIPTION)
.innerJoin(SENDER).on(SENDER.ID.equal(SUBSCRIPTION.SENDER_ID))
.innerJoin(CHANNEL).on(CHANNEL.ID.equal(SUBSCRIPTION.CHANNEL_ID))
.where(SENDER.CODE.equal(senderCode))
.and(CHANNEL.CODE.equal(c))
).collect(toList());
And finally, I have launched this list of queries using batch:
using(configuration).batch(countQueries).execute();
I have expected to have the results of the above queries in the return values of execute, but I get an array of integer filled with 0 values.
Is this the right way to run multiple selectCount using JOOQ?
What is the meaning of the integer array returned by the execute method?
I have checked this link, in the JOOQ blog, talking about "How to Calculate Multiple Aggregate Functions in a Single Query", but It's just about SQL queries, no JOOQ dialects.
Comments on your assumptions
I have expected to have the results of the above queries in the return values of execute, but I get an array of integer filled with 0 values.
The batch() API can only be used for DML queries (INSERT, UPDATE, DELETE), just like with native JDBC. I mean, you can run the queries as a batch, but you cannot fetch the results this way.
I have checked this link, in the JOOQ blog, talking about "How to Calculate Multiple Aggregate Functions in a Single Query", but It's just about SQL queries, no JOOQ dialects.
Plain SQL queries almost always translate quite literally to jOOQ, so you can apply the technique from that article also in your case. In fact, you should! Running so many queries is definitely not a good idea.
Translating that linked query to jOOQ
So, let's look at how to translate that plain SQL example from the link to your case:
Record record =
ctx.select(
channels.stream()
.map(c -> count().filterWhere(CHANNEL.CODE.equal(c)).as(c))
.collect(toList())
)
.from(SUBSCRIPTION)
.innerJoin(SENDER).on(SENDER.ID.equal(SUBSCRIPTION.SENDER_ID))
.innerJoin(CHANNEL).on(CHANNEL.ID.equal(SUBSCRIPTION.CHANNEL_ID))
.where(SENDER.CODE.equal(senderCode))
.and(CHANNEL.CODE.in(channels)) // Not strictly necessary, but might speed up things
.fetchOne();
This will produce a single record containing all the count values.
As always, this is assuming the following static import
import static org.jooq.impl.DSL.*;
Using classic GROUP BY
Of course, you can also just use a classic GROUP BY in your particular case. This might even be a bit faster:
Result<?> result =
ctx.select(CHANNEL.CODE, count())
.from(SUBSCRIPTION)
.innerJoin(SENDER).on(SENDER.ID.equal(SUBSCRIPTION.SENDER_ID))
.innerJoin(CHANNEL).on(CHANNEL.ID.equal(SUBSCRIPTION.CHANNEL_ID))
.where(SENDER.CODE.equal(senderCode))
.and(CHANNEL.CODE.in(channels)) // This time, you need to filter
.groupBy(CHANNEL.CODE)
.fetch();
This now produces a table with one count value per code. Alternatively, fetch this into a Map<String, Integer>:
Map<String, Integer> map =
ctx.select(CHANNEL.CODE, count())
.from(SUBSCRIPTION)
.innerJoin(SENDER).on(SENDER.ID.equal(SUBSCRIPTION.SENDER_ID))
.innerJoin(CHANNEL).on(CHANNEL.ID.equal(SUBSCRIPTION.CHANNEL_ID))
.where(SENDER.CODE.equal(senderCode))
.and(CHANNEL.CODE.in(channels))
.groupBy(CHANNEL.CODE)
.fetchMap(CHANNEL.CODE, count());
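One practical difference between the two query shapes above is what happens to a channel that has no matching subscriptions: a filtered aggregate still reports 0 for it, while GROUP BY simply produces no row for it. The following is only an in-memory model of that behaviour using plain Java streams, not jOOQ code. The class ChannelCounts and its methods are hypothetical, and the counts are Long here because Collectors.counting() produces Long:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical in-memory model of the two counting strategies.
final class ChannelCounts {
    private ChannelCounts() {}

    /** Models one filtered count per requested channel: every channel gets an entry, possibly 0. */
    static Map<String, Long> filteredCounts(List<String> rowCodes, List<String> channels) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String channel : channels) {
            counts.put(channel, rowCodes.stream().filter(channel::equals).count());
        }
        return counts;
    }

    /** Models GROUP BY with an IN filter: channels without rows are absent from the result. */
    static Map<String, Long> groupedCounts(List<String> rowCodes, List<String> channels) {
        return rowCodes.stream()
                .filter(channels::contains)
                .collect(Collectors.groupingBy(code -> code, Collectors.counting()));
    }
}
```

If the caller needs an entry for every requested channel, the GROUP BY result has to be completed with zeros on the client side.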

Use existing JPAQuery as subquery

I have an instance of JPAQuery<?> and need to retrieve the count. However, since the table may contain many items (millions), I want to limit the count to a given maximum, say 50,000.
The current QueryDSL-Code effectively does this:
query.fetchCount();
Now my desired modifications are quite trivial in raw sql:
select count(*) from (<whatever query> limit 50000);
However, I do not know how I would express this in querydsl. The following code is not correct, because .from() takes an entity path, but query is a query:
JPAExpressions.select(Wildcard.all)
.from(query.limit(50000))
.fetchCount();
I am using querydsl 4.
JPAExpressions.select(Wildcard.all) returns a child of SimpleQuery, which you can call limit on.
JPAExpressions.select(Wildcard.all)
.from(entity)
.limit(50000)
.fetchCount();

hibernate hql and setMaxResults

I've got some quarrels with Hibernate.
My query, although already optimized, is quite heavy. One of my optimizations consists of limiting the result set returned.
So with Hibernate I've used the method setMaxResults, but I hit the same problem described in this post:
Hibernate: Pagination with setFirstResult and setMaxResult
(the issue is that when using setMaxResults, Hibernate in some cases wraps the query like this:
select * from (your query) where rownum <= :rownum)
So, the solution in that case was to add an orderBy, but I have millions of records and an orderBy kills the execution time of the query.
I've managed to overcome the problem using createNativeQuery and passing the exact query I need (something like "my query where rownum <= :rownum" instead of "select * from (your query) where rownum <= :rownum", and goodbye portability), but honestly I don't get why Hibernate acts like this...
As the previous post suggests, Hibernate generates SQL like that when your query "is not stable" because, if I haven't misunderstood, the order of the records may not be the same between two executions, but I don't get how that method could solve this stability problem.
I am using the same kind of pagination in Hibernate. The HQL is given below; it may be useful for you.
(i) Initially you should use this query:
List<Object> Entity_Cls_Lst= Objclass.createQuery("from library where book_id>Book_ID order by book_id").list();
(ii) After scrolling, take the Book_ID of the last result and pass it to the query in the where condition:
List<Object> Entity_Cls_Lst= Objclass.createQuery("from library where book_id>Book_ID order by book_id").setMaxResults(MAX_RECORDS).list();
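The idea behind the two steps above (often called keyset pagination) can be sketched without a database. This is an in-memory stand-in only: a NavigableMap keyed by id plays the role of the table ordered by book_id, and tailMap plays the role of "where book_id > :lastId". The class KeysetPager and the method nextPage are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical in-memory model of "where id > lastSeenId order by id" plus a row limit.
final class KeysetPager {
    private KeysetPager() {}

    /**
     * Returns up to pageSize values whose key is strictly greater than lastSeenId,
     * in ascending key order. Pass null as lastSeenId to get the first page.
     */
    static <V> List<V> nextPage(NavigableMap<Long, V> rowsById, Long lastSeenId, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be > 0");
        }
        NavigableMap<Long, V> remaining =
                lastSeenId == null ? rowsById : rowsById.tailMap(lastSeenId, false);
        List<V> page = new ArrayList<>();
        for (V row : remaining.values()) {
            if (page.size() == pageSize) {
                break; // plays the role of setMaxResults
            }
            page.add(row);
        }
        return page;
    }
}
```

Because each page starts from the last id already seen, the ordering column is the key itself, and with an index on that column the database should not need to sort the whole table for every page.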

Hibernate Criteria API: get n random rows

I can't figure out how to fetch n random rows from a criteria instance:
Criteria criteria = session.createCriteria(Table.class);
criteria.add(Restrictions.eq("fieldVariable", anyValue));
...
Then what? I can't find any doc with Criteria API
Does it mean I should use HQL instead?
Thanx!
EDIT: I get the number of rows by:
int max = ((Number) criteria.setProjection(Projections.rowCount()).uniqueResult()).intValue();
How do I fetch n random rows with indexes between 0 and max?
Thx again!
Actually it is possible with Criteria and a little bit of tweaking. Here is how:
Criteria criteria = session.createCriteria(Table.class);
criteria.add(Restrictions.eq("fieldVariable", anyValue));
criteria.add(Restrictions.sqlRestriction("1=1 order by rand()"));
criteria.setMaxResults(5);
return criteria.list();
Any Restrictions.sqlRestriction is appended with the keyword 'and', so to neutralize its effect we add a dummy condition and inject our rand() function after it.
First of all, be aware that there is no standard way to do this in SQL; each database engine uses its own proprietary syntax [1]. With MySQL, the SQL statement to get 5 random rows would be:
SELECT column FROM table
ORDER BY RAND()
LIMIT 5
And you could write this query in HQL because the order by clause in HQL is passed through to the database so you can use any function.
String query = "SELECT e.attribute FROM MyEntity e ORDER BY RAND()";
Query q = em.createQuery(query);
q.setMaxResults(5);
However, unlike HQL, the Criteria API currently doesn't support ORDER BY Native SQL (see HHH-2381) and in the current state, you would have to subclass the Order class to implement this feature. This is doable, refer to the Jira issue, but not available out of the box.
So, if you really need this query, my recommendation would be to use HQL. Just keep in mind it won't be portable.
[1] Other readers might want to check the post SQL to Select a random row from a database table to see how to implement this with MySQL, PostgreSQL, Microsoft SQL Server, IBM DB2 and Oracle.
The Criteria API doesn't offer facilities for this. In MySQL however, you can use ORDER BY RAND() LIMIT n for this where n represents the number of random rows you'd like to fetch.
SELECT col1, col2, col3 FROM tbl ORDER BY RAND() LIMIT :n
You indeed need to execute it as HQL.
You cannot fetch random rows efficiently, sorry. Hibernate can only do what SQL does, and random row fetch simply is not part of any standard SQL implementation I know; to my knowledge it is not part of ANY SQL that I am aware of (anyone, please enlighten me).
And as Hibernate is an O/R mapper, and not a wonder machine, it can only do what the underlying database supports.
If you have a known field with ascending numbers and know the start and end, you can generate a random number on the client and ask for that row.
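A minimal sketch of that client-side step, assuming the row count max is already known (for example from the rowCount projection in the question). The class RandomOffsets and the method pick are hypothetical. Each picked number could then be used as the value of the ascending field, or as an offset via criteria.setFirstResult(offset).setMaxResults(1). Note that if the ascending field has gaps, looking rows up by value can miss.

```java
import java.util.Random;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: picks distinct random positions in the range [0, max).
final class RandomOffsets {
    private RandomOffsets() {}

    /** Returns min(n, max) distinct values, each between 0 (inclusive) and max (exclusive). */
    static SortedSet<Integer> pick(int n, int max, Random random) {
        if (n < 0 || max < 0) {
            throw new IllegalArgumentException("n and max must be >= 0");
        }
        int wanted = Math.min(n, max); // cannot pick more distinct values than exist
        SortedSet<Integer> offsets = new TreeSet<>();
        // Retry on duplicates; fine when n is small compared to max
        while (offsets.size() < wanted) {
            offsets.add(random.nextInt(max));
        }
        return offsets;
    }
}
```

This costs one query per picked row, so it only makes sense for small n.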
The answer by @PSV Bhat is difficult to apply if you are generating your Criteria dynamically. Here is a solution that extends Hibernate's Order class:
import org.hibernate.Criteria;
import org.hibernate.criterion.CriteriaQuery;
import org.hibernate.criterion.Order;
private void addOrderByToCriteria(Criteria criteria) {
    criteria.addOrder(Order.asc("foobar"));
    criteria.addOrder(ORDER_RANDOM);
}
private static final OrderRandom ORDER_RANDOM = new OrderRandom();
// Order subclass that emits the database's random function instead of a property name
private static class OrderRandom extends Order {
    public OrderRandom() {
        super("", false); // property name and direction are not used
    }
    @Override
    public String toSqlString(Criteria criteria, CriteriaQuery criteriaQuery) {
        return "RANDOM()"; // or RAND() or whatever this is in your dialect
    }
}
