JPA strange behavior when using SELECT - java

I am new to Java and am developing a Swing app for a library, using a generated JPA controller.
When I select results from a SQL Server database, I use this code:
CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<BookTitles> cq = criteriaBuilder.createQuery(BookTitles.class);
cq.select(cq.from(BookTitles.class)).where(criteriaBuilder.isNull(cq.from(BookTitles.class).get("status")));
This query, however, returns each row about 9 times. For example, if the table has 10 rows, the result is a list of 90 elements in which those 10 rows repeat.
I then changed the code to
CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<BookTitles> cq = criteriaBuilder.createQuery(BookTitles.class);
Root<BookTitles> root = cq.from(BookTitles.class);
cq.select(root).where(criteriaBuilder.isNull(root.get("status")));
and the results are now the same as the rows in the db.
The only difference between the two versions is that instead of calling cq.from(...) directly inside select() and again inside where(), I store the result of cq.from(...) in a variable and use it in both places.
Personally, I do not see any difference between these two ways of coding, but the results say otherwise.
Can someone take the time to explain?

It's not strange behavior.
By calling cq.from(...) twice, you add two roots to the FROM clause, and the query returns their Cartesian product.
As the documentation says:
https://docs.oracle.com/javaee/7/api/javax/persistence/criteria/AbstractQuery.html#from-java.lang.Class-
"Create and add a query root corresponding to the given entity, forming a cartesian product with any existing roots."
So the second version is the correct one: store the root returned by from() in a variable and reuse it, instead of adding another root to the FROM clause by calling the CriteriaQuery from() method again.
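To see why two roots multiply the rows, here is a minimal plain-Java sketch of what the two versions amount to. It does not use JPA at all: the Book class, its fields, and the sample data (10 rows, 9 of them with a null status) are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CartesianRootsDemo {

    // Hypothetical stand-in for the BookTitles entity.
    static final class Book {
        final int id;
        final String status;

        Book(int id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    // Assumed sample table: 10 rows, only the first one has a status.
    static List<Book> sampleTable() {
        List<Book> rows = new ArrayList<>();
        for (int id = 1; id <= 10; id++) {
            rows.add(new Book(id, id == 1 ? "LENT" : null));
        }
        return rows;
    }

    // Two roots: the selected root and the filtered root are independent,
    // so every row is paired with every row that passes the filter.
    static List<Book> twoRoots(List<Book> table) {
        List<Book> result = new ArrayList<>();
        for (Book selected : table) {        // root used in select()
            for (Book filtered : table) {    // second root used in where()
                if (filtered.status == null) {
                    result.add(selected);
                }
            }
        }
        return result;
    }

    // One root: the same row is both filtered and selected.
    static List<Book> oneRoot(List<Book> table) {
        List<Book> result = new ArrayList<>();
        for (Book row : table) {
            if (row.status == null) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Book> table = sampleTable();
        System.out.println("two roots: " + twoRoots(table).size());
        System.out.println("one root:  " + oneRoot(table).size());
    }
}
```

With this assumed data the two-root version yields 10 x 9 = 90 elements, which would match the "10 rows repeated around 9 times" symptom if 9 of the rows had a null status.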

Related

Optimize JPA dynamic count query

We have a typical method that returns a paginated result using CriteriaBuilder and performs 2 queries:
one that counts the total number of results
and another one that gives us the subset for the specified page
We have noticed that JPA does not optimize the first query at all, because it generates an EXISTS subquery (on Oracle).
Java code:
Root<Foo> from = criteriaQuery.from(Foo.class);
//... predicates
CriteriaQuery<Long> countQuery = criteriaBuilder.createQuery(Long.class)
.select(criteriaBuilder.countDistinct(from))
.where(predicates.toArray(new Predicate[predicates.size()]));
Long numberResults = entityManager.createQuery(countQuery).getSingleResult();
SQL generated query:
SELECT COUNT(t0.REFERENCE)
FROM foo t0
WHERE EXISTS (
SELECT t1.REFERENCE
FROM foo t1
WHERE ((((t0.REFERENCE = t1.REFERENCE) AND (t0.VERSION_NUM = t1.VERSION_NUM)) AND (t0.ISSUER = t1.ISSUER)) AND (t1.REFERENCE LIKE ? AND (t1.VERSION_STATUS = ?)))
);
How do I avoid using the exists? Is there something wrong with the java code?
For various reasons (this issue and this related article enumerate some of them), EclipseLink uses EXISTS in its implementation of countDistinct.
Although I can agree with you, be aware that the performance of EXISTS in Oracle depends heavily on the use case, and it does not have to be poor. Please consider reviewing this classic entry in Tom Kyte's blog.
So my advice is, please, keep using the generated code and the corresponding SQL.
If you need or want a different approach, a perhaps more performant way of counting the records could be to fetch the ids of the entities that match the provided predicates (the actual performance depends mostly on these predicates) and count the results in memory, with Java. I mean:
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
// I assume reference is String here
CriteriaQuery<String> query = cb.createQuery(String.class);
Root<Foo> root = query.from(Foo.class);
query
.select(root.get("reference"))
.distinct(true)
.where(predicates.toArray(new Predicate[predicates.size()]))
;
List<String> references = entityManager.createQuery(query).getResultList();
int count = references.size();
Although I think it is generally not advisable, if the amount of data is not large, you could even fetch the results once from the database and do the paging in memory with Java; it is straightforward using subList, for instance.
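A minimal sketch of the in-memory paging mentioned above, assuming the full result list has already been fetched once. The class and method names are hypothetical, and the zero-based page index is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class InMemoryPaging {

    // Returns the zero-based page `pageIndex` of `all`, or an empty list
    // when the page starts past the end. The upper bound is clamped so
    // subList never throws IndexOutOfBoundsException.
    static <T> List<T> page(List<T> all, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        long from = (long) pageIndex * pageSize;
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, all.size());
        // Copy the range, because subList is only a view of the original list.
        return new ArrayList<>(all.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<Integer> all = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        System.out.println(page(all, 1, 4));
        System.out.println(page(all, 2, 4));
    }
}
```

The total count then comes for free as the size of the fetched list, at the cost of holding every matching row in memory.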
As a final note, AFAIK other JPA providers such as Hibernate implement count in a different way: if switching the JPA provider is an option, you could try using one of them instead.
With or without EXISTS, the query plans are identical. The only optimisation would be to return COUNT() and the result in the same query, easy to do in SQL with "OVER()". But mapping the Foo.class on a view and adding a transient column to contain the count will complicate a lot of other parts of the application, and mapping the result of paginated queries on a new CountedFoo.class will also complicate the solution.

How to use Querydsl to construct complex predicate that involves multiple tables?

I am trying to utilize Querydsl to fetch some results from a table. So far, this is what I have tried -
Assume there are 5 entities named T1..T5. And I am trying to do this SQL query in Querydsl -
SELECT T1.*
FROM T1,T2,T3,T4,T5
WHERE T1.A=T2.A
AND T2.B=T5.B
AND T4.C=T2.C
AND T1.B=1234;
I tried the following, but the Hibernate query keeps running, and does not seem to end.
booleanBuilder.and(JPAExpressions.select(qT1).from(qT1,qT2,qT3,qT4,qT5)
.where(
qT1.a.eq(qT2.a)
.and(qT1.a.eq(qT2.a))
... // and so on
.exists());
I am using the Repository that extends QuerydslPredicateExecutor and using findAll to execute this. The problem is that the query takes forever to run. And I am interested only in the first result that may appear.
So, where am I going wrong that is making the query execute forever?
Edit:
I opted to use the JPAQuery instead. And of course, the Hibernate query generated is the same. Here is my JPAQuery.
JPQLQuery jpqlQuery = new JPAQuery(entityManager);
jpqlQuery.select(qT1).from(qT1, qT2, qT3, qT4, qT5).where(booleanBuilder);
return jpqlQuery.fetch();
How do I incorporate the limit in the above JPAQuery so that only the first result is fetched?
The complexity is not in the predicate or in QueryDSL, but in the fact that you're executing it in a subquery that has to be executed for every row in the result. Depending on the total result set size, this may become increasingly expensive to compute. It is, however, equally complex in QueryDSL, Hibernate's HQL, JPA's JPQL, or your database's SQL. So the SQL you're trying to generate will be just as slow.
You might succeed at optimising the query using a limit clause. Adding a limit clause to query in QueryDSL is quite trivial: .limit(1). So then your query becomes:
JPQLQuery jpqlQuery = new JPAQuery(entityManager);
jpqlQuery.select(qT1).from(qT1, qT2, qT3, qT4, qT5).where(booleanBuilder);
jpqlQuery.limit(1);
return jpqlQuery.fetch();

CLOB and CriteriaQuery

I have an entity that has a CLOB attribute:
public class EntityS {
...
@Lob
private String description;
}
To retrieve certain EntityS from the DB we use a CriteriaQuery where we need the results to be unique, so we do:
query.where(builder.and(predicates.toArray(new Predicate[predicates.size()]))).distinct(true).orderBy(builder.asc(root.<Long> get(EntityS_.id)));
If we do that we get the following error:
ORA-00932: inconsistent datatypes: expected - got CLOB
I know that's because you cannot use distinct when selecting a CLOB. But we need the CLOB. Is there a workaround for this using CriteriaQuery with Predicates and so on?
We are using an ugly workaround, getting rid of the .distinct(true) and then filtering the results, but that's crap. We are using it only to be able to keep on developing the app, but we need a better solution and I can't seem to find one...
In case you are using Hibernate as persistence provider, you can specify the following query hint:
query.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false);
This way, "distinct" is not passed through to the SQL command, but Hibernate will take care of returning only distinct values.
See here for more information: https://thoughts-on-java.org/hibernate-tips-apply-distinct-to-jpql-but-not-sql-query/
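Conceptually, with the hint set to false the duplicates are removed on the Java side instead of by the database. The following plain-Java sketch illustrates that idea only and is not Hibernate's actual implementation: the EntityS stand-in, its fields, and the distinctById helper are assumptions. It keeps the first occurrence of each id and preserves the order of the incoming rows, so an ORDER BY from the query survives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DistinctInMemory {

    // Hypothetical stand-in for the EntityS entity with its CLOB attribute.
    static final class EntityS {
        final long id;
        final String description;

        EntityS(long id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    // Keeps the first row seen for each id; LinkedHashMap preserves
    // the order in which the ids were first encountered.
    static List<EntityS> distinctById(List<EntityS> rows) {
        Map<Long, EntityS> firstById = new LinkedHashMap<>();
        for (EntityS row : rows) {
            firstById.putIfAbsent(row.id, row);
        }
        return new ArrayList<>(firstById.values());
    }

    public static void main(String[] args) {
        List<EntityS> rows = Arrays.asList(
                new EntityS(1, "a"), new EntityS(1, "a"),
                new EntityS(2, "b"), new EntityS(3, "c"), new EntityS(3, "c"));
        System.out.println(distinctById(rows).size());
    }
}
```

Because the comparison never touches the CLOB column in SQL, the ORA-00932 restriction on DISTINCT does not come into play.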
Thinking outside the box - I have no idea if this will work, but perhaps it is worth a shot. (I tested it and it seems to work, but I created a table with just one column, CLOB data type, and two rows, both with the value to_clob('abcd') - of course it should work on that setup.)
To de-duplicate, compute a hash of each clob, and instruct Oracle to compute a row number partitioned by the hash value and ordered by nothing (null). Then select just the rows where the row number is 1. Something like below (t is the table I created, with one CLOB column called c).
I expect that execution time should be reasonably good. The biggest concern, of course, is collisions. How important is it that you not miss ANY of the CLOBs, and how many rows do you have in the base table in the first place? Is something like "one chance in a billion" of having a collision acceptable?
select c
from (
select c, row_number() over (partition by dbms_crypto.hash(c, 3) order by null) as rn
from t
)
where rn = 1;
Note - the user (your application, in your case) must have EXECUTE privilege on SYS.DBMS_CRYPTO. A DBA can grant it if needed.
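For comparison, the same hash-based de-duplication can be sketched on the Java side, after the rows have been fetched without DISTINCT. This is only an illustration under assumptions: the CLOB values are taken to be already loaded as Strings, SHA-1 is chosen to mirror the SQL above, and the class and method names are hypothetical. Unlike "order by null", which lets Oracle keep an arbitrary row per hash, this version keeps the first one encountered.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClobHashDedup {

    // SHA-1 of the text, Base64-encoded so it can serve as a map key.
    static String sha1(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // Keeps one value per hash. Two different values with the same hash
    // would be collapsed into one, which is the collision risk noted above.
    static List<String> dedupByHash(List<String> clobs) {
        Map<String, String> firstPerHash = new LinkedHashMap<>();
        for (String clob : clobs) {
            firstPerHash.putIfAbsent(sha1(clob), clob);
        }
        return new ArrayList<>(firstPerHash.values());
    }

    public static void main(String[] args) {
        System.out.println(dedupByHash(Arrays.asList("abcd", "abcd", "xyz")));
    }
}
```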

hibernate hql and setMaxResults

I've got some quarrels with Hibernate.
My query, although already optimized, is quite heavy. One of my optimizations consists of limiting the result set returned.
So with Hibernate I've used the setMaxResults method, but I hit the same problem described in this post:
Hibernate: Pagination with setFirstResult and setMaxResult
(the issue is that with setMaxResults, Hibernate in some cases wraps the query like this:
select * from (your query) where rownum <= :rownum)
So, the solution in that case was to add an ORDER BY, but I have millions of records and an ORDER BY kills the execution time of the query.
I've managed to overcome the problem using createNativeQuery and passing the exact query I need (something like "my query where rownum <= :rownum" instead of "select * from (your query) where rownum <= :rownum", and goodbye portability), but honestly I don't get why Hibernate acts like this...
As the previous post suggests, Hibernate generates SQL like that when your query "is not stable" because, if I haven't misunderstood, the order of the records may differ between two executions, but I don't get how wrapping the query could solve this stability problem.
I am using the same kind of pagination in Hibernate. The HQL is given below; it may be useful for you.
(i) Initially, use this query:
List<Object> Entity_Cls_Lst= Objclass.createQuery("from library where book_id>Book_ID order by book_id").list();
(ii) After scrolling, take the Book_ID of the last result and pass it to the query in the where condition:
List<Object> Entity_Cls_Lst= Objclass.createQuery("from library where book_id>Book_ID order by book_id").setMaxResults(MAX_RECORDS).list();
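The two steps above are a form of keyset pagination: each page starts after the last key seen on the previous page. As a rough illustration of the control flow only, here is a plain-Java sketch that plays the role of "where book_id > :lastSeen order by book_id" combined with setMaxResults. The in-memory list of ids stands in for the table, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KeysetPaging {

    // Returns up to maxRecords ids strictly greater than lastSeenId.
    // sortedIds is assumed to be in ascending order, like "order by book_id".
    static List<Long> nextPage(List<Long> sortedIds, long lastSeenId, int maxRecords) {
        List<Long> page = new ArrayList<>();
        if (maxRecords <= 0) {
            return page;
        }
        for (long id : sortedIds) {
            if (id > lastSeenId) {
                page.add(id);
                if (page.size() == maxRecords) {
                    break;
                }
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L, 3L, 5L, 8L, 13L);
        long lastSeen = 0; // assumes all ids are positive
        List<Long> page;
        // Walk through all pages, remembering the last id of each one.
        while (!(page = nextPage(ids, lastSeen, 3)).isEmpty()) {
            System.out.println(page);
            lastSeen = page.get(page.size() - 1);
        }
    }
}
```

In a database the "greater than last seen" filter would typically be served by an index on the key column, so later pages do not need to skip over earlier rows the way an offset does.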

Implementing result paging in hibernate (getting total number of rows)

How do I implement paging in Hibernate? The Query object has methods called setMaxResults and setFirstResult, which are certainly helpful. But where can I get the total number of results, so that I can show a link to the last page of results and print things such as "results 200 to 250 of xxx"?
You can use Query.setMaxResults(int results) and Query.setFirstResult(int offset).
Edit: there's no way to know in advance how many results you'll get. So, first you must query with "select count(*)...". A little ugly, IMHO.
You must run a separate query to get the total count. If new records are added, or existing records start matching the criteria, between one paging request from the client and the next, you have to query the count again to reflect that. I usually do this in HQL like this:
Long count = (Long) session.createQuery("select count(*) from ....").uniqueResult(); // count(*) yields a Long in Hibernate 3.2 and later
For Criteria queries I usually push my data into a DTO like this:
ScrollableResults scrollable = criteria.scroll(ScrollMode.SCROLL_INSENSITIVE);
if(scrollable.last()){//returns true if there is a resultset
genericDTO.setTotalCount(scrollable.getRowNumber() + 1);
criteria.setFirstResult(command.getStart())
.setMaxResults(command.getLimit());
genericDTO.setLineItems(Collections.unmodifiableList(criteria.list()));
}
scrollable.close();
return genericDTO;
You could perform two queries: a count(*) type query, which should be cheap if you are not joining too many tables together, and a second query that has the limits set. Then you know how many items exist but only grab the ones being viewed.
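Once the count query has returned a total, the remaining work for the "last page" link and the "results x to y of z" text is plain arithmetic. A minimal sketch, assuming a zero-based offset as used by setFirstResult; the class and method names are hypothetical.

```java
final class PageInfo {

    // Number of pages needed for totalCount rows (ceiling division).
    static long pageCount(long totalCount, int pageSize) {
        if (totalCount < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("totalCount must be >= 0 and pageSize > 0");
        }
        return (totalCount + pageSize - 1) / pageSize;
    }

    // Zero-based offset of the last page, suitable for setFirstResult.
    static long lastPageOffset(long totalCount, int pageSize) {
        long pages = pageCount(totalCount, pageSize);
        return pages == 0 ? 0 : (pages - 1) * pageSize;
    }

    // Human-readable range for the page that starts at the given offset.
    static String label(long offset, int pageSize, long totalCount) {
        if (totalCount == 0 || offset >= totalCount) {
            return "no results";
        }
        long first = offset + 1;
        long last = Math.min(offset + pageSize, totalCount);
        return "results " + first + " to " + last + " of " + totalCount;
    }

    public static void main(String[] args) {
        System.out.println(pageCount(1234, 50));
        System.out.println(lastPageOffset(1234, 50));
        System.out.println(label(200, 50, 1234));
    }
}
```

The label is only as accurate as the count it was given, so if rows change between the count query and the page query the numbers can be briefly out of step.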
You can do one thing: prepare the Criteria query as per your business requirement, with all predicates, sorting, searching, etc.,
and then do as below:
CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<Feedback> criteriaQuery = criteriaBuilder.createQuery(Feedback.class);
// A single root, reused for both the select and the predicates
Root<Feedback> root = criteriaQuery.from(Feedback.class);
// Prepare all your predicates as per your business need, e.g.:
Predicate yourPredicateAsPerYourBusinessNeed = criteriaBuilder.equal(root.get("applicationName"), applicationName);
criteriaQuery.select(root).where(yourPredicateAsPerYourBusinessNeed).distinct(true);
TypedQuery<Feedback> criteriaQueryWithPredicate = em.createQuery(criteriaQuery);
// Getting the total count here (note: this streams every matching row just to count it)
Long totalCount = criteriaQueryWithPredicate.getResultStream().distinct().count();
Now we have the query with its predicates, along with the total count.
So now we can apply pagination to that query, as below:
List<Feedback> feedbackList = criteriaQueryWithPredicate.setFirstResult(offset).setMaxResults(pageSize).getResultList();
Now you can prepare a wrapper with the list returned by the DB along with the totalCount, the starting position (the offset in this case), the page size, etc., and return it to your service / controller class.
I am 101% sure this will solve your problem, because I was facing the same problem and sorted it out the same way.
Thanks- Sunil Kumar Mali
You can just call setMaxResults with the maximum number of rows you want returned. There is no harm in setting this value greater than the number of rows actually available. The problem with the other solutions is that they assume the ordering of records remains the same on each repeat of the query, and that no changes happen between commands.
To avoid that if you really want to scroll through results, it is best to use the ScrollableResults. Don't throw this object away between paging, but use it to keep the records in the same order. To find out the number of records from the ScrollableResults, you can simply move to the last() position, and then get the row number. Remember to add 1 to this value, since row numbers start counting at 0.
I personally think you should handle the paging in the front-end. I know this isn't that efficient, but at least it would be less error-prone.
If you use the count(*) approach, what happens if records get deleted from the table in between requests for a certain page? Lots of things could go wrong this way.
