Order by a temporary column computed using case when - java

I have 2 entities, Book and BookProperty, in a one-to-many relationship. I need to retrieve books sorted by upload_date, with English-language books coming first, using JPA Criteria with pagination (JPQL or native SQL is not an option).
This is what the native query for this operation would look like (postgres):
select distinct
b.book_id,
b.name, b.author, ... ,
b.upload_date,
case when p.property_name='language' and lower(p.property_value)='english'
then 0 else 1 end as book_language
from books b
left outer join book_properties p
on b.book_id=p.book_id
order by book_language asc, b.upload_date desc;
The problem is that I can't get the 'case' part to be selected via criteria API and sort by it. I know that it is possible via a multiselect and a tuple but I would like to avoid that, because really I do not need this column in the application. I would like to just retrieve sorted Book objects and not tuples of (Book, Integer).
I tried to move the case part into the order by and managed to compute the query via JPA Criteria, but in that case setDistinct(true) resulted in an error, because all columns in order by must be part of distinct. So moving the case part to order doesn't look like an option.
Please help me implement this query using JPA Criteria, preferably without using tuples or wrapper objects, but that will do as well if there are no other options.

Related

CLOB and CriteriaQuery

I have an entity that has a CLOB attribute:
public class EntityS {
...
@Lob
private String description;
}
To retrieve certain EntityS from the DB we use a CriteriaQuery where we need the results to be unique, so we do:
query.where(builder.and(predicates.toArray(new Predicate[predicates.size()]))).distinct(true).orderBy(builder.asc(root.<Long> get(EntityS_.id)));
If we do that we get the following error:
ORA-00932: inconsistent datatypes: expected - got CLOB
I know that's because you cannot use distinct when selecting a CLOB. But we need the CLOB. Is there a workaround for this using CriteriaQuery with Predicates and so on?
We are using an ugly workaround: getting rid of the .distinct(true) and then filtering the results in code, but that's crap. We are using it only to be able to keep on developing the app, but we need a better solution and I don't seem to find one...
In case you are using Hibernate as persistence provider, you can specify the following query hint:
query.setHint(QueryHints.HINT_PASS_DISTINCT_THROUGH, false);
This way, "distinct" is not passed through to the SQL command, but Hibernate will take care of returning only distinct values.
See here for more information: https://thoughts-on-java.org/hibernate-tips-apply-distinct-to-jpql-but-not-sql-query/
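To illustrate what de-duplicating in memory rather than in SQL amounts to, here is a minimal plain-Java sketch. It is not Hibernate's implementation; `EntityRow`, `InMemoryDistinct` and `distinctById` are hypothetical names made up for the illustration, and the assumption is that two rows with the same id represent the same entity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real entity; only the id matters here.
class EntityRow {
    final long id;
    final String description;

    EntityRow(long id, String description) {
        this.id = id;
        this.description = description;
    }
}

class InMemoryDistinct {
    // Keeps the first row seen for each id and preserves the original order,
    // so an ORDER BY applied by the database survives the de-duplication.
    static List<EntityRow> distinctById(List<EntityRow> rows) {
        Map<Long, EntityRow> firstById = new LinkedHashMap<>();
        for (EntityRow row : rows) {
            firstById.putIfAbsent(row.id, row);
        }
        return new ArrayList<>(firstById.values());
    }
}
```

Because the CLOB column is never compared, this kind of de-duplication does not run into the ORA-00932 restriction.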
Thinking outside the box - I have no idea if this will work, but perhaps it is worth a shot. (I tested it and it seems to work, but I created a table with just one column, CLOB data type, and two rows, both with the value to_clob('abcd') - of course it should work on that setup.)
To de-duplicate, compute a hash of each clob, and instruct Oracle to compute a row number partitioned by the hash value and ordered by nothing (null). Then select just the rows where the row number is 1. Something like below (t is the table I created, with one CLOB column called c).
I expect that execution time should be reasonably good. The biggest concern, of course, is collisions. How important is it that you not miss ANY of the CLOBs, and how many rows do you have in the base table in the first place? Is something like "one chance in a billion" of having a collision acceptable?
select c
from (
select c, row_number() over (partition by dbms_crypto.hash(c, 3) order by null) as rn
from t
)
where rn = 1;
Note - the user (your application, in your case) must have EXECUTE privilege on SYS.DBMS_CRYPTO. A DBA can grant it if needed.
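For comparison, the same hash-and-keep-first idea can be sketched on the application side in plain Java. This is only an illustration under assumptions: the CLOB values are taken to be already loaded as Strings, SHA-1 is used because the 3 passed to dbms_crypto.hash selects SHA-1 (DBMS_CRYPTO.HASH_SH1), and `ClobHashDedup` and `distinctByHash` are hypothetical names. The same collision caveat applies.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ClobHashDedup {
    // Keeps the first value for each distinct SHA-1 hash, in input order.
    static List<String> distinctByHash(List<String> clobs) {
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException(e);
        }
        Set<String> seenHashes = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String clob : clobs) {
            // digest(byte[]) hashes the input and resets the digest for reuse.
            byte[] hash = sha1.digest(clob.getBytes(StandardCharsets.UTF_8));
            String key = Base64.getEncoder().encodeToString(hash);
            if (seenHashes.add(key)) {
                result.add(clob);
            }
        }
        return result;
    }
}
```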

Better to query once, then organize objects based on returned column value, or query twice with different conditions?

I have a table which I need to query, then organize the returned objects into two different lists based on a column value. I can either query the table once, retrieving the column by which I would differentiate the objects and arrange them by looping through the result set, or I can query twice with two different conditions and avoid the sorting process. Which method is generally better practice?
MY_TABLE
NAME AGE TYPE
John 25 A
Sarah 30 B
Rick 22 A
Susan 43 B
Either SELECT * FROM MY_TABLE, then sort in code based on returned types, or
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'A' followed by
SELECT NAME, AGE FROM MY_TABLE WHERE TYPE = 'B'
Logically, a DB query from Java code will be more expensive than a loop within the code, because querying the DB involves several steps such as connecting to the DB, creating the SQL query, firing the query and getting the results back.
Besides, something can go wrong between firing the first and second query.
With an optimized single query and a loop in the code, you can save a lot of time compared with firing two queries.
In your case, you can sort in the query itself if it helps:
SELECT * FROM MY_TABLE ORDER BY TYPE
In future if there are more types added to your table, you need not fire an additional query to retrieve it.
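The single-query approach could look roughly like this on the Java side. It is a sketch, assuming the rows have already been read into a hypothetical `MyTableRow` class (the class and method names are made up here); the grouping step does not need to change when new types appear.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical holder for one row of MY_TABLE.
class MyTableRow {
    final String name;
    final int age;
    final String type;

    MyTableRow(String name, int age, String type) {
        this.name = name;
        this.age = age;
        this.type = type;
    }
}

class SplitByType {
    // Groups the rows of a single query by their TYPE value.
    // LinkedHashMap keeps the types in the order they were first seen.
    static Map<String, List<MyTableRow>> groupByType(List<MyTableRow> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(r -> r.type, LinkedHashMap::new, Collectors.toList()));
    }
}
```

With the sample data from the question, `groupByType(rows).get("A")` would hold John and Rick, and `get("B")` would hold Sarah and Susan.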
It is heavily dependent on the context. If each list is really huge, I would let the database do the hard part of the job with 2 queries. Conversely, in a web application using a farm of application servers and a central database, I would use one single query.
For the general use case, IMHO, I would save database resources, because the database is a common point of congestion, and use only one query.
The only objective argument I can find is that the splitting of the list occurs in memory with a hyper simple algorithm and in a single JVM, where each query requires a bit of initialization and may involve disk access or loading of index pages.
In general, one query performs better.
Also, by issuing two queries you can potentially get inconsistent results (which may be fixed with a higher transaction isolation level, though).
In any case I believe you still need to iterate through resultset (either directly or by using framework's methods that return collections).
From the database point of view, you optimally have exactly one statement that fetches exactly everything you need and nothing else. Therefore, your first option is better. But don't generalize that answer in way that makes you query more data than needed. It's a common mistake for beginners to select all rows from a table (no where clause) and do the filtering in code instead of letting the database do its job.
It also depends on your dataset volume. For instance, if you have a large data set, doing a select * without any condition might take some time, but if you have an index on your 'TYPE' column, then adding a where clause will reduce the time taken to execute the query. If you are dealing with a small data set, then doing a select * followed by your logic in the Java code is a better approach.
There are four main bottlenecks involved in querying a database.
The query itself - how long the query takes to execute on the server depends on indexes, table sizes etc.
The data volume of the results - there could be hundreds of columns or huge fields and all this data must be serialised and transported across the network to your client.
The processing of the data - java must walk the query results gathering the data it wants.
Maintaining the query - it takes manpower to maintain queries, simple ones cost little but complex ones can be a nightmare.
By careful consideration it should be possible to work out a balance between all four of these factors - it is unlikely that you will get the right answer without doing so.
You can query by two conditions:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B'
This will do both for you at once, and if you want them sorted, you could do the same, but just add an order by keyword:
SELECT * FROM MY_TABLE WHERE TYPE = 'A' OR TYPE = 'B' ORDER BY TYPE ASC
This will sort the results by type, in ascending order.
EDIT:
I didn't notice that originally you wanted two different lists. In that case, you could just do this query, and then find the index where the type changes from 'A' to 'B' and copy the data into two arrays.
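A rough sketch of that split in Java, assuming the rows come back ordered by TYPE and that there are exactly two types. `SortedRow`, `SplitAtBoundary` and `boundary` are hypothetical names used only for this illustration.

```java
import java.util.List;

// Hypothetical holder for one row of the ORDER BY TYPE result.
class SortedRow {
    final String name;
    final String type;

    SortedRow(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

class SplitAtBoundary {
    // Returns the index of the first row whose type differs from the first row's type,
    // or the list size if all rows share one type (0 for an empty list).
    static int boundary(List<SortedRow> sortedRows) {
        if (sortedRows.isEmpty()) {
            return 0;
        }
        String firstType = sortedRows.get(0).type;
        for (int i = 0; i < sortedRows.size(); i++) {
            if (!sortedRows.get(i).type.equals(firstType)) {
                return i;
            }
        }
        return sortedRows.size();
    }
}
```

The two lists are then `rows.subList(0, b)` and `rows.subList(b, rows.size())`, where `b` is the returned boundary.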

Queryproblem with mongodb

I have 2 collections in a mongodb database.
example:
employee(collection)
_id
name
gender
homelocation (double[] indexed as geodata)
companies_worked_in (reference, list of companies)
companies(collection)
_id
name
...
Now I need to query all companies whose name starts with "wha" and which have or had employees who live near a given point, e.g. (13.444519, 52.512878).
How do I do that without taking too long?
With SQL it would've been a simple join (without the geospatial search of course... :( )
You can issue 2 queries. (Queries I wrote are in JavaScript)
First query extracts all companies whose name starts with wha.
db.companies.find({name: {$regex: "^wha"}}, {_id: 1})
Second query can be like
db.employee.find({homelocation: {$near: [x,y]}, companies_worked_in: {$in: [result_from_above_query]} }, {companies_worked_in: 1})
Now simply filter companies_worked_in and have only those companies whose name starts with wha. I know it seems like the first query is useless in this case. But a lot of records would be filtered by $in query.
You might have to write some intermediate code between these two queries. I know this is not a single-query solution, but it is one possible way to go, and performance is good depending on which fields you index. In this case, consider creating an index on name (companies collection) and on homelocation (geo-index) + companies_worked_in (employee collection) to gain performance.
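The intermediate filtering step could be sketched in plain Java as follows. This leaves out the MongoDB driver entirely and assumes the results of the two queries have already been read into collections of company ids (represented as Strings here); `CompanyFilter` and its method are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CompanyFilter {
    // matchingCompanyIds: ids returned by the first query (names starting with "wha").
    // companiesWorkedInPerEmployee: the companies_worked_in list of each nearby employee
    // returned by the second query.
    // Returns the matching companies that at least one nearby employee has worked in.
    static Set<String> companiesWithNearbyEmployees(Set<String> matchingCompanyIds,
                                                    List<List<String>> companiesWorkedInPerEmployee) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> workedIn : companiesWorkedInPerEmployee) {
            for (String companyId : workedIn) {
                if (matchingCompanyIds.contains(companyId)) {
                    result.add(companyId);
                }
            }
        }
        return result;
    }
}
```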
P.S.
I doubt if you could create a composite index over homelocation and companies_worked_in, since both are arrays. You would have to index on one of these fields only. You might not be able to have a composite index.
Suggestion
Store the company name in the employee collection as well. That way you can avoid the first query.

sql java queries

I have a Java object 'star' that consists of two fields: String name (the name of the star) and a List of String fans (the list of fans of this star). I'd like to persist this class using JPA1 or Hibernate. I've done so using the annotation @CollectionOfElements on the list. It works fine, and creates two tables.
Now I'd like to get all stars whose fans are 'alice' or 'bob' or 'charlie'. How can I do that in the easiest way (only one query rather than 3, and without using 'OR' statements if possible), using jpa queries (hibernate if it's a must), and without retrieving the whole list of fans ?
Thanks
The following query should help you:
select distinct s from star s join s.fans f where f in ('alice', 'bob', 'charlie')

How to order by count() in JPA

I am using this JPA query:
SELECT DISTINCT e.label FROM Entity e
GROUP BY e.label
ORDER BY COUNT(e.label) DESC
I get no errors, and the results are sorted almost correctly, but some values are wrong (either two values are flipped or some single values are completely misplaced).
EDIT:
Adding COUNT(e.label) to my SELECT clause resolves this problem for this query.
But in a similar query which also contains a WHERE clause the problem persists:
SELECT DISTINCT e.label, COUNT(e.label) FROM Entity e
WHERE TYPE(e.cat) = :category
GROUP BY e.label
ORDER BY COUNT(e.label) DESC
You might need to include the COUNT(e.label) in your SELECT clause:
SELECT DISTINCT e.label, COUNT(e.label)
FROM Entity e
GROUP BY e.label
ORDER BY COUNT(e.label) DESC
UPDATE: Regarding the second query please read section 8.6. Polymorphic queries of the EntityManager documentation. It seems that if you make your queries in a way that requires multiple SELECTs, then the ORDER BY won't work anymore. Using the TYPE keyword seems to be such a case. A quote from the above link:
The following query would return all persistent objects:
from java.lang.Object o // HQL only
The interface Named might be implemented by various persistent classes:
from Named n, Named m where n.name = m.name // HQL only
Note that these last two queries will require more than one SQL SELECT. This means that the order by clause does not correctly order the whole result set. (It also means you can't call these queries using Query.scroll().)
For whatever reason the following style named query didn't work for me:
SELECT DISTINCT e.label, COUNT(e.label)
FROM Entity e
GROUP BY e.label
ORDER BY COUNT(e.label) DESC
It could be because I am using an old version of Hibernate. I got the order by working by using a number to choose the column to sort by like this:
SELECT DISTINCT e.label, COUNT(e.label)
FROM Entity e
GROUP BY e.label
ORDER BY 2 DESC
Can't see how the order could be incorrect. What is the incorrect result?
What is the SQL that is generated, if you try the same SQL directly on the database, does it give the same incorrect order?
What database are you using?
You could always sort in Java instead using sort().
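A minimal sketch of sorting in Java, assuming the labels have been fetched as a plain list of Strings without any GROUP BY or ORDER BY. `LabelCounts` and `sortedByCountDesc` are hypothetical names, and the tie-break on the label itself is an addition to make the order deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class LabelCounts {
    // Counts each label, then sorts by count descending; ties are broken by label.
    static List<Map.Entry<String, Long>> sortedByCountDesc(List<String> labels) {
        Map<String, Long> counts = labels.stream()
                .collect(Collectors.groupingBy(label -> label, Collectors.counting()));
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Long>comparingByKey()));
        return entries;
    }
}
```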
