How to select multiple columns in subquery using JPA Criteria

I am trying to write the below query using JPA criteria but I am not able to select the multiple columns in a subquery.
SELECT a.id, a.firstName, a.lastName
FROM PORTRAIT a
JOIN (SELECT firstName, lastName
FROM PORTRAIT
GROUP BY firstName, lastName
HAVING count(id) > 1 ) b
ON b.firstName = a.firstName
AND b.lastName = a.lastName
ORDER BY a.lastName asc
or
SELECT a.id, a.FIRSTNAME, a.LASTNAME
FROM PORTRAIT a where exists (
SELECT b.firstName, b.lastName
FROM PORTRAIT b
WHERE b.firstName = a.firstName
AND b.lastName = a.lastName
GROUP BY b.firstName, b.lastName
HAVING count(b.id) > 1
)
I am stuck in the middle of my implementation below: I cannot work out how to select multiple columns in the subquery. Please see my comment on the third line of the code.
Subquery<PortraitVO> portraitSubQuery = criteriaQuery.subquery(PortraitVO.class);
Root<PortraitVO> portraitRoot = portraitSubQuery.from(PortraitVO.class);
portraitSubQuery.select(portraitRoot); // Here I want to select multiple columns
portraitSubQuery.where(criteriaBuilder.and(criteriaBuilder.equal(portraitRoot.get(RestServiceConstants.FIRST_NAME), root.get(RestServiceConstants.FIRST_NAME)), criteriaBuilder.equal(portraitRoot.get(RestServiceConstants.LAST_NAME), root.get(RestServiceConstants.LAST_NAME))));
List<String> columnNames = new ArrayList<String>();
columnNames.add(RestServiceConstants.FIRST_NAME);
columnNames.add(RestServiceConstants.LAST_NAME);
List<Expression<?>> columnNamesExpression = columnNames.stream().map(x -> portraitRoot.get(x))
.collect(Collectors.toList());
portraitSubQuery.groupBy(columnNamesExpression);
portraitSubQuery.having(criteriaBuilder.gt(criteriaBuilder.count(portraitRoot), 1));
Please help me with this problem.

It doesn't make any sense to select two columns in a subselect used inside an exists. And I'm not sure if it is legal.
Just use one of the columns or a literal.

JPA does not allow selecting multiple expressions in a subquery, and in fact selecting anything in a subquery that is wrapped in EXISTS is pointless. Just select a constant value, e.g. the integer 1.

Related

SQL query in postgres database using left joins

Okay so I've done some research and apparently, a left join can return more than 1 record based on the tables joined from the right.
My query is:
SELECT
ord.ID AS ord_id,
oli.sfid AS oli_sfid,
ord.HasMSISDN__c AS ord_HasMSISDN__c,
ord.dealer_code__c AS ord_dealer_code__c,
ord.recordtypeid AS ord_recordtypeid,
ord.order_number__c AS ord_order_number__c,
ord.status AS ord_status,
ord.opportunityid AS ord_opportunityid,
ord.sfid AS ord_sfid,
ord.cancelled_by__c AS ord_cancelled_by__c,
ord.cancelled_on__c AS ord_cancelled_on__c,
ord.created_by__c AS ord_created_by__c,
ord.created_on__c AS ord_created_on__c,
ord.docusign_email_address__c AS ord_docusign_email_address__c,
ord.esignature_resent_to__c AS ord_esignature_resent_to__c,
ord.esignature_resent_by__c AS ord_esignature_resent_by__c,
ord.esignature_resent_on__c AS ord_esignature_resent_on__c,
ord.pricebook2id AS ord_pricebook2id,
cont.opportunity__c AS cont_opportunity__c,
cont.sfid AS cont_sfid,
opp.isclosed AS opp_isclosed,
opp.sfid AS opp_sfid,
opp.recordtypeid AS opp_recordtypeid,
opp.pricebook2id AS opp_pricebook2id,
accban.sfid AS accban_sfid,
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid
FROM fullsbxsalesforce.order ord
LEFT JOIN fullsbxsalesforce.contract cont ON ord.contractid = cont.sfid
LEFT JOIN fullsbxsalesforce.opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN fullsbxsalesforce.user usr ON (ord.dealer_code__c = usr.dealer_code_bd__c OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c OR ord.dealer_code__c = usr.Rep_Dealer_Code__c OR ord.dealer_code__c = usr.dealer_code_secondary__c)
LEFT JOIN fullsbxsalesforce.account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
LEFT JOIN fullsbxsalesforce.orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
Initially, I was under the impression that this would return 1 row. I was mistaken: it returns 3 rows because there are 3 different OLIs attached to the order. How can I change my logic so that I am either returned a collection of OLIs in the same order, or only the first OLI, so that I'm not dealing with 3 duplicates?

If you want your rows returned in the same order, just add an ORDER BY <col_name_list> clause at the very end of your query.
Is an OLI a unique value? Which table defines an OLI?
If you want this query to always return a single row, just add LIMIT 1 to the end of your query.
If your query returns multiple OLIs and you only want one row per OLI, then you can use a window function:
SELECT ...
FROM (
-- Your initial query with new field added
SELECT ...,
ROW_NUMBER() OVER(PARTITION BY OLI_field_name ORDER BY <ordering_clause>) AS RowRank
FROM ...
) src
WHERE RowRank = 1
This will return one row per <OLI_field_name>.
Update
If you want to just have one row per OLI and keep all the detailed info, use the window function method. Something like this:
SELECT *
FROM (
SELECT
ord.ID AS ord_id,
oli.sfid AS oli_sfid,
ord.HasMSISDN__c AS ord_HasMSISDN__c,
ord.dealer_code__c AS ord_dealer_code__c,
ord.recordtypeid AS ord_recordtypeid,
ord.order_number__c AS ord_order_number__c,
ord.status AS ord_status,
ord.opportunityid AS ord_opportunityid,
ord.sfid AS ord_sfid,
ord.cancelled_by__c AS ord_cancelled_by__c,
ord.cancelled_on__c AS ord_cancelled_on__c,
ord.created_by__c AS ord_created_by__c,
ord.created_on__c AS ord_created_on__c,
ord.docusign_email_address__c AS ord_docusign_email_address__c,
ord.esignature_resent_to__c AS ord_esignature_resent_to__c,
ord.esignature_resent_by__c AS ord_esignature_resent_by__c,
ord.esignature_resent_on__c AS ord_esignature_resent_on__c,
ord.pricebook2id AS ord_pricebook2id,
cont.opportunity__c AS cont_opportunity__c,
cont.sfid AS cont_sfid,
opp.isclosed AS opp_isclosed,
opp.sfid AS opp_sfid,
opp.recordtypeid AS opp_recordtypeid,
opp.pricebook2id AS opp_pricebook2id,
accban.sfid AS accban_sfid,
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid,
ROW_NUMBER() OVER(PARTITION BY oli.sfid ORDER BY <order_col>) AS RowRank -- Assigns a rank to each row with the same oli.sfid value
FROM fullsbxsalesforce.order ord
LEFT JOIN fullsbxsalesforce.contract cont ON ord.contractid = cont.sfid
LEFT JOIN fullsbxsalesforce.opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN fullsbxsalesforce.user usr ON (ord.dealer_code__c = usr.dealer_code_bd__c OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c OR ord.dealer_code__c = usr.Rep_Dealer_Code__c OR ord.dealer_code__c = usr.dealer_code_secondary__c)
LEFT JOIN fullsbxsalesforce.account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
LEFT JOIN fullsbxsalesforce.orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
) src
WHERE RowRank = 1 -- Only get one row per oli.sfid value
This assumes that oli.sfid is the OLI ID
Just change the outer SELECT * to return all the fields except RowRank. Also, modify the <order_col> value to determine which row you want to return for each oli.sfid.
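If it helps to see what the RowRank = 1 filter does, here is a minimal Java sketch of the same idea applied to rows that are already in memory: partition by a key, order within each partition, keep the first row. It is only an illustration of the semantics, not a replacement for the SQL; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of any library.
final class FirstPerPartition {
    private FirstPerPartition() {}

    /**
     * Keeps, for every partition key, the row that sorts first under `order`.
     * Roughly what "ROW_NUMBER() OVER (PARTITION BY key ORDER BY order) = 1" selects.
     * Ties keep the row that was encountered first.
     */
    static <R, K> List<R> firstPerPartition(List<R> rows,
                                            Function<R, K> partitionKey,
                                            Comparator<R> order) {
        // LinkedHashMap keeps partitions in the order their keys first appear.
        Map<K, R> best = new LinkedHashMap<>();
        for (R row : rows) {
            K key = partitionKey.apply(row);
            R current = best.get(key);
            if (current == null || order.compare(row, current) < 0) {
                best.put(key, row);
            }
        }
        return new ArrayList<>(best.values());
    }
}
```

Called with the OLI id as the partition key and whatever you would put in <order_col> as the comparator, this returns one row per OLI id.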

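Not very nice, but instead of the plain left join with oli you can join a subquery that returns at most one order item. In PostgreSQL the subquery needs LATERAL to be allowed to reference ord:
LEFT JOIN LATERAL (select * from fullsbxsalesforce.orderitem where orderid = ord.sfid limit 1) oli ON true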

I would question whether you really want only one record. I see two potential answers:
1. You want all the Order Items
That you are selecting from the OrderItems table implies you want records from there. If three records match your results, then it seems illogical that you would want to arbitrarily ignore some.
(I've seen this, and done it, but it is indicative of a problem)
2. You don't want Order Items at all
This seems more likely, based on your willingness to just discard the data.
That you are willing to discard the data altogether implies that you don't actually want it in the first place. If you don't want it, just don't include the table at all.
Conclusion
Looking at your query, I am guessing you are in case #2. The issue is more likely that you have included a table unnecessarily.
Where you say oli.sfid AS oli_sfid, did you mean to get the sfid of the order? If so, you no longer need the join on OrderItems.
If after reading all of that, you are still sure you want just one, totally arbitrary, item from the order, order by and limit (as suggested by others) are the solution.
3. Edit: A third scenario
After reading a comment by the OP: if all that is being attempted is to verify that OrderItems exist, aggregation may be another way to go:
SELECT ord.ID AS ord_id,
count(oli.sfid) AS oli_count, -- counts matching order items; 0 when the LEFT JOIN finds none
ord.HasMSISDN__c AS ord_HasMSISDN__c,
...
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid
FROM order ord
...
LEFT JOIN orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
GROUP BY
ord.ID,
ord.HasMSISDN__c,
...
accban.ban__c,
usr.sfid
Though in this case, the group by would be difficult and painful to maintain.
If the objective is to ignore Orders that do not have items associated with them, then you don't want a left join, you want an inner join:
SELECT ord.ID AS ord_id,
...
oli.sfid AS oli_sfid,
usr.sfid AS usr_sfid
FROM fullsbxsalesforce.order ord
INNER JOIN orderitem oli ON ord.sfid = oli.orderid
LEFT JOIN contract cont ON ord.contractid = cont.sfid
LEFT JOIN opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN user usr ON ord.dealer_code__c = usr.dealer_code_bd__c
OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c
OR ord.dealer_code__c = usr.Rep_Dealer_Code__c
OR ord.dealer_code__c = usr.dealer_code_secondary__c
LEFT JOIN account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
WHERE ord.sfid = 'SPECIFIC ID'
Note the change of position of orderitem in the table sequence.

SQL server query returns but function does not

In my Java-based web project, I have written the recursive query below, which runs perfectly fine and returns a list of ids.
WITH treeResult(id) AS
(SELECT pt.id FROM myschema.art_artwork_tree AS pt WHERE pt.id in
(select node_id from myschema.art_brand_user_mapping where emp_id = $1)
UNION ALL
SELECT pa.id FROM treeResult AS p, myschema.art_artwork_tree AS pa
WHERE pa.parent_node = p.id and pa.site_id = $2)
SELECT id FROM treeResult AS n;
Now I want to use it in a JPQL query, so I have created the function below.
USE [darshandb]
GO
DROP FUNCTION [dbo].[testfunction]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[testfunction] (@empId INT, @siteId INT)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
WITH treeResult(id) AS
(SELECT pt.id FROM myschema.art_artwork_tree AS pt WHERE pt.id in
(select node_id from myschema.art_brand_user_mapping where emp_id = $1)
UNION ALL
SELECT pa.id FROM treeResult AS p, myschema.art_artwork_tree AS pa
WHERE pa.parent_node = p.id and pa.site_id = $2) SELECT id FROM treeResult AS n
);
GO
When I try to execute the function, it does not return any value.
SELECT * FROM [dbo].[testfunction] (4,3);
Please help me understand what I have done wrong.

I think your problem is the use of $1 and $2 in your function query. Use the function's own parameter names in your table-valued function instead.
So, replace $1 with @empId and $2 with @siteId in your user-defined function.

How to query for number of records in select with "group by" clause in JPA/EclipseLink?

Suppose you have the following JPA query:
select car.year, car.month, count(car) from Car car group by car.year, car.month
Before we query for results, we need to know how many records this query will return (for pagination, UI and so on). In other words we need something like this:
select count(*) from
(select car.year, car.month, count(car)
from Car car group by car.year, car.month)
But JPA/EclipseLink does not support subqueries in the "from" clause. Is there a way around it?
(Of course you can use plain SQL and native queries, but this is not an option for us)

A portable JPA solution:
select count(*) from Car c where c.id in
(select MIN(car.id) from Car car group by car.year, car.month)
You could also go with something like:
select COUNT(DISTINCT CONCAT(car.year, '#', car.month)) from Car car
but I expect this to be less performant due to operations with textual values.

What about:
select count(distinct car.year) from Car car

I have another approach to this issue: with it, you don't need to know the number of rows the query is going to return.
You will need two variables:
1) pageNo (1 for the first request to the database, then incremented for each following request: 2, 3, 4, 5, ...).
2) pageSize.
int start = 0;
if(pageNo!=null && pageSize!=null){
start = (pageNo-1) * pageSize;
}else{
pageSize = StaticVariable.MAX_PAGE_SIZE; // default page size if page no and page size are missing
}
Your query:
em.createQuery("your query")
.setFirstResult(start)
.setMaxResults(pageSize)
.getResultList();
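The offset arithmetic above can be pulled into a small helper so it is easy to test on its own. This is only a sketch under a few assumptions: the class name PageWindow is made up, DEFAULT_PAGE_SIZE stands in for StaticVariable.MAX_PAGE_SIZE (its value of 50 is arbitrary), and the range check is an addition that the snippet above does not have.

```java
// Hypothetical helper: computes the values to pass to
// Query.setFirstResult(...) and Query.setMaxResults(...).
final class PageWindow {
    /** Assumed default; stands in for StaticVariable.MAX_PAGE_SIZE. */
    static final int DEFAULT_PAGE_SIZE = 50;

    final int firstResult; // zero-based offset of the first row of the page
    final int maxResults;  // maximum number of rows in the page

    private PageWindow(int firstResult, int maxResults) {
        this.firstResult = firstResult;
        this.maxResults = maxResults;
    }

    /** pageNo is 1-based; null arguments fall back to the first default-sized page. */
    static PageWindow of(Integer pageNo, Integer pageSize) {
        if (pageNo == null || pageSize == null) {
            return new PageWindow(0, DEFAULT_PAGE_SIZE);
        }
        int page = pageNo;
        int size = pageSize;
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        // multiplyExact fails loudly instead of silently overflowing.
        return new PageWindow(Math.multiplyExact(page - 1, size), size);
    }
}
```

With that in place the query becomes setFirstResult(window.firstResult) followed by setMaxResults(window.maxResults), where window = PageWindow.of(pageNo, pageSize).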

As @chris pointed out, EclipseLink supports subqueries. But the subquery can't be the first one in the from-clause.
So I came up with the following workaround which is working:
select count(1) from Dual dual,
(select car.year, car.month, count(car)
from Car car group by car.year, car.month) data
count(1) is important, as count(data) would not work.
You have to add an entity Dual (if your database does not have a DUAL table, create one with just one record).
This works, but I still consider it a workaround that only works if you are allowed to create the DUAL table.

You can simply use setFirstResult and setMaxResults to bound the records a query returns, and use the size of the result list as the count of records the query returned. Like this:
Query query = em.createQuery("SELECT d.name, COUNT(t) FROM Department d JOIN "
+ "d.teachers t GROUP BY d.name");
//query.setFirstResult(5);
//query.setMaxResults(10); // together these would return at most 10 records, skipping the first 5
List<Object[]> results = query.getResultList();
for (int i = 0; i < results.size(); i++) {
Object[] arr = results.get(i);
for (int j = 0; j < arr.length; j++) {
System.out.print(arr[j] + " ");
}
System.out.println();
}
-----Updated Section------
JPA does not support sub-selects in the FROM clause, but the current EclipseLink 2.4 milestone builds do have this support.
See http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Querying/JPQL#Sub-selects_in_FROM_clause
You can probably rewrite the query with just normal joins though.
Maybe,
Select a, size(a.bs) from A a
or
Select a, count(b) from A a join a.bs b group by a
I hope this helps you.

Ordering by a max or a min from another table

I have a table that consists of a unique id, and a few other attributes. It holds "schedules". Then I have another table that holds a list of all the times each schedule has or will "fire". This isn't the exact schema, but it's close:
create table schedule (
id varchar(40) primary key,
attr1 int,
attr2 varchar(20)
);
create table schedule_times (
id varchar(40) references schedule(id),
fire_date date
);
I want to query the schedule table, getting the attributes and the next and previous fire_dates, in Java, sometimes ordering on one of the attributes, but sometimes ordering on either previous fire_date or the next fire_date. Ordering by the attributes is easy, I just stick an "order by" into the string while I'm building my prepared statement. I'm not even sure how to go about selecting the last fire_date and the next one in a single query - I know that I can find the next fire_date for a given id by doing a
SELECT min(fire_date)
FROM schedule_times
WHERE id = ? AND
fire_date > sysdate;
and the similar thing for previous fire_date using max() and fire_date < sysdate. I'm just drawing a blank on how to incorporate that into a single select from the schedule so I can get both next and previous fire_date in one shot, and also how to order by either of those attributes.

You can do that using two sub-queries in Left Joins.
This has the advantage of returning NULL for your fire_dates if there is no next/previous schedule.
Select s.id, s.attr1, s.attr2, n.next_fire_date, p.previous_fire_date
From schedule s
Left Join ( Select id, Min(fire_date) As next_fire_date
From schedule_times st
Where st.fire_date > Sysdate
Group By id ) n
On ( n.id = s.id )
Left Join ( Select id, Max(fire_date) As previous_fire_date
From schedule_times st
Where st.fire_date < Sysdate
Group By id ) p
On ( p.id = s.id )
You can then add your ORDER BY on next_fire_date or previous_fire_date.
If performance matters, create a compound index on schedule_times(id, fire_date); this will allow the sub-queries to read only from this index.
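Since the question builds the statement text in Java, here is a rough sketch of how the switchable ORDER BY could be appended without concatenating arbitrary input into the SQL. The class name, the sort keys and the whitelist approach are my own assumptions, not something from the question; the query text is the one from this answer, with the columns qualified by table alias.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for the statement text; names are illustrative only.
final class ScheduleQueryBuilder {
    private ScheduleQueryBuilder() {}

    // Schedule attributes plus next/previous fire dates (Oracle syntax, as in the answer).
    private static final String BASE_QUERY =
        "SELECT s.id, s.attr1, s.attr2, n.next_fire_date, p.previous_fire_date "
        + "FROM schedule s "
        + "LEFT JOIN (SELECT id, MIN(fire_date) AS next_fire_date "
        + "FROM schedule_times WHERE fire_date > SYSDATE GROUP BY id) n ON n.id = s.id "
        + "LEFT JOIN (SELECT id, MAX(fire_date) AS previous_fire_date "
        + "FROM schedule_times WHERE fire_date < SYSDATE GROUP BY id) p ON p.id = s.id";

    // Whitelist: caller-facing sort key -> column actually used in ORDER BY.
    private static final Map<String, String> SORT_COLUMNS = new LinkedHashMap<>();
    static {
        SORT_COLUMNS.put("attr1", "s.attr1");
        SORT_COLUMNS.put("attr2", "s.attr2");
        SORT_COLUMNS.put("next_fire_date", "n.next_fire_date");
        SORT_COLUMNS.put("previous_fire_date", "p.previous_fire_date");
    }

    /** Returns the SQL text with an ORDER BY for the given key; unknown keys are rejected. */
    static String build(String sortKey, boolean descending) {
        String column = SORT_COLUMNS.get(sortKey);
        if (column == null) {
            throw new IllegalArgumentException("Unsupported sort key: " + sortKey);
        }
        return BASE_QUERY + " ORDER BY " + column + (descending ? " DESC" : " ASC");
    }
}
```

The returned string can be handed to Connection.prepareStatement as usual. Only whitelisted column names ever reach the SQL text, so the sort key can safely come from a request parameter.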

Try something like this:
select schedule.*,
(
select max(si.fire_date) from schedule_times si where si.id = schedule.id and si.fire_date < sysdate
) as prevfire,
(
select min(si.fire_date) from schedule_times si where si.id = schedule.id and si.fire_date > sysdate
) as nextfire
from schedule
where id = ?
order by attr1

Modify the query to
SELECT Distinct S.ID, ATTR1, ATTR2,
LEAD(ST.FIRE_DATE, 1, SYSDATE) OVER (PARTITION BY S.ID ORDER BY ST.FIRE_DATE) NEXT_FIRE_DATE,
LAG(ST.FIRE_DATE, 1, SYSDATE) OVER (PARTITION BY S.ID ORDER BY ST.FIRE_DATE) PREV_FIRE_DATE
FROM SCHEDULE S,SCHEDULE_TIMES ST WHERE ST.ID=S.ID;
This is a simple query you can try.
LEAD has the ability to compute an expression on the next rows (rows which are going to come after the current row) and return the value to the current row. The general syntax of LEAD is shown below:
LEAD (sql_expr, offset, default) OVER (analytic_clause)
sql_expr is the expression to compute from the leading row.
offset is the index of the leading row relative to the current row. offset is a positive integer with default 1.
default is the value to return if the offset points to a row outside the partition range.
The syntax of LAG is similar except that the offset for LAG goes into the previous rows.
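To make the offset and default arguments concrete, here is a small Java sketch of the same lookup over one partition that has already been sorted. It is only meant to illustrate the semantics described above; the class and method names are made up.

```java
import java.util.List;

// Hypothetical illustration of LEAD/LAG over a single, already ordered partition.
final class LeadLag {
    private LeadLag() {}

    /** Value `offset` rows after `index`, or `defaultValue` if that row is outside the partition. */
    static <T> T lead(List<T> orderedPartition, int index, int offset, T defaultValue) {
        return valueAt(orderedPartition, index + offset, defaultValue);
    }

    /** Value `offset` rows before `index`, or `defaultValue` if that row is outside the partition. */
    static <T> T lag(List<T> orderedPartition, int index, int offset, T defaultValue) {
        return valueAt(orderedPartition, index - offset, defaultValue);
    }

    private static <T> T valueAt(List<T> rows, int target, T defaultValue) {
        boolean inside = target >= 0 && target < rows.size();
        return inside ? rows.get(target) : defaultValue;
    }
}
```

For a partition holding three fire dates, lead at index 0 with offset 1 gives the second date, while lead at index 2 falls off the end and gives the default.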

Converting SQL with subselect in select to HQL

I have the following SQL that I am having problems converting to HQL. An NPE is being thrown, which I think has something to do with the SUM function. Also, I'd like to sort on the subselect alias; is this possible?
SQL (subselect):
SELECT q.title, q.author_id,
(SELECT IFNULL(SUM(IF(vote_up=true,1,-1)), 0)
FROM vote WHERE question_id = q.id) AS votecount
FROM question q ORDER BY votecount DESC
HQL (not working)
SELECT q,
(SELECT COALESCE(SUM(IF(v.voteUp=true,1,-1)), 0)
FROM Vote v WHERE v.question = q) AS votecount
FROM Question AS q
LEFT JOIN q.author u
LEFT JOIN u.blockedUsers ub
WHERE q.dateCreated BETWEEN :week AND :now
AND u.id NOT IN (
SELECT ub.blocked FROM UserBlock AS ub WHERE ub.blocker = :loggedInUser
)
AND (u.blockedUsers IS EMPTY OR ub.blocked != :loggedInUser)
ORDER BY votecount DESC

Here is the working HQL if anyone is interested:
SELECT q,
(SELECT COALESCE(SUM(CASE v.voteUp WHEN true THEN 1 ELSE -1 END), 0)
FROM Vote v WHERE v.question = q) AS votecount
FROM Question AS q
LEFT JOIN q.author u
LEFT JOIN u.blockedUsers ub
WHERE q.dateCreated BETWEEN :week AND :now
AND u.id NOT IN (
SELECT ub.blocked FROM UserBlock AS ub WHERE ub.blocker =:loggedInUser
)
AND (u.blockedUsers IS EMPTY OR ub.blocked !=:loggedInUser)
ORDER BY col_1_0_ DESC
Notice the ORDER BY col_1_0_.
There is an open issue with Hibernate: it does not correctly parse aliases, and since the aliases are renamed in the query, an error will be thrown. So col_1_0_ is basically a workaround; it's the name Hibernate generates.
See issue:
http://opensource.atlassian.com/projects/hibernate/browse/HHH-892
