SQL query in postgres database using left joins - java

Okay, so I've done some research and apparently a left join can return more than one row for each row on the left when the table joined on the right has several matches.
My query is:
SELECT
ord.ID AS ord_id,
oli.sfid AS oli_sfid,
ord.HasMSISDN__c AS ord_HasMSISDN__c,
ord.dealer_code__c AS ord_dealer_code__c,
ord.recordtypeid AS ord_recordtypeid,
ord.order_number__c AS ord_order_number__c,
ord.status AS ord_status,
ord.opportunityid AS ord_opportunityid,
ord.sfid AS ord_sfid,
ord.cancelled_by__c AS ord_cancelled_by__c,
ord.cancelled_on__c AS ord_cancelled_on__c,
ord.created_by__c AS ord_created_by__c,
ord.created_on__c AS ord_created_on__c,
ord.docusign_email_address__c AS ord_docusign_email_address__c,
ord.esignature_resent_to__c AS ord_esignature_resent_to__c,
ord.esignature_resent_by__c AS ord_esignature_resent_by__c,
ord.esignature_resent_on__c AS ord_esignature_resent_on__c,
ord.pricebook2id AS ord_pricebook2id,
cont.opportunity__c AS cont_opportunity__c,
cont.sfid AS cont_sfid,
opp.isclosed AS opp_isclosed,
opp.sfid AS opp_sfid,
opp.recordtypeid AS opp_recordtypeid,
opp.pricebook2id AS opp_pricebook2id,
accban.sfid AS accban_sfid,
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid
FROM fullsbxsalesforce.order ord
LEFT JOIN fullsbxsalesforce.contract cont ON ord.contractid = cont.sfid
LEFT JOIN fullsbxsalesforce.opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN fullsbxsalesforce.user usr ON (ord.dealer_code__c = usr.dealer_code_bd__c OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c OR ord.dealer_code__c = usr.Rep_Dealer_Code__c OR ord.dealer_code__c = usr.dealer_code_secondary__c)
LEFT JOIN fullsbxsalesforce.account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
LEFT JOIN fullsbxsalesforce.orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
Initially, I was under the impression that this would return 1 row. I was mistaken: it returns 3 rows, because there are 3 different OLIs attached to the order. How can I change my logic so that I either get back a collection of OLIs for that same order, or only the first OLI, so that I'm not dealing with 3 duplicate rows?

If you want your rows returned in the same order, just add an ORDER BY <col_name_list> clause at the very end of your query.
Is an OLI a unique value? Which table defines an OLI?
If you always want this query to return a single row, just add LIMIT 1 to the end of your query.
If your query returns multiple OLI's and you only want one row per OLI, then you can use a window function:
SELECT ...
FROM (
-- Your initial query with new field added
SELECT ...
ROW_NUMBER() OVER(PARTITION BY OLI_field_name ORDER BY <ordering_clause>) AS RowRank
FROM ...
) src
WHERE RowRank = 1
This will return one row per <OLI_field_name>.
Update
If you want to just have one row per OLI and keep all the detailed info, use the window function method. Something like this:
SELECT *
FROM (
SELECT
ord.ID AS ord_id,
oli.sfid AS oli_sfid,
ord.HasMSISDN__c AS ord_HasMSISDN__c,
ord.dealer_code__c AS ord_dealer_code__c,
ord.recordtypeid AS ord_recordtypeid,
ord.order_number__c AS ord_order_number__c,
ord.status AS ord_status,
ord.opportunityid AS ord_opportunityid,
ord.sfid AS ord_sfid,
ord.cancelled_by__c AS ord_cancelled_by__c,
ord.cancelled_on__c AS ord_cancelled_on__c,
ord.created_by__c AS ord_created_by__c,
ord.created_on__c AS ord_created_on__c,
ord.docusign_email_address__c AS ord_docusign_email_address__c,
ord.esignature_resent_to__c AS ord_esignature_resent_to__c,
ord.esignature_resent_by__c AS ord_esignature_resent_by__c,
ord.esignature_resent_on__c AS ord_esignature_resent_on__c,
ord.pricebook2id AS ord_pricebook2id,
cont.opportunity__c AS cont_opportunity__c,
cont.sfid AS cont_sfid,
opp.isclosed AS opp_isclosed,
opp.sfid AS opp_sfid,
opp.recordtypeid AS opp_recordtypeid,
opp.pricebook2id AS opp_pricebook2id,
accban.sfid AS accban_sfid,
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid,
ROW_NUMBER() OVER(PARTITION BY oli.sfid ORDER BY <order_col>) AS RowRank -- Assigns a rank to each row with the same oli.sfid value
FROM fullsbxsalesforce.order ord
LEFT JOIN fullsbxsalesforce.contract cont ON ord.contractid = cont.sfid
LEFT JOIN fullsbxsalesforce.opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN fullsbxsalesforce.user usr ON (ord.dealer_code__c = usr.dealer_code_bd__c OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c OR ord.dealer_code__c = usr.Rep_Dealer_Code__c OR ord.dealer_code__c = usr.dealer_code_secondary__c)
LEFT JOIN fullsbxsalesforce.account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
LEFT JOIN fullsbxsalesforce.orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
) src
WHERE RowRank = 1 -- Only get one row per oli.sfid value
This assumes that oli.sfid is the OLI ID
Just change the outer SELECT * to return all the fields except RowRank. Also, modify the <order_col> value to determine which row you want to return for each oli.sfid.
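If what you actually want is the other option from the question (one order with a collection of its OLIs), the repeated rows can also be folded together on the Java side after the query runs. Below is a minimal sketch of that idea, assuming each result row has already been mapped to a small holder object; the Row class and the groupOliByOrder method are hypothetical names for illustration, not part of any driver or ORM API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderRowGrouper {

    /** Hypothetical holder for one joined row; only the two columns needed here. */
    static final class Row {
        final String ordSfid;
        final String oliSfid; // null when the LEFT JOIN found no order item

        Row(String ordSfid, String oliSfid) {
            this.ordSfid = ordSfid;
            this.oliSfid = oliSfid;
        }
    }

    /** Folds the repeated order rows into: order id -> list of OLI ids. */
    static Map<String, List<String>> groupOliByOrder(List<Row> rows) {
        Map<String, List<String>> byOrder = new LinkedHashMap<>();
        for (Row row : rows) {
            List<String> olis = byOrder.computeIfAbsent(row.ordSfid, k -> new ArrayList<>());
            if (row.oliSfid != null) {
                olis.add(row.oliSfid); // an order without items keeps an empty list
            }
        }
        return byOrder;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("ORD-1", "OLI-A"),
                new Row("ORD-1", "OLI-B"),
                new Row("ORD-1", "OLI-C"));
        System.out.println(groupOliByOrder(rows)); // {ORD-1=[OLI-A, OLI-B, OLI-C]}
    }
}
```

The order-level columns are identical on all three rows, so they can be read from any one of them; only the OLI column differs.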

Not very nice, but you can use a subquery as a table in your FROM clause instead of the plain left join with oli. In PostgreSQL the subquery needs LATERAL to be able to reference ord:
LEFT JOIN LATERAL (SELECT * FROM fullsbxsalesforce.orderitem WHERE orderid = ord.sfid LIMIT 1) oli ON true

I would question whether or not you only want one record? I see two potential answers:
1. You want all the Order Items
That you are selecting from the OrderItems table implies you want records from there. If three records match, it seems illogical to arbitrarily ignore some of them.
(I've seen this, and done it, but it is indicative of a problem)
2. You don't want Order Items at all
This seems more likely. That you are willing to discard the data altogether implies that you don't actually want it in the first place. If you don't want it, just don't include the table at all.
Conclusion
Looking at your query, I am guessing you are in case #2. The issue is more likely that you have included a table unnecessarily.
Where you say oli.sfid AS oli_sfid, did you mean to get the sfid of the order? If so, you no longer need the join on OrderItems.
If after reading all of that, you are still sure you want just one, totally arbitrary, item from the order, order by and limit (as suggested by others) are the solution.
3. Edit: A third scenario
After reading a comment by the OP: if all that is being attempted is to verify that OrderItems exist, aggregation may be another way to go:
SELECT ord.ID AS ord_id,
count(oli.sfid) AS oli_count, -- counts matched items only; count(*) would report 1 for an order with no items
ord.HasMSISDN__c AS ord_HasMSISDN__c,
...
accban.ban__c AS accban_ban__c,
usr.sfid AS usr_sfid
FROM order ord
...
LEFT JOIN orderitem oli ON ord.sfid = oli.orderid
WHERE ord.sfid = 'SPECIFIC ID'
GROUP BY
ord.ID,
ord.HasMSISDN__c,
...
accban.ban__c,
usr.sfid
Though in this case, the group by would be difficult and painful to maintain.
If the objective is to ignore Orders that do not have items associated with them, then you don't want a left join, you want an inner join:
SELECT ord.ID AS ord_id,
...
oli.sfid AS oli_sfid,
usr.sfid AS usr_sfid
FROM fullsbxsalesforce.order ord
INNER JOIN orderitem oli ON ord.sfid = oli.orderid
LEFT JOIN contract cont ON ord.contractid = cont.sfid
LEFT JOIN opportunity opp ON cont.opportunity__c = opp.sfid
LEFT JOIN user usr ON ord.dealer_code__c = usr.dealer_code_bd__c
OR ord.dealer_code__c = usr.Dealer_Code_Co_Sell__c
OR ord.dealer_code__c = usr.Rep_Dealer_Code__c
OR ord.dealer_code__c = usr.dealer_code_secondary__c
LEFT JOIN account_ban_tax_id__c accban ON ord.ban_number__c = accban.ban__c
WHERE ord.sfid = 'SPECIFIC ID'
Note the change of position of orderitem in the table sequence.

Related

How to select multiple columns in subquery using JPA Criteria

I am trying to write the below query using JPA criteria but I am not able to select the multiple columns in a subquery.
SELECT a.id, a.firstName, a.lastName
FROM PORTRAIT a
JOIN (SELECT firstName, lastName
FROM PORTRAIT
GROUP BY firstName, lastName
HAVING count(id) > 1 ) b
ON b.firstName = a.firstName
AND b.lastName = a.lastName
ORDER BY a.lastName asc
or
SELECT a.id, a.FIRSTNAME, a.LASTNAME
FROM PORTRAIT a where exists (
SELECT b.firstName, b.lastName
FROM PORTRAIT b
WHERE b.firstName = a.firstName
AND b.lastName = a.lastName
GROUP BY b.firstName, b.lastName
HAVING count(b.id) > 1
)
I am stuck in the middle of my implementation below, where I cannot find out how to select multiple columns in the subquery. Please see my comment in the code (on the 3rd line).
Subquery<PortraitVO> portraitSubQuery = criteriaQuery.subquery(PortraitVO.class);
Root<PortraitVO> portraitRoot = portraitSubQuery.from(PortraitVO.class);
portraitSubQuery.select(portraitRoot); // Here I want to select multiple columns
portraitSubQuery.where(criteriaBuilder.and(criteriaBuilder.equal(portraitRoot.get(RestServiceConstants.FIRST_NAME), root.get(RestServiceConstants.FIRST_NAME)), criteriaBuilder.equal(portraitRoot.get(RestServiceConstants.LAST_NAME), root.get(RestServiceConstants.LAST_NAME))));
List<String> columnNames = new ArrayList<String>();
columnNames.add(RestServiceConstants.FIRST_NAME);
columnNames.add(RestServiceConstants.LAST_NAME);
List<Expression<?>> columnNamesExpression = columnNames.stream().map(x -> portraitRoot.get(x))
.collect(Collectors.toList());
portraitSubQuery.groupBy(columnNamesExpression);
portraitSubQuery.having(criteriaBuilder.gt(criteriaBuilder.count(portraitRoot), 1));
Please help me with this problem.
It doesn't make any sense to select two columns in a subselect used inside an exists. And I'm not sure if it is legal.
Just use one of the columns or a literal.
JPA does not allow selecting multiple expressions in a subquery, and in fact selecting anything in a subquery that is wrapped with exists is pointless. Just select a constant value, e.g. the integer 1.

How can I count a subquery with group by

I'm new here and I'm French, so please excuse my English.
I have a grid with an HQL source.
One HQL query returns the count of records, and another query returns the data.
But the result of the count query isn't equal to the number of records returned by the data query.
The data query (returns 26 records):
SELECT
resou.res_book.res_book_id as record,
strset.str_set_cd as room,
concat(coalesce(person.firstname, ' '), ' ', coalesce(person.lastname, ' ')) as name,
prod_h.d_from_d as start_date,
prod_h.d_to_d as end_date,
(SUM(prod_h.amt_total_ivat) - SUM(pay_l.amt_paymt)) as total
FROM Com_site as site
inner join site.com_bu as bu
inner join bu.com_activ as activ
inner join activ.inv_head as head
inner join head.inv_person as person
inner join person.res_rooming as rooming
inner join rooming.res_resou as resou
inner join resou.str_set as strset
inner join person.inv_prod_h as prod_h
inner join head.inv_pay_l as pay_l
inner join prod_h.pdt_prod as prod
WHERE
head.tp_folio_tp = 0 and
prod_h.d_to_d <= '2020-07-01' and
site.com_site_id = 1 and
prod_h.inv_accou_itm.inv_accou_itm_id is null and
prod.b_rent_bl = false and
'2020-07-01' between resou.d_from_d and resou.d_to_d
GROUP BY
head.inv_head_id
, person.firstname
, person.lastname
, strset.str_set_cd
, resou.res_book.res_book_id
, prod_h.d_from_d
, prod_h.d_to_d
, head.tp_folio_tp
, prod.pdt_prod_id
And the count query (returns 1 record, but its value is 316, not 26):
select
(
SELECT
COUNT(head.inv_head_id) as counted
from site.com_bu as bu
inner join bu.com_activ as activ
inner join activ.inv_head as head
inner join head.inv_person as person
inner join person.res_rooming as rooming
inner join rooming.res_resou as resou
inner join resou.str_set as strset
inner join person.inv_prod_h as prod_h
inner join head.inv_pay_l as pay_l
inner join prod_h.pdt_prod as prod
WHERE
head.tp_folio_tp = 0 and
prod_h.d_to_d <= '2020-07-01' and
site.com_site_id = 1 and
prod_h.inv_accou_itm.inv_accou_itm_id is null and
prod.b_rent_bl = false and
'2020-07-01' between resou.d_from_d and resou.d_to_d
) as counted
from Com_site as site
I tried:
COUNT(COUNT(head.inv_head_id)), COUNT(*the subquery here*)
but nothing works...
Can anyone please help me?
Thank you in advance.
I don't quite understand why you want to determine the count first and then query all the data. You could determine the count by counting the number of fetched entities, so there is no need for a second query.
In case you are trying to implement pagination, I can recommend you take a look at Blaze-Persistence, which has excellent support for creating efficient count queries from a base query and for bounded counting, and also supports various pagination techniques: https://persistence.blazebit.com/documentation/core/manual/en_US/index.html#pagination
Thank you for helping me! I found that my query was totally wrong, so I rewrote it without the group by, and it works! The first query now returns the correct count of records for the second query.

How to query for number of records in select with "group by" clause in JPA/EclipseLink?

Suppose you have the following JPA query:
select car.year, car.month, count(car) from Car car group by car.year, car.month
Before we query for results, we need to know how many records this query will return (for pagination, UI and so on). In other words we need something like this:
select count(*) from
(select car.year, car.month, count(car)
from Car car group by car.year, car.month)
But JPA/EclipseLink does not support subqueries in the "from" clause. Is there a way around it?
(Of course you can use plain SQL and native queries, but this is not an option for us)
A portable JPA solution:
select count(*) from Car c where c.id in
(select MIN(car.id) from Car car group by car.year, car.month)
You could also go with something like:
select COUNT(DISTINCT CONCAT(car.year, '#', car.month)) from Car car
but I expect this to be less performant due to operations with textual values.
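Both queries rely on the same observation: the grouped query returns one row per distinct (year, month) pair, so the number needed for pagination is the number of distinct pairs. Here is a small in-memory sketch of that in plain Java, with no JPA involved; the Car class below is a hypothetical stand-in for the entity, reduced to the two grouping columns.

```java
import java.util.Arrays;
import java.util.List;

final class GroupCountSketch {

    /** Hypothetical stand-in for the Car entity; only the grouping columns. */
    static final class Car {
        final int year;
        final int month;

        Car(int year, int month) {
            this.year = year;
            this.month = month;
        }
    }

    /** Number of rows a "group by year, month" query would return. */
    static long countGroups(List<Car> cars) {
        return cars.stream()
                .map(c -> c.year + "#" + c.month) // same key idea as CONCAT(year, '#', month)
                .distinct()
                .count();
    }

    /** Number of distinct years only, which is generally a different number. */
    static long countYears(List<Car> cars) {
        return cars.stream().map(c -> c.year).distinct().count();
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
                new Car(2019, 1), new Car(2019, 1), new Car(2019, 2), new Car(2020, 1));
        System.out.println(countGroups(cars)); // 3
        System.out.println(countYears(cars));  // 2
    }
}
```

Counting only one of the grouping columns undercounts as soon as a year has more than one month, which is why the key has to cover every column in the group by.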
What about:
select count(distinct car.year) from Car car
I have another approach to this issue. With it, you don't need to know the number of rows the query is going to return.
Here is the solution. You are going to need two variables:
1) pageNo (1 for the first request to the database, then incremented for each following request: 2, 3, 4, 5, ...).
2) pageSize.
int start = 0;
if (pageNo != null && pageSize != null) {
    start = (pageNo - 1) * pageSize;
} else {
    pageSize = StaticVariable.MAX_PAGE_SIZE; // default page size if pageNo or pageSize is missing
}
Your query:
em.createQuery("your query")
    .setFirstResult(start)
    .setMaxResults(pageSize)
    .getResultList();
As @chris pointed out, EclipseLink supports subqueries, but the subquery can't be the first item in the from clause.
So I came up with the following workaround which is working:
select count(1) from Dual dual,
(select car.year, car.month, count(car)
from Car car group by car.year, car.month) data
count(1) is important, as count(data) would not work.
You have to add an entity Dual (if your database does not have a DUAL table, create one with just one record).
This works, but I still consider it a workaround that only works if you are allowed to create the DUAL table.
You can simply use setFirstResult and setMaxResults to bound the records a query returns, and use the size of the returned list as the count of records for that page, like this:
Query query = em.createQuery("SELECT d.name, COUNT(t) FROM Department d JOIN "
        + "d.teachers t GROUP BY d.name");
// query.setFirstResult(5);
// query.setMaxResults(15); // would return at most 15 records, starting at index 5 (records 5 to 19)
List<Object[]> results = query.getResultList();
for (int i = 0; i < results.size(); i++) {
Object[] arr = results.get(i);
for (int j = 0; j < arr.length; j++) {
System.out.print(arr[j] + " ");
}
System.out.println();
}
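To make the windowing behaviour concrete: setFirstResult takes a zero-based start position and setMaxResults takes a maximum row count, not an end position. The sketch below mimics that on an in-memory list in plain Java; the window method is a hypothetical helper for illustration and does not call JPA.

```java
import java.util.ArrayList;
import java.util.List;

final class ResultWindowSketch {

    /** Mimics setFirstResult(first) + setMaxResults(max) on an in-memory list. */
    static <T> List<T> window(List<T> all, int first, int max) {
        int from = Math.min(first, all.size());
        // Use long arithmetic so a very large max cannot overflow.
        int to = (int) Math.min((long) from + max, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 30; i++) {
            rows.add(i);
        }
        List<Integer> page = window(rows, 5, 15);
        System.out.println(page.size());                // 15
        System.out.println(page.get(0));                // 5
        System.out.println(page.get(page.size() - 1));  // 19
    }
}
```

Note that the size of the returned list is only the size of the current page; it matches the total number of groups only when no window is applied.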
-----Updated Section------
JPA does not support sub-selects in the FROM clause, but the current EclipseLink 2.4 milestone builds do have this support.
See, http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Basic_JPA_Development/Querying/JPQL#Sub-selects_in_FROM_clause
You can probably rewrite the query with just normal joins though.
Maybe,
Select a, size(a.bs) from A a
or
Select a, count(b) from A a join a.bs b group by a
I hope this helps you.

JPA 2 criteria API: How to select values from various joined tables without using Metamodel?

I have the following type of query that I wish to convert into a jpa criteria query and the following table structure:
Table A   1-->1   Table B   1<--*   Table C (proceedings)   *-->1   Table D (proceedingsstatus)
--------          --------          ---------------------           ---------------------------
aID               bID               cID                             dID
...               ....              timestamp                       textValue
f_bID             ....              f_bID
                                    f_dID
1 A has 1 B, 1 B has many proceedings and each proceeding has a proceedingstatus.
SELECT a.*
FROM ((a LEFT JOIN b ON a.f_b = b.id)
LEFT JOIN proceedings ON b.id = proceedings.f_b)
RIGHT JOIN proceedingsstatus ON proceedings.f_d = proceedingsstatus.id
WHERE proceedingsstatus.textValue IN ('some unique text')
AND proceedings.timestamp BETWEEN 'somedate' AND 'anotherdate'
When I now try to do something like this for the predicates:
Predicate conditions = (root.join("tableB")
.joinList("proceedings")
.join("proceedingsstatus").get("textValue"))
.in(constraintList.getSelectedValues());
Predicate time = cb.between((root.join("tableB")
.joinList("proceedings")
.<Date>get("timestamp")), dt1.toDate(), dt2.toDate());
constraints = cb.and(conditions, time);
Right now it selects entries A where there is at least 1 occurrence of the right proceedingsstatus according to the conditions-predicate if in any of A's proceedings the 'timestamp' matches the time-predicate that I built.
So it would also select an entry A when C.timestamp is correct for a proceeding with the wrong textValue in D, if there is at least one entry C belonging to A with the right textvalue in D.
How can I change it so that it only selects A's where the same proceeding has both the right proceedingsstatus value AND a timestamp in the given range?
Reuse the joins instead of creating new ones for each predicate.
Join proceedings = root.join("tableB").joinList("proceedings");
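The reason this matters: every call to join() adds a separate join, so two predicates built on two separate joins can each be satisfied by a different proceeding. Building both predicates on the same Join object forces them onto the same row. Here is an in-memory analogy in plain Java (no JPA; the Proceeding class and both method names are hypothetical, and the timestamp is simplified to a day number):

```java
import java.util.Arrays;
import java.util.List;

final class SharedJoinSketch {

    /** Hypothetical, simplified proceeding: a status text and a day number. */
    static final class Proceeding {
        final String status;
        final int day;

        Proceeding(String status, int day) {
            this.status = status;
            this.day = day;
        }
    }

    /** Two independent joins: each condition may match a different proceeding. */
    static boolean separateJoins(List<Proceeding> ps, String status, int from, int to) {
        boolean anyStatus = ps.stream().anyMatch(p -> p.status.equals(status));
        boolean anyTime = ps.stream().anyMatch(p -> p.day >= from && p.day <= to);
        return anyStatus && anyTime;
    }

    /** One shared join: both conditions must hold on the same proceeding. */
    static boolean sharedJoin(List<Proceeding> ps, String status, int from, int to) {
        return ps.stream()
                .anyMatch(p -> p.status.equals(status) && p.day >= from && p.day <= to);
    }

    public static void main(String[] args) {
        List<Proceeding> ps = Arrays.asList(
                new Proceeding("open", 1),     // right status, wrong time
                new Proceeding("closed", 10)); // wrong status, right time
        System.out.println(separateJoins(ps, "open", 5, 15)); // true
        System.out.println(sharedJoin(ps, "open", 5, 15));    // false
    }
}
```

The first method reproduces the unwanted behaviour described in the question; the second is what reusing the join gives you.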

Ordering by a max or a min from another table

I have a table that consists of a unique id, and a few other attributes. It holds "schedules". Then I have another table that holds a list of all the times each schedule has or will "fire". This isn't the exact schema, but it's close:
create table schedule (
id varchar(40) primary key,
attr1 int,
attr2 varchar(20)
);
create table schedule_times (
id varchar(40) references schedule(id),
fire_date date
);
I want to query the schedule table, getting the attributes and the next and previous fire_dates, in Java, sometimes ordering on one of the attributes, but sometimes ordering on either previous fire_date or the next fire_date. Ordering by the attributes is easy, I just stick an "order by" into the string while I'm building my prepared statement. I'm not even sure how to go about selecting the last fire_date and the next one in a single query - I know that I can find the next fire_date for a given id by doing a
SELECT min(fire_date)
FROM schedule_times
WHERE id = ? AND
fire_date > sysdate;
and the similar thing for previous fire_date using max() and fire_date < sysdate. I'm just drawing a blank on how to incorporate that into a single select from the schedule so I can get both next and previous fire_date in one shot, and also how to order by either of those attributes.
You can do that using two sub-queries in Left Joins.
This has the advantage of returning NULL for your fire_dates if there is no next/previous schedule.
Select id, attr1, attr2, next_fire_date, previous_fire_date
From schedule s
Left Join ( Select id, Min(fire_date) As next_fire_date
From schedule_times st
Where st.fire_date > Sysdate
Group By id ) n
On ( n.id = s.id )
Left Join ( Select id, Max(fire_date) As previous_fire_date
From schedule_times st
Where st.fire_date < Sysdate
Group By id ) p
On ( p.id = s.id )
You can then add your ORDER BY on next_fire_date or previous_fire_date.
If performance matters, create a compound index on schedule_times( id, fire_date ), this will allow the sub-queries to read only from this index.
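For reference, this is what the two sub-queries compute for one schedule, written as a plain Java sketch over an in-memory sorted set instead of the database. The class and method names are hypothetical; TreeSet.higher and TreeSet.lower return null when there is no such element, which lines up with the NULLs produced by the left joins.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.TreeSet;

final class FireDateSketch {

    /** Equivalent of min(fire_date) where fire_date > now; null if none. */
    static LocalDate next(TreeSet<LocalDate> fireDates, LocalDate now) {
        return fireDates.higher(now);
    }

    /** Equivalent of max(fire_date) where fire_date < now; null if none. */
    static LocalDate previous(TreeSet<LocalDate> fireDates, LocalDate now) {
        return fireDates.lower(now);
    }

    public static void main(String[] args) {
        TreeSet<LocalDate> fireDates = new TreeSet<>(Arrays.asList(
                LocalDate.of(2020, 1, 1),
                LocalDate.of(2020, 3, 1),
                LocalDate.of(2020, 6, 1)));
        LocalDate now = LocalDate.of(2020, 4, 1);
        System.out.println(next(fireDates, now));     // 2020-06-01
        System.out.println(previous(fireDates, now)); // 2020-03-01
    }
}
```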
Try something like this:
select schedule.*,
(
select max(si.fire_date) from schedule_times si where si.id = schedule.id and si.fire_date < sysdate
) as prevfire,
(
select min(si.fire_date) from schedule_times si where si.id = schedule.id and si.fire_date > sysdate
) as nextfire
from schedule
where id = ?
order by attr1
Modify the query to
SELECT Distinct S.ID, ATTR1, ATTR2,
LEAD(FIRE_DATE, 1, SYSDATE) OVER (PARTITION BY S.ID ORDER BY FIRE_DATE) NEXT_FIRE_DATE,
LAG(FIRE_DATE, 1, SYSDATE) OVER (PARTITION BY S.ID ORDER BY FIRE_DATE) PREV_FIRE_DATE
FROM SCHEDULE S,SCHEDULE_TIMES ST WHERE ST.ID=S.ID;
This is a simple query; you can try it.
LEAD has the ability to compute an expression on the next rows (rows which are going to come after the current row) and return the value to the current row. The general syntax of LEAD is shown below:
LEAD (sql_expr, offset, default) OVER (analytic_clause)
sql_expr is the expression to compute from the leading row.
offset is the index of the leading row relative to the current row. offset is a positive integer with default 1.
default is the value to return if the offset points to a row outside the partition range.
The syntax of LAG is similar except that the offset for LAG goes into the previous rows.
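To illustrate the three parameters, here is a minimal plain-Java sketch of the same lookup over one partition that is assumed to be already sorted by the ORDER BY of the analytic clause. The lead and lag methods are hypothetical helpers written for this example, not a database or library API.

```java
import java.util.Arrays;
import java.util.List;

final class LeadLagSketch {

    /** Value 'offset' rows after 'current', or defaultValue outside the partition. */
    static <T> T lead(List<T> partition, int current, int offset, T defaultValue) {
        int target = current + offset;
        boolean inside = target >= 0 && target < partition.size();
        return inside ? partition.get(target) : defaultValue;
    }

    /** Value 'offset' rows before 'current', or defaultValue outside the partition. */
    static <T> T lag(List<T> partition, int current, int offset, T defaultValue) {
        return lead(partition, current, -offset, defaultValue);
    }

    public static void main(String[] args) {
        List<Integer> partition = Arrays.asList(10, 20, 30); // already ordered
        System.out.println(lead(partition, 1, 1, -1)); // 30
        System.out.println(lag(partition, 1, 1, -1));  // 10
        System.out.println(lead(partition, 2, 1, -1)); // -1 (no following row)
        System.out.println(lag(partition, 0, 1, -1));  // -1 (no preceding row)
    }
}
```

The result depends entirely on how the partition is ordered, so the ORDER BY inside OVER (...) should be on the column whose neighbours you want, here the fire date.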
