Related
I have 2 tables Customer and Orders.
1st question:
That is a master table for Customers that have a few columns like Customer number, customer name, Active flag, etc. Table may contain 2 or more records for the same customer number but as per the business logic only 1 records at a time should ideally be ACTIVE. I need to find customers that have only 1 record and it should be active.
query that I have written:
select customer_number, count(*)
from customers c
where active = false
group by customer_number
having count(*) = 1;
This returns me the customers that have 2 records and only 1 is NOT ACTIVE.
Question 2:
Apart from customer table we have another table that is Orders table, it contains columns like Customer number(as same in Customers table), deliver date, order number, insert time.
I need to find the customers whose ACTIVE is false, and have not given any orders since 180 days. (INSERT TIME::date - 180).
what I have tried is not giving me the desired output, as on back testing I found that the data is wrong
select om.customer_number,
c.customer_name,
om.deliverydate,
om.insert_time
from customers c, order_master om
where
om.customer_number in
(
select c2.customer_number
from customers c2
where c2.active = false
group by c2.customer_number
having count(*) =1
)
and c.customer_number = om.customer_number
group by om.customer_number, c.customer_name,
om.deliverydate, om.insert_time
having max(om.insert_time::date) < '2022-06-01' ;
The queries that I have tried, I have already mentioned them in my question. Please check that.
For the first question, find customers that have only 1 record and it should be active , you may use conditional aggregation or filtered count as the following:
select customer_number
from Customers c
group by customer_number
having count(*) = 1 and count(*) filter (where active) = 1;
For the second question, find the customers whose ACTIVE is false, and have not given any orders since 180 days, try the following:
select cu.customer_number
from order_master om join
(
select customer_number
from Customers c
group by customer_number
having count(*) filter (where active) = 0
) cu
on om.customer_number = cu.customer_number
group by cu.customer_number
having max(om.insert_time) < current_date - interval '180 day'
See a demo.
If you want to get all order details for the inactive customers, you may join the above query with the orders table as the following:
with inactive_cust as
(
select cu.customer_number, cu.customer_name
from order_master om join
(
select customer_number, customer_name
from Customers c
group by customer_number, customer_name
having count(*) filter (where active) = 0
) cu
on om.customer_number = cu.customer_number
group by cu.customer_number, cu.customer_name
having max(om.insert_time) < current_date - interval '180 day'
)
select c.customer_number, c.customer_name,
o.order_number, o.insert_time
from inactive_cust c join order_master o
on c.customer_number = o.customer_number
See a demo.
#Ahmed- Both of your queries worked fine.
However in the 2nd query I want to fetch additional data into it, so what I did was -
select om.customer_number, cu.customer_name, om.order_number ,om.insert_time
from order_master om join
(
select customer_number, customer_name
from Customers c
group by customer_number, customer_name
having count(*) filter (where active) = 0
) cu
on om.customer_number = cu.customer_number
group by om.customer_number , cu.customer_name, om.insert_time,om.order_number
having max(om.insert_time) < current_date - interval '180 day';
When I tried the query shared by you -
select om.customer_number
from order_master om join
(
select customer_number
from Customers c
group by customer_number
having count(*) filter (where active) = 0
) cu
on om.customer_number = cu.customer_number
group by om.customer_number
having max(om.insert_time) < current_date - interval '180 day';
Its giving me around 4K results, and when I am trying with my modifications, so after adding each column in the query the result count is increasing exponentially till 75K and more.
Also its showing me records for which max(om.insert_time) is much greater than 180 days
My query looks like this:
SELECT
nvl(dd,'TOTAL') "Subject",
SUM(cnt) "Count,
SUM(pct) AS "%"
FROM
(
SELECT
dd,
COUNT(1) cnt,
round(RATIO_TO_REPORT(COUNT(1) ) OVER() * 100,2) AS pct
FROM
student p,
student_subject a
WHERE
p.sId = a.sId
AND student_type IN (
'1',
'2'
)
AND dd IN (
'MATH',
'SCIENCE',
'HISTORY'
)
GROUP BY
dd
ORDER BY
1
)
GROUP BY
ROLLUP(dd)
ORDER BY
1;
My Output should look like this:
Subject Count %
MATH 33 23.2%
SCIENCE 24 11.46%
HISTORY 56 44.778%
TOTAL 113 85.4.2%
If a particular subject doesnt have data it should still provide the row with 0 values like below
Subject Count %
MATH 33 23.20%
SCIENCE 0 0.00%
HISTORY 56 44.77%
TOTAL 113 85.42%
What I am getting rightnow is below with no SCIENCE row which is not desired ,
Subject Count %
MATH 33 23.20%
HISTORY 56 44.77%
TOTAL 113 85.42%
What I did is I removed the dd IN clause "AND dd IN (
'MATH',
'SCIENCE',
'HISTORY'
)"
However I am not able to get to the another inner select to select the 3 subjects.
If i understand the datamodel correctly when a student is not enrolled to a subject an entry for the subject wouldn't exist in student_subject table, which means the missing subject is not present in the deficit table as well. Hence technically it is not possible to join these two tables and report for a column value that doesn't exist in either of them.
Now to solve this,i use WITH clause to create another table to hold all the desired subjects and perform an outer join with the result set retrieved.
I have tested this and it works perfectly. Complete solution(Oracle 18c) with table and Query can be found in DBFIDDLE URL https://dbfiddle.uk/?rdbms=oracle_18&fiddle=df73453d7fa4e0478e74fa509b20a411.
WITH some_data AS (
SELECT 'MATH' AS subj
FROM dual
UNION ALL
SELECT 'SCIENCE' AS subj
FROM dual
UNION ALL
SELECT 'HISTORY' AS subj
FROM dual
)
SELECT
nvl(subj,'TOTAL') "Subject",
nvl(SUM(cnt),0) "Count",
nvl(SUM(pct),0) AS "%"
FROM
(SELECT
dd,
COUNT(1) cnt,
round(RATIO_TO_REPORT(COUNT(1) ) OVER() * 100,2) AS pct
FROM
student p,
student_subject a
WHERE
p.sId = a.sId
AND student_type IN (
'1',
'2'
)
AND dd IN (
'MATH',
'SCIENCE',
'HISTORY'
)
GROUP BY
dd
ORDER BY
1
) tab, some_data
where tab.dd(+) = some_data.subj
GROUP BY
ROLLUP(subj)
ORDER BY
1;
You need to use the list of tables as the inner view and use left join as follows:
SELECT NVL(DD, 'TOTAL') "Subject",
SUM(CNT) "Count",
SUM(PCT) AS "%"
FROM (
SELECT DD,
COUNT(1) CNT,
ROUND(RATIO_TO_REPORT(COUNT(1)) OVER() * 100, 2) AS PCT
FROM (
SELECT 'MATH' AS SUB FROM DUAL UNION ALL
SELECT 'SCIENCE' AS SUB FROM DUAL UNION ALL
SELECT 'HISTORY' AS SUB FROM DUAL
) SUBJECTS
LEFT JOIN STUDENT_SUBJECT A
ON SUBJECTS.SUB = A.DD
LEFT JOIN STUDENT P
ON P.SID = A.SID
WHERE STUDENT_TYPE IN (
'1','2'
)
GROUP BY DD
ORDER BY 1
)
GROUP BY ROLLUP(DD)
ORDER BY 1;
You have to use case statement to make it 0, If any of the subject is null then default it to 0. Let me know if you require a query. Am suggesting the logic so that you can try yourself.
I am trying to perform a query like the following, with selecting by a case statement and grouping by the same case statement..
Select USER,
(CASE
WHEN value between 0 AND 2 then '0-2'
WHEN value between 3 AND 4 then '3-4'
ELSE '5+'
END) as CASE_STATEMENT ,
SUM(value)
.....
Group by user, CASE_STATEMENT
using JPA 2.0 Criteria API, with Hibernate.
My test case looks like ...
CriteriaBuilder cb = em.getCriteriaBuilder()
CriteriaQuery cq = cb.createQuery(Tuple)
def root = cq.from(TestEntity)
def userGet = root.get('user')
def valueGet = root.get('value')
def caseExpr =
cb.selectCase()
.when(cb.between(valueGet, 0, 2), '0-2')
.when(cb.between(valueGet, 3, 4), '3-4')
.otherwise('5+')
def sumExpr = cb.sum(valueGet)
cq.multiselect([userGet, caseExpr, sumExpr])
cq.groupBy([userGet, caseExpr])
log(typedQuery.unwrap(Query).queryString)
List<Tuple> tuples = typedQuery.getResultList()
The log statement of the queryString reads
SELECT generatedAlias0.USER,
CASE
WHEN generatedAlias0.value BETWEEN 0 AND 2 THEN Cast(:param0 AS STRING)
WHEN generatedAlias0.value BETWEEN 3 AND 4 THEN Cast(:param1 AS STRING)
ELSE Cast(:param2 AS STRING)
END,
Sum(generatedAlias0.value)
FROM test AS generatedAlias0
GROUP BY generatedAlias0.USER,
CASE
WHEN generatedAlias0.value BETWEEN 0 AND 2 THEN Cast(
:param3 AS STRING)
WHEN generatedAlias0.value BETWEEN 3 AND 4 THEN Cast(
:param4 AS STRING)
ELSE Cast(:param5 AS STRING)
END
When calling the typedQuery.getResultList(), I get the following error statement
javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not extract ResultSet
Caused by: org.h2.jdbc.JdbcSQLException: Column "TESTENTITY0_.VALUE" must be in the GROUP BY list; SQL statement:
select testentity0_.user as col_0_0_, case when testentity0_.value between 0 and 2 then cast(? as varchar(255)) when testentity0_.value between 3 and 4 then cast(? as varchar(255)) else cast(? as varchar(255)) end as col_1_0_, sum(testentity0_.value) as col_2_0_ from test testentity0_ group by testentity0_.user , case when testentity0_.value between 0 and 2 then cast(? as varchar(255)) when testentity0_.value between 3 and 4 then cast(? as varchar(255)) else cast(? as varchar(255)) end [90016-194]
Is there something wrong with the way I am trying to group by the Expression? I have also tried grouping by alias names, and by number literals (1, 2)
Is there another way I can go about structuring the SQL to get the same results?
Thanks.
As the exception message suggests, the problem is related to the Group By statement at DBMS level. See: https://www.percona.com/blog/2019/05/13/solve-query-failures-regarding-only_full_group_by-sql-mode/
To solve the error, you must either
Set the Group By Mode of the underlying DBMS to a less restrictive level (MySQL allows to disable only-full-group-by, but H2 does not (you may try setting MODE=MYSQL in jdbc connection string)
or (better)
Add all columns that are part of the select statement to the GROUP BY statement or to an aggregate function as described above.
You should be able to build a nested query which fulfills the GROUP BY RESTRICTIONS.
For the rescue, there are some (maybe DBMS specific) aggregate
functions (at least in MySQL). To trick JPA and Hibernate to understand these, there are
several ways to achieve this, as described at
https://vladmihalcea.com/hibernate-sql-function-jpql-criteria-api-query/
and
https://vladmihalcea.com/the-jpa-entitymanager-createnativequery-is-a-magic-wand/
Edit
In contrast and addition to the statement above, the findings after discussion below are:
The exception is raised by the h2 driver in the org.h2.expression.ExpressionColumn class, while it's verifying the query syntax
The solution requires setting and referencing an alias in the query (at the case statement or subquery), which is currently not possible in Criteria API (see column aliases usually can't be referenced in the query itself)
A workaround would be creating of a NativeQuery like this:
List<Tuple> tuples = em.createNativeQuery(
"SELECT generatedAlias0.USER, " +
" CASE " +
" WHEN generatedAlias0.value BETWEEN 0 AND 2 THEN Cast(:param0 AS VARCHAR) " +
" WHEN generatedAlias0.value BETWEEN 3 AND 4 THEN Cast(:param1 AS VARCHAR) " +
" ELSE Cast(:param2 AS VARCHAR) " +
" END c, " +
" Sum(generatedAlias0.value) as sumvalue " +
"FROM test AS generatedAlias0 " +
"GROUP BY generatedAlias0.USER, c "
)
.setParameter("param0", "0-2")
.setParameter("param1", "3-4")
.setParameter("param2", "5+")
.getResultList();
I have one birthdate field as date type if want to group all uses data by age range i have got follwing sql code to perform solution but i need criteria api query corresponding to this. How to write criteria query for the following code??
WITH AgeData as
(
SELECT [Username],
[Birthdate],
DATEDIFF(YEAR, [Birthdate], GETDATE()) AS [AGE]
FROM #table
),
GroupAge AS
(
SELECT [Username],
[Birthdate],
[Age],
CASE
WHEN AGE < 30 THEN 'Under 30'
WHEN AGE BETWEEN 31 AND 40 THEN '31 - 40'
WHEN AGE BETWEEN 41 AND 50 THEN '41 - 50'
WHEN AGE > 50 THEN 'Over 50'
ELSE 'Invalid Birthdate'
END AS [Age Groups]
FROM AgeData
)
SELECT COUNT(*) AS [AgeGrpCount],
[Age Groups]
FROM GroupAge
GROUP BY [Age Groups];
I'm using SQL Case in my select and in group by clause and I'm working in JAVA. Whenever I execute my java program it says:
Column 'dbo.JOHN_Dashboard.Log_Date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
My Query is:
SELECT EP.Site_Code AS [Site_Code], DB.[Site] AS [Site], DB.[Utility] AS [Utility],
CASE ? WHEN 'Raw' THEN dateadd(mi,datediff(mi,0,DB.[log_date]),0)
WHEN 'Hour' THEN dateadd(hh,datediff(hh,0,DB.[log_date]),0)
WHEN 'Day' THEN dateadd(dd,datediff(dd,0,DB.[log_date]),0)
WHEN 'Week' THEN dateadd(wk,datediff(wk,0,DB.[log_date]),0)
WHEN 'Month' THEN dateadd(mm,datediff(mm,0,DB.[log_date]),0)
WHEN 'Year' THEN dateadd(yy,datediff(yy,0,DB.[log_date]),0)
ELSE DB.[log_date]
END AS [log_date],
SUM(CASE WHEN DB.[value] >= 0 THEN DB.[value] ELSE 0 END) AS [value],
SUM(CASE WHEN DB.[Cost] >=0 THEN DB.[cost] ELSE 0 END) AS [Cost],
SUM(CASE WHEN DB.[CO2] >=0 THEN DB.[CO2] ELSE 0 END) AS [CO],
MT.[Meter_type_name] AS [Meter Type],
MN.[Meter_Name] AS [Meter Name],
U.[Unit_Name] AS [Units],
EP.EnergyPoint_ID AS [Meter_ID],
EP.Parent_ID AS [Parent],
EP.Meter_Description AS [Meter_Description]
FROM [dbo].[JOHN_Dashboard] DB
INNER JOIN [dbo].[EnergyPoints] EP ON DB.[EnergyPoint_ID] = EP.[EnergyPoint_ID]
INNER JOIN [dbo].[Meter_Types] MT ON MT.[Meter_Type_ID] = EP.[Meter_Type_ID]
INNER JOIN [dbo].[Meter_Names] MN ON MN.[Meter_Name_ID] = EP.[Meter_Name_ID]
INNER JOIN [dbo].[Units] U ON U.[Unit_ID] = EP.[Unit_id]
WHERE [log_date] >= ? AND [Log_Date] < DATEADD(DAY, 1, ?)
AND ( ? IS NULL OR EP.Energypoint_ID = ?)
GROUP BY EP.Site_Code, DB.[Site], DB.[Utility], MT.[Meter_type_name],
MN.[Meter_Name], U.[Unit_Name], EP.[EnergyPoint_ID],
EP.[Parent_ID], EP.[Meter_Description],
CASE ? WHEN 'Raw' THEN dateadd(mi,datediff(mi,0,DB.[log_date]),0)
WHEN 'Hour' THEN dateadd(hh,datediff(hh,0,DB.[log_date]),0)
WHEN 'Day' THEN dateadd(dd,datediff(dd,0,DB.[log_date]),0)
WHEN 'Week' THEN dateadd(wk,datediff(wk,0,DB.[log_date]),0)
WHEN 'Month' THEN dateadd(mm,datediff(mm,0,DB.[log_date]),0)
WHEN 'Year' THEN dateadd(yy,datediff(yy,0,DB.[log_date]),0)
ELSE DB.[log_date] END ;
The parameters i'm passing are:
'Week'
'2016-05-16'
'2016-05-22'
6044
6044
'Week'
Note: This query runs without error in SQL Management Studio.
As requested here is a reworked version of your code using a sub-query before grouping. Since I don't have your database I can't guarantee that I have everything exactly right but give this a try.
I recommend always using a sub-query when your group by has complicated logic that will be repeated in the select. Some people would probably drop the second criteria and just say whenever the group by has complicated logic.
SELECT sub.Site_Code, sub.[Site], sub.[Utility], sub.[Meter Type],
sub.[log_date],
SUM(sub.[value]) as [value],
SUM(sub.[Cost]) as [cost],
SUM(sub.[CO]) as [CO],
sub.[Meter Name], sub.[Units], sub.[Meter_ID],
sub.[Parent], sub.[Meter_Description]
FROM (
SELECT EP.Site_Code AS [Site_Code], DB.[Site] AS [Site], DB.[Utility] AS [Utility],
CASE ? WHEN 'Raw' THEN dateadd(mi,datediff(mi,0,DB.[log_date]),0)
WHEN 'Hour' THEN dateadd(hh,datediff(hh,0,DB.[log_date]),0)
WHEN 'Day' THEN dateadd(dd,datediff(dd,0,DB.[log_date]),0)
WHEN 'Week' THEN dateadd(wk,datediff(wk,0,DB.[log_date]),0)
WHEN 'Month' THEN dateadd(mm,datediff(mm,0,DB.[log_date]),0)
WHEN 'Year' THEN dateadd(yy,datediff(yy,0,DB.[log_date]),0)
ELSE DB.[log_date]
END AS [log_date],
CASE WHEN DB.[value] >= 0 THEN DB.[value] ELSE 0 END AS [value],
CASE WHEN DB.[Cost] >=0 THEN DB.[cost] ELSE 0 END AS [Cost],
CASE WHEN DB.[CO2] >=0 THEN DB.[CO2] ELSE 0 END AS [CO],
MT.[Meter_type_name] AS [Meter Type],
MN.[Meter_Name] AS [Meter Name],
U.[Unit_Name] AS [Units],
EP.EnergyPoint_ID AS [Meter_ID],
EP.Parent_ID AS [Parent],
EP.Meter_Description AS [Meter_Description]
FROM [dbo].[JOHN_Dashboard] DB
INNER JOIN [dbo].[EnergyPoints] EP ON DB.[EnergyPoint_ID] = EP.[EnergyPoint_ID]
INNER JOIN [dbo].[Meter_Types] MT ON MT.[Meter_Type_ID] = EP.[Meter_Type_ID]
INNER JOIN [dbo].[Meter_Names] MN ON MN.[Meter_Name_ID] = EP.[Meter_Name_ID]
INNER JOIN [dbo].[Units] U ON U.[Unit_ID] = EP.[Unit_id]
WHERE [log_date] >= ? AND [Log_Date] < DATEADD(DAY, 1, ?)
AND ( ? IS NULL OR EP.Energypoint_ID = ?)
) sub
GROUP BY sub.Site_Code, sub.[Site], sub.[Utility], sub.[Meter Type],
sub.[Meter Name], sub.[Units], sub.[Meter_ID],
sub.[Parent], sub.[Meter_Description], sub.[log_date];