I want to retrieve a record that has a date field whose value is closest to a given date. How should I proceed?
Below is the table:
id | employeeid | region | startdate  | enddate
1  | 1234       | abc    | 2014-11-24 | 2015-01-17
2  | 1234       | xyz    | 2015-01-18 | 9999-12-31
Here, I should retrieve the record whose enddate is closest to the startdate of another record, say '2015-01-18', so it should retrieve the first record. I tried the following queries:
1.
SELECT l.region
FROM ABC.location l where l.EmployeeId=1234
ORDER BY ABS( DATEDIFF('2015-01-18',l.Enddate) );
2.
SELECT l.region
FROM ABC.location l where l.EmployeeId=1234
ORDER BY ABS( DATEDIFF(l.Enddate,'2015-01-18') );
But neither of them works. Kindly help me with this.
Thanks,
Poorna.
You might want to try this:
// Order by the absolute difference in days and keep only the first (closest) row
Query query = session.createQuery(
    "SELECT l.region, ABS(DATEDIFF('2015-01-18', l.Enddate)) AS resultDiff " +
    "FROM ABC.location l WHERE l.EmployeeId = 1234 " +
    "ORDER BY resultDiff");
query.setFirstResult(0);
query.setMaxResults(1); // equivalent to LIMIT 1
List result = query.list();
Well, Unix timestamps are expressed as a number of seconds since 01 Jan 1970, so if you subtract one from the other you get the difference in seconds. The difference in days is then simply a matter of dividing by the number of seconds in a day:
(date_modified - date_submitted) / (24*60*60)
or
(date_modified - date_submitted) / 86400
Then take the minimum of them.
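To illustrate, here is a minimal Java sketch of that arithmetic; the Record class and its fields are hypothetical stand-ins for the table rows:

import java.util.List;

class ClosestRecord {
    // Hypothetical stand-in for one table row
    static class Record {
        String region;
        long endDateEpoch; // enddate as a Unix timestamp, in seconds
    }

    // Subtract the Unix timestamps to get seconds, divide by 86400 to get whole days,
    // and keep the row with the smallest absolute difference.
    static Record closestTo(List<Record> records, long targetEpoch) {
        Record best = null;
        long bestDiffDays = Long.MAX_VALUE;
        for (Record r : records) {
            long diffDays = Math.abs(r.endDateEpoch - targetEpoch) / 86400;
            if (diffDays < bestDiffDays) {
                bestDiffDays = diffDays;
                best = r;
            }
        }
        return best;
    }
}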
Refer to this question; it may be helpful: Selecting the minimum difference between two dates in Oracle when the dates are represented as UNIX timestamps
I am trying to do a windowed aggregation query on a data stream that contains over 40 attributes in Flink. The stream's schema contains an epoch timestamp which I want to use for the WatermarkStrategy so I can actually define tumbling windows over it.
I know from the docs that you can define a timestamp using the SQL API in a CREATE TABLE query by first applying TO_TIMESTAMP_LTZ to the epochs to convert them to a proper timestamp, which can then be used in the following WATERMARK FOR statement. Since I have a really huge schema, though, I want to deserialise and provide the schema NOT by writing out the complete CREATE TABLE statement containing all columns, BUT by using a custom class derived from the proto file that contains the schema. As far as I know, this is only possible by providing a deserializer for the KafkaSourceBuilder and calling the returns function of the stream with the class that protoc derived from the proto file. This means that I have to define the table using the Stream API.
Inspired by the answer to this question, I do it like this:
WatermarkStrategy<Row> watermarkStrategy = WatermarkStrategy
        .<Row>forBoundedOutOfOrderness(Duration.ofSeconds(10))
        .withTimestampAssigner((event, ts) -> (Long) event.getField("ts"));
tableEnv.createTemporaryView(
"bidevents",
stream
.returns(BiddingEvent.BidEvent.class)
.map(e -> Row.of(
e.getTracking().getCampaign().getId(),
e.getTracking().getAuction().getId(),
Timestamp.from(Instant.ofEpochSecond(e.getTimestamp().getMilliseconds() / 1000))
)
)
.returns(Types.ROW_NAMED(new String[] {"campaign_id", "auction_id", "ts"}, Types.STRING, Types.STRING, Types.SQL_TIMESTAMP))
.assignTimestampsAndWatermarks(watermarkStrategy)
);
tableEnv.executeSql("DESCRIBE bidevents").print();
Table resultTable = tableEnv.sqlQuery("" +
"SELECT " +
" TUMBLE_START(ts, INTERVAL '1' DAY) AS window_start, " +
" TUMBLE_END(ts, INTERVAL '1' DAY) AS window_end, " +
" campaign_id, " +
" count(distinct auction_id) auctions " +
"FROM bidevents " +
"GROUP BY TUMBLE(ts, INTERVAL '1' DAY), campaign_id");
DataStream<Row> resultStream = tableEnv.toDataStream(resultTable);
resultStream.print();
env.execute();
I get this error:
Caused by: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Window aggregate can only be defined over a time attribute column, but TIMESTAMP(9) encountered.
at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) ~[flink-dist-1.15.1.jar:1.15.1]
at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) ~[flink-dist-1.15.1.jar:1.15.1]
at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) ~[flink-dist-1.15.1.jar:1.15.1]
at org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap.runApplicationEntryPoint(ApplicationDispatcherBootstrap.java:291) ~[flink-dist-1.15.1.jar:1.15.1]
This seems kind of logical, since in line 3 of the watermark strategy I cast a java.sql.Timestamp to a Long, which it is not (although the stack trace does not indicate that an error occurred during the cast). But when I do not convert the epoch (a Long) to a Timestamp in the map statement, I get this exception:
"Cannot apply '$TUMBLE' to arguments of type '$TUMBLE(<BIGINT>, <INTERVAL DAY>)'"
How can I assign the watermark AFTER the map-statement and use the column in the later SQL Query to create a tumbling window?
===== UPDATE =====
Thanks to a comment from David, I understand that I need the column to be of type TIMESTAMP(p) with precision p <= 3. To my understanding this means that my timestamp may not be more precise than full milliseconds. So I tried different ways to create Java timestamps (java.sql.Timestamp and java.time.LocalDateTime) that correspond to the Flink timestamps.
Some examples are:
1 Trying to convert the epochs into a LocalDateTime by setting the nanoseconds (the 2nd parameter of ofEpochSecond) to 0:
LocalDateTime.ofEpochSecond(e.getTimestamp().getMilliseconds() / 1000, 0, ZoneOffset.UTC )
2 After reading the answer from Svend in this question, who uses LocalDateTime.parse on timestamps that look like "2021-11-16T08:19:30.123", I tried this:
LocalDateTime.parse(
DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss").format(
LocalDateTime.ofInstant(
Instant.ofEpochSecond(e.getTimestamp().getMilliseconds() / 1000),
ZoneId.systemDefault()
)
)
)
As you can see, these timestamps even have only seconds granularity (which I checked by looking at the printed output of the stream I created), which I assume should mean they have a precision of 0. But when I use this stream to define a table/view, it once again has the type TIMESTAMP(9).
3 I also tried it with java.sql.Timestamp:
new Timestamp(e.getTimestamp().getMilliseconds() )
This also did not change anything. I somehow always end up with a precision of 9.
Can somebody please help me fix this?
OK, I found the solution to the problem. If you have a stream containing a timestamp that you want to declare as the event-time column for watermarks, you can use this:
Table inputTable = tableEnv.fromDataStream(
stream,
Schema.newBuilder()
.column("campaign_id", "STRING")
.column("auction_id", "STRING")
.column("ts", "TIMESTAMP(3)")
.watermark("ts", "SOURCE_WATERMARK()")
.build()
);
The important part is that you can "cast" the timestamp ts from TIMESTAMP(9) "down" to TIMESTAMP(3) (or any other precision of 3 or lower), and you can declare that the column carries the watermark.
Another point that seems important to me: only timestamps of type java.time.LocalDateTime actually worked for later use as watermarks for tumbling windows. All other attempts to influence the precision of the timestamps by constructing java.sql.Timestamp or java.time.LocalDateTime differently failed; this seemed to be the only viable way.
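For reference, here is roughly how the conversion inside the map function can look; this is a sketch, and using UTC is an assumption (use the zone your epochs were recorded in):

import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Inside the map function: convert epoch milliseconds to java.time.LocalDateTime
LocalDateTime ts = LocalDateTime.ofInstant(
        Instant.ofEpochMilli(e.getTimestamp().getMilliseconds()),
        ZoneOffset.UTC);

Paired with Types.LOCAL_DATE_TIME instead of Types.SQL_TIMESTAMP in the ROW_NAMED type information, this should yield a column that the schema above can declare as TIMESTAMP(3).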
I am working on an android app. It has a database that stores some events (say, when the user presses a button, that event is stored in the database) along with a timestamp. I want to know whether the user has pressed that button at least once a day for 5 consecutive days.
I came across this Stack Overflow answer that tells me to use SQLite recursion. I tried to use the recursion to 'loop for 5 days', but I am getting an error. I don't understand what I am doing wrong. Help.
Here is my code:
WITH RECURSIVE
recursiveDailyEvents (startTimeMillis, endTimeMillis, eventCount) AS
(
1612244382565, 1612330782565, (select unique count from event_tracking_table where event_tracking_table_event_id = 'post_image' and 1612244382565 <= event_tracking_table_timestamp and event_tracking_table_timestamp <= 1612330782565 )
UNION ALL startTimeMillis + 86400000, endTimeMillis + 86400000 FROM recursiveDailyEvents
Limit 5)
select * from recursiveDailyEvents;
);
This is the error from sqlite browser:
near "1612244382565": syntax error: WITH RECURSIVE
recursiveDailyEvents (startTimeMillis, endTimeMillis, eventCount) AS
(
1612244382565
But I was expecting a table with startTimeMillis, endTimeMillis, and a count (1 or 0).
What am I doing wrong? Or how should I write this recursion?
Edit
Here is some sample data
event_tracking_table_row_id| event_tracking_table_event_id| event_tracking_table_timestamp
1|app_open|1612169104224
2|post_image|1612169437373
3|post_image|1612169738068
4|app_open|1612170216320
5|post_image|1612170507935
6|app_open|1612689116738
7|post_image|1612689316673
8|post_video|1612689579697
9|post_video|1612689609683
10|app_open|1612689664683
... ... ...
Here, event_tracking_table_timestamp is in milliseconds.
Expected output
If I understand correctly, the recursion should generate a table of start time (milliseconds), end time (milliseconds), and the eventCount between those limits.
So today (2 February 2021) at some time, the epoch time was 1612244382565; after 24 hours, the end time is 1612330782565, and so on.
1612244382565 , 1612330782565, 1 // 1st day start time, end time, event count
1612330782565 , 1612417182565, 0 // 2nd day start time, end time, event count
... ... // 5 rows for 5 consecutive days.
I am trying my best to be as clear as possible.
If you want the starting and ending time of each day and whether the button was clicked, you can do it with conditional aggregation:
SELECT date(event_tracking_table_timestamp / 1000, 'unixepoch', 'localtime') day,
       MIN(event_tracking_table_timestamp) min_time,
       MAX(event_tracking_table_timestamp) max_time,
       MAX(event_tracking_table_event_id = 'post_image') event
FROM event_tracking_table
GROUP BY day
If you want the number of times the button was clicked for each day:
SELECT date(event_tracking_table_timestamp / 1000, 'unixepoch', 'localtime') day,
MIN(event_tracking_table_timestamp) min_time,
MAX(event_tracking_table_timestamp) max_time,
SUM(event_tracking_table_event_id = 'post_image') event_count
FROM event_tracking_table
GROUP BY day
If you want the rows for the last 5 days, add a WHERE clause before GROUP BY day:
WHERE date(event_tracking_table_timestamp / 1000, 'unixepoch', 'localtime') >=
date('now', '-5 days', 'localtime')
See the demo.
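For completeness, the recursive CTE from the question can also be made syntactically valid: each branch of the CTE must be a full SELECT, and the per-day count can be computed with a correlated subquery. A sketch, wrapped in an Android rawQuery call (the SQLiteDatabase handle db is assumed to exist):

import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

// LIMIT 5 inside the recursive SELECT stops the recursion after five daily windows.
String sql =
        "WITH RECURSIVE recursiveDailyEvents(startTimeMillis, endTimeMillis) AS ( " +
        "  SELECT 1612244382565, 1612330782565 " +
        "  UNION ALL " +
        "  SELECT startTimeMillis + 86400000, endTimeMillis + 86400000 " +
        "  FROM recursiveDailyEvents LIMIT 5 " +
        ") " +
        "SELECT startTimeMillis, endTimeMillis, " +
        "       (SELECT COUNT(*) FROM event_tracking_table " +
        "         WHERE event_tracking_table_event_id = 'post_image' " +
        "           AND event_tracking_table_timestamp " +
        "               BETWEEN startTimeMillis AND endTimeMillis) AS eventCount " +
        "FROM recursiveDailyEvents";
Cursor cursor = db.rawQuery(sql, null);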
I have a database with 300,000 rows, and I need to filter some rows with an algorithm.
protected boolean validateMatch(DbMatch m) throws MatchException, NotSupportedSportException{
// expensive part
List<DbMatch> hh = sd.getMatches(DateService.beforeDay(m.getStart()), m.getHt(), m.getCountry(),m.getSportID());
List<DbMatch> ah = sd.getMatches(DateService.beforeDay(m.getStart()), m.getAt(), m.getCountry(),m.getSportID());
....
My Hibernate DAO function that loads the data from MySQL is called twice per match (once for the home team, once for the away team).
public List<DbMatch> getMatches(Date before,String team, String country,int sportID) throws NotSupportedSportException{
//Match_soccer where date between :start and :end
Criteria criteria = session.createCriteria(DbMatch.class);
criteria.add(Restrictions.le("start",before));
criteria.add(Restrictions.disjunction()
.add(Restrictions.eq("ht", team))
.add(Restrictions.eq("at", team)));
criteria.add(Restrictions.eq("country",country));
criteria.add(Restrictions.eq("sportID",sportID));
criteria.addOrder(Order.desc("start") );
return criteria.list();
}
Here is an example of how I try to filter the data:
List<DbMatch> filter(List<DbMatch> mSet) throws MatchException, NotSupportedSportException {
    List<DbMatch> filtred = new ArrayList<>();
    for (DbMatch m : mSet) {
        if (validateMatch(m)) filtred.add(m);
    }
    return filtred;
}
(1) I tried different criteria settings and timed the function with a stopwatch. Running filter(matches) over 1,000 matches, my program takes 3 min 21 s 659 ms.
(2) When I remove criteria.addOrder(Order.desc("start"));, it filters in 3 min 12 s 811 ms.
(3) But if I remove criteria.addOrder(Order.desc("start")); and add criteria.setMaxResults(1);, the result is 22 s 311 ms.
Using the last configuration I can filter all 300,000 records in about 22.3 × 300 ≈ 6,700 s (~1.9 h), whereas with the first version I would have to wait roughly 200 × 300 ≈ 60,000 s (~17 h).
If I want to use the criteria without the ordering and with the limit, I must be sure that my table is sorted by date in the database, because it is important to get the latest match.
All data is stored in the matches table.
Table indexes:
Table   | Non_unique | Key_name                     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type
matches | 0          | PRIMARY                      | 1            | mid         | A         | 220712      |          |        |      | BTREE
matches | 0          | UK_kcenwf4m58fssuccpknl1v25v | 1            | beid        | A         | 220712      |          |        | YES  | BTREE
UPDATED
After adding ALTER TABLE matches ADD INDEX (sportID, country);, the program time decreased to 15 s per 1,000 matches. If I additionally drop the ORDER BY and add the limit, I need to wait only 4 s per 1,000 matches.
What should I do in this situation to improve execution speed?
Your first order of business is to figure out how long each component takes to process the request.
Find out the SQL query generated by the ORM, run it manually in MySQL Workbench, and see how long it takes (non-cached). You can also ask it to EXPLAIN the index usage.
If the query is fast enough, then it's your Java code that's taking longer, and you need to optimize your algorithm. You can use JConsole to dig further into that.
Once you identify which component is taking longer, you can post your analysis here and we can make suggestions accordingly.
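As a concrete starting point, here is a minimal sketch using Hibernate's standard logging settings (where you set these depends on how your SessionFactory is configured):

import org.hibernate.cfg.Configuration;

// Make Hibernate print the SQL it generates so you can copy it into
// MySQL Workbench and EXPLAIN it.
Configuration cfg = new Configuration();
cfg.setProperty("hibernate.show_sql", "true");   // print each generated statement
cfg.setProperty("hibernate.format_sql", "true"); // pretty-print for readability
// Then, in MySQL Workbench:
//   EXPLAIN SELECT ... ;   -- shows which indexes the query can actually use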
I need to count the number of days between 2 dates in JPA.
For example :
CriteriaBuilder.construct(
    MyCustomBean.class,
    myBean.get(MyBean_.beginDate), // Expression<Date>
    myBean.get(MyBean_.endDate),   // Expression<Date>
    myDiffExpr(myBean)             // How to write this expression from the 2 Expression<Date>?
);
So far, I tried:
CriteriaBuilder.diff(), but it does not compile because this method expects some N extends Number, and Date does not extend Number.
I tried to extend the PostgreSQL82Dialect (as my target database is PostgreSQL):
public class MyDialect extends PostgreSQL82Dialect {
    public MyDialect() {
        super();
        registerFunction("datediff",
            // In PostgreSQL, date2 - date1 returns the number of days between them.
            new SQLFunctionTemplate(StandardBasicTypes.LONG, " (?2 - ?1) "));
    }
}
This compiles and the request succeeds but the returned result is not consistent (78 days between today and tomorrow).
How would you do this?
It looks like you are looking for a solution with JPQL to perform queries like SELECT p FROM Period p WHERE datediff(p.to, p.from) > 10.
I'm afraid there is no such functionality in JPQL, so I recommend using native SQL. Your idea of extending the Dialect with Hibernate's SQLFunctionTemplate was very clever. I'd rather change it to use DATE_PART('day', end - start), as this is the way to compute the day difference between dates in PostgreSQL.
You might also define your function in PostgreSQL and use it with the criteria function():
CREATE OR REPLACE FUNCTION datediff(TIMESTAMP, TIMESTAMP) RETURNS integer
    AS 'SELECT DATE_PART(''day'', $1 - $2)::integer;' LANGUAGE sql;
cb.function("datediff", Integer.class, end, start);
JPA 2.1 provides for use of "FUNCTION(funcName, args)" in JPQL statements. That allows such handling.
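For instance, assuming the datediff function from the answer above exists in the database, a JPQL query might look like this (the EntityManager em and the field names are illustrative):

import java.util.List;
import javax.persistence.EntityManager;

// JPA 2.1's FUNCTION() escape calls the database-side datediff directly.
List<MyBean> longPeriods = em.createQuery(
        "SELECT b FROM MyBean b " +
        "WHERE FUNCTION('datediff', b.endDate, b.beginDate) > 10",
        MyBean.class)
    .getResultList();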
I finally found that the problem comes from the fact that the order of the parameters is not the one I expected:
/*
 * (?2 - ?1) is actually equivalent to (? - ?).
 * Hence, when I expect it to evaluate (date2 - date1),
 * it is actually evaluated as (date1 - date2).
 */
new SQLFunctionTemplate(StandardBasicTypes.LONG, " (?2 - ?1) "));
I opened a new question to find out whether this behavior is a bug or a feature.
1) CriteriaBuilder.diff(), but it does not compile because this method expects some N extends Number and Date does not extend Number.
Try using the number of milliseconds for each date, as shown below.
Date date = new Date(); // use your required date
long millisecond = date.getTime(); // returns the number of milliseconds since 1 Jan 1970 GMT
Long is a Number in Java, and thanks to autoboxing you can use it here. Maybe this can help.
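A hedged sketch of that idea with the Criteria API, assuming the entity additionally maps the dates as millisecond Long columns (the beginMillis/endMillis attributes are hypothetical):

import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Expression;
import javax.persistence.criteria.Root;

// Once both sides are numeric (milliseconds), CriteriaBuilder.diff() compiles.
CriteriaBuilder cb = em.getCriteriaBuilder();
CriteriaQuery<Long> q = cb.createQuery(Long.class);
Root<MyBean> root = q.from(MyBean.class);
Expression<Long> diffMillis =
        cb.diff(root.<Long>get("endMillis"), root.<Long>get("beginMillis"));
q.select(cb.quot(diffMillis, 86_400_000L).as(Long.class)); // 86,400,000 ms per day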
I have a table "Holidays" in my database, that contains a range of days when it's the holidays, defined with two column: start & end.
Holidays (id, name, start, end)
Now, given two input dates (from & to), I'd like to list all the dates in that range that are not holidays.
Suppose the holidays are from 2012/06/05 to 2012/06/20, and I request:
from=2012/06/01, to=2012/06/10 ; The result would be 01, 02, 03, 04
from=2012/06/01, to=2012/06/22 ; The result would be 01, 02, 03, 04, 21, 22
from=2012/06/15, to=2012/06/22 ; The result would be 21, 22
But I can't figure out how to get this list of "open" days without hitting the database for every day requested in the range from->to.
How could I do that?
There are many solutions; it pretty much depends on how many entries you have in the database and how many requests you make. If you are making a lot of requests, you can do something like this:
-> create a boolean array that will determine whether a day is a holiday;
   the first element points to some predefined date (e.g. 1.1.2012),
   the second element to 2.1.2012, etc.
-> initialize the array to 0
-> for each holiday:
   -> loop from the holiday start date to the holiday end date, one day per pass
      -> convert the current date to an index (number of days since the start date, 1.1.2012)
      -> set array[index] to 1
Now you should have a simple array containing 0 for each non-holiday day and 1 for each holiday day.
For each query (request) you now do:
-> loop from the request start date to the request end date
   -> convert the current date to an index (number of days since 1.1.2012)
   -> check whether array[index] is 0 or 1
But keep in mind that this solution only pays off when there are many requests. If you have to rebuild the array for every request, the approach does not make sense and it is better to write an SQL query. A Java sketch of the array approach follows.
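Here is that sketch; the Holiday holder, the base date (2012-01-01), and the array span (~5 years) are assumptions for the example:

import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

// Hypothetical holder for one holiday range
class Holiday {
    LocalDate start, end;
    Holiday(LocalDate start, LocalDate end) { this.start = start; this.end = end; }
}

class HolidayCalendar {
    private static final LocalDate BASE = LocalDate.of(2012, 1, 1);
    private final boolean[] isHoliday = new boolean[5 * 366];

    HolidayCalendar(List<Holiday> holidays) {
        for (Holiday h : holidays) {
            // mark every day of the holiday range
            for (LocalDate d = h.start; !d.isAfter(h.end); d = d.plusDays(1)) {
                isHoliday[(int) ChronoUnit.DAYS.between(BASE, d)] = true;
            }
        }
    }

    // true when the given day falls outside every holiday range
    boolean isOpen(LocalDate day) {
        return !isHoliday[(int) ChronoUnit.DAYS.between(BASE, day)];
    }
}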
Here's how I finally did it; it seems to work:
SELECT start, end FROM holidays WHERE
(start > :START AND end < :END) OR
(start < :START AND end > :END) OR
(start BETWEEN :START AND :END) OR
(end BETWEEN :START AND :END);
This returns only the rows where my :START/:END dates touch at least one holiday. It covers these possibilities:
start is before the beginning of a holiday, and end is before the end of a holiday (before, in)
start is before the beginning of a holiday, and end is after the end of a holiday (before, after)
start is after the beginning of a holiday, and end is before the end of a holiday (in, in)
start is after the beginning of a holiday, and end is after the end of a holiday (in, after)
I think I cover all the possibilities with that.
Then I loop over the result and, for each row, build an array of the dates from start to end. Finally, I loop over my initial date range; if one of those dates is in the array, I remove it.
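In code, that post-processing step might look roughly like this (a sketch, assuming from/to as LocalDates and overlapping holding the rows returned by the query, with Holiday as a hypothetical row holder):

import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

List<LocalDate> open = new ArrayList<>();
for (LocalDate d = from; !d.isAfter(to); d = d.plusDays(1)) {
    open.add(d); // start with every date in the requested range
}
for (Holiday h : overlapping) {
    // drop the dates covered by this holiday row
    open.removeIf(d -> !d.isBefore(h.start) && !d.isAfter(h.end));
}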
Here's a solution that gives you the answer in a single (albeit slightly convoluted) SQL statement (this is Oracle):
with all_days as (
select :start_date + (level - 1) dt
from dual
connect by :start_date + (level - 1) <= :end_date
)
select a.dt
from all_days a
where not exists (
select 1
from holidays h
where h.start_dt <= a.dt and h.end_dt >= a.dt
)
order by a.dt
For example, assuming the following holiday table:
NAME START_DT END_DT
-------------- ------------------------- -------------------------
Test Holiday 1 07-JUN-12 13-JUN-12
Test Holiday 2 17-JUN-12 18-JUN-12
And using 5th June as :start_date and 20th June as :end_date, you'd get the following output:
DT
-------------------------
05-JUN-12
06-JUN-12
14-JUN-12
15-JUN-12
16-JUN-12
19-JUN-12
20-JUN-12
(which provides the dates in the requested range minus any dates covered by a range in the holiday table).
Hope that helps.