Cassandra timestamp is not accurate? - java

I am using the cassandra-maven-plugin with the Maven Failsafe plugin to run TestNG integration tests for a class that handles Cassandra-related operations.
While trying to find out why my queries against a column of type timestamp always fail, I noticed something weird. At first, the dates I was using as parameters and the dates retrieved from Cassandra looked the same; I was formatting them as Strings to compare them.
But when I compared them in milliseconds, I noticed that the Cassandra date is ~1000 milliseconds ahead of the Date I use in the query.
Then I used milliseconds directly to set the row's timestamp value, yet Cassandra again returned a date with some extra milliseconds.
The question is: is this a known bug?
For now I am going to use a String or Long representation of the date to work around this, but I want to know what is going on.
Thanks!
Edit: I am running the tests on Windows 8.1.
Here's my (somewhat sanitized) data set file that is loaded into the embedded Cassandra:
CREATE KEYSPACE IF NOT EXISTS analytics WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE IF NOT EXISTS analytics.events (id text, name text, start_time timestamp, end_time timestamp, parameters map<text,text>, PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS analytics.errors (id text, message text, time timestamp, parameters map<text,text>, PRIMARY KEY (id, time));
CREATE TABLE IF NOT EXISTS analytics.reports (name text, time timestamp, period int, data blob, PRIMARY KEY (name, period));
CREATE TABLE IF NOT EXISTS analytics.user_preferences (company_id text, user_principal text, opted_in boolean, PRIMARY KEY (company_id, user_principal));
CREATE INDEX IF NOT EXISTS event_name ON analytics.events (name);
CREATE INDEX IF NOT EXISTS report_time ON analytics.reports (time);
INSERT INTO analytics.user_preferences (company_id, user_principal, opted_in) VALUES ('company1', 'user1', true);
INSERT INTO analytics.user_preferences (company_id, user_principal, opted_in) VALUES ('company3', 'user3', false);
INSERT INTO analytics.reports (name, time, period, data) VALUES ('com.example.SomeReport1', null, 3, null);
INSERT INTO analytics.reports (name, time, period, data) VALUES ('com.example.SomeReport2', 1296691200000, 1, null);
Here's my query:
SELECT * FROM analytics.reports WHERE name = ? AND period = ? AND time = ? LIMIT 1;
When I get the time field with row.getDate("time") it has those extra milliseconds.
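For reference, this is roughly how I am comparing the two values (a minimal sketch assuming the DataStax 2.x Java driver, where Row.getDate() returns a java.util.Date; the time predicate is dropped here so the row actually comes back):
import java.util.Date;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Minimal sketch of the comparison; assumes the DataStax 2.x Java driver,
// where Row.getDate() returns a java.util.Date for a timestamp column.
public class TimestampCheck {
    static void compare(Session session) {
        Date expected = new Date(1296691200000L); // same value the data set inserts
        Row row = session.execute(
                "SELECT * FROM analytics.reports WHERE name = 'com.example.SomeReport2' AND period = 1 LIMIT 1")
            .one();
        Date actual = row.getDate("time"); // value Cassandra hands back

        // Formatted as Strings they look identical, but the raw millis differ:
        System.out.println(expected.getTime());      // 1296691200000
        System.out.println(actual.getTime());        // ~1000 ms ahead in my runs
        System.out.println(expected.equals(actual)); // false, which is why the equality query finds nothing
    }
}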

Related

Best way to check database and execute operations every minute?

I have an eCommerce app with an Item entity. Whenever an item's end date-time is reached, the item's status should change (I also need to execute other SQL operations, such as inserting a row into a table).
Basically, I want to execute an SQL operation that checks the database and changes entities every minute.
I have a few ideas on how to implement this:
Schedule a job on my Linux server that checks the db every minute
Use sp_executesql (Transact-SQL) or DBMS Scheduler
Have a thread running in my Java backend to check the db and execute operations (see the sketch below)
I am very new to this, so I don't have any idea how to implement this. What is the most efficient implementation that takes scalability and performance into account?
Other information: database is SQL Server, server is Linux, backend is Java Spring Boot.
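For the third option, something like this is what I am picturing (a rough sketch assuming Spring's @Scheduled with @EnableScheduling on a configuration class; ItemRepository and its method are placeholder names):
import java.time.LocalDateTime;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Rough sketch of the "thread in the backend" idea; assumes @EnableScheduling
// is present on a configuration class. ItemRepository and closeItemsEndingBefore
// are placeholder names for illustration only.
@Component
public class ItemStatusScheduler {

    private final ItemRepository itemRepository;

    public ItemStatusScheduler(ItemRepository itemRepository) {
        this.itemRepository = itemRepository;
    }

    // Runs at the start of every minute and closes items whose end time has passed.
    @Scheduled(cron = "0 * * * * *")
    public void closeExpiredItems() {
        itemRepository.closeItemsEndingBefore(LocalDateTime.now());
    }
}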
If you need to run a script after an insert or update, you can consolidate all that complex logic (e.g. insert rows in other tables, update the status column, etc.) in a trigger:
Here's a sample table schema:
CREATE TABLE t1 (id INT IDENTITY(1,1), start_time DATETIME, end_time DATETIME,
status VARCHAR(25))
And a sample insert/update trigger for that table:
CREATE TRIGGER u_t1
ON t1
AFTER INSERT,UPDATE
AS
BEGIN
UPDATE t1
SET status = CASE WHEN inserted.end_time = inserted.start_time
THEN 'same' ELSE 'different' END
FROM t1
INNER JOIN inserted ON t1.id = inserted.id
-- do anything else you want!
-- e.g.
-- INSERT INTO t2 (id, status) SELECT id, status FROM inserted
END
GO
Insert a couple test records:
INSERT INTO t1 (start_time, end_time)
VALUES
(GETDATE(), GETDATE() - 1), -- different
(GETDATE(), GETDATE()) -- same
Query the table after the inserts:
SELECT * FROM t1
See that the status is calculated correctly:
id start_time end_time status
1 2018-07-17 02:53:24.577 2018-07-16 02:53:24.577 different
2 2018-07-17 02:53:24.577 2018-07-17 02:53:24.577 same
If your only goal is to update the status column based on other values in the table, then a computed column is the simplest approach; you just supply the formula:
create table t1 (id int identity(1,1), start_time datetime, end_time datetime,
status as
case
when start_time is null then 'start null'
when end_time is null then 'end null'
when start_time < end_time then 'start less'
when end_time < start_time then 'end less'
when start_time = end_time then 'same'
else 'what?'
end
)

SQLite JDBC - How to populate new Database

I was given two databases, and am supposed to create a table in a new database that stores information about the given databases.
So far I created a table in a new database. I also defined its attributes, but I am stuck on how to populate this table.
The attributes are things like 'original_db', 'original_field', etc., but I don't know how to access this information, especially since I would need to connect JDBC to 3 databases (the new one and the 2 old ones) at the same time. Is this even possible?
I am new to working on databases and SQLite, so sorry if this is a stupid problem.
I would be so grateful for any advice!
What I can grasp from your question is that you don't know how to insert data into the table?
You could do this in your schema.sql when creating the db, e.g. the schema would look something like:
DROP TABLE IF EXISTS users;
CREATE TABLE users(id integer primary key, username varchar(30), password varchar(30), email varchar(50), receive_email blob, gender varchar(6));
DROP TABLE IF EXISTS lists;
CREATE TABLE lists(id integer primary key, user_id integer, list_id integer, name varchar(50), creation_date datetime, importance integer, completed blob);
DROP TABLE IF EXISTS list_item;
CREATE TABLE list_item(id integer primary key, list_id integer, name varchar(50), creation_date datetime, completed blob);
then you could make a data.sql file or something with
INSERT INTO users VALUES (NULL, 'name', ...);
and then do in the terminal
sqlite3 database.db < data.sql
or you could start a sqlite interactive session in the same working directory as your db and manually type in the code
e.g. type in
sqlite3 databasename.db
and then type in commands from there
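If you do need to stay in JDBC, opening all three databases at the same time is perfectly possible; here is a rough sketch (assuming the xerial sqlite-jdbc driver is on the classpath; the file names and the "catalog" table layout are placeholders for illustration):
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// Sketch only: old1.db, old2.db, new.db and the "catalog" table are made-up names.
public class CatalogBuilder {
    public static void main(String[] args) throws Exception {
        try (Connection newDb = DriverManager.getConnection("jdbc:sqlite:new.db");
             PreparedStatement insert = newDb.prepareStatement(
                 "INSERT INTO catalog (original_db, original_table, original_field) VALUES (?, ?, ?)")) {

            for (String dbFile : new String[] {"old1.db", "old2.db"}) {
                try (Connection oldDb = DriverManager.getConnection("jdbc:sqlite:" + dbFile)) {
                    DatabaseMetaData meta = oldDb.getMetaData();

                    // Collect the table names of the old database first.
                    List<String> tableNames = new ArrayList<>();
                    try (ResultSet tables = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
                        while (tables.next()) {
                            tableNames.add(tables.getString("TABLE_NAME"));
                        }
                    }

                    // Then record every column of every table in the new database.
                    for (String table : tableNames) {
                        try (ResultSet cols = meta.getColumns(null, null, table, "%")) {
                            while (cols.next()) {
                                insert.setString(1, dbFile);                         // which original db
                                insert.setString(2, table);                          // original table
                                insert.setString(3, cols.getString("COLUMN_NAME"));  // original field
                                insert.executeUpdate();
                            }
                        }
                    }
                }
            }
        }
    }
}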

Not able to Sort Correctly for some reason using EPOCH in SQLite

Here's my Create Table :
CREATE TABLE TEMPSORT_TABLE
(SortRowID INTEGER PRIMARY KEY,
SortTitle TEXT NOT NULL,
SortPrice INTEGER NOT NULL,
SortDateTime datetime default (('2000-01-01')),
SortTrueDateTime INTEGER NOT NULL)
I have created a simple table with a column called "SortTrueDateTime"; it is an INTEGER column and stores a date that was successfully converted to epoch time. Now, for SOME unknown reason, when I do a simple..
SELECT SortTrueDateTime FROM TEMPSORT_TABLE ORDER BY SortTrueDateTime DESC
It isn't actually sorting correctly by the SortTrueDateTime column... the records still come back in a seemingly random order...
Has anyone run into this crazy issue, or can anyone assist me in any way to accomplish my goal?
Example output:
1418878800000
1388638800000
1419224400000
1388638800000
1419224400000

how to get the timestamp taken to insert the data into a table

I am storing images in a table. I would like to get the timestamp at which an image was inserted into the table. Is there any built-in SQL syntax for getting the timestamp of inserted data, or any Java syntax for getting the timestamp of inserted data from the database? Kindly help out.
CURRENT_TIMESTAMP is a function that is available in most database servers, so SQL like this should work:
INSERT INTO <some_table> (<timestamp_column_name>) VALUES (CURRENT_TIMESTAMP)
If you want to do it in Java, then you can do the following:
java.util.Date currentTime = new java.util.Date();
System.out.println(new java.sql.Timestamp(currentTime.getTime())); //gets you the current sql timestamp
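You can then feed that timestamp into your insert through a PreparedStatement, for example (table and column names below are just placeholders):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

public class ImageDao {
    // Sketch only: the "images" table and its columns are placeholder names.
    public void insertImage(Connection conn, byte[] imageBytes) throws SQLException {
        String sql = "INSERT INTO images (data, created_at) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setBytes(1, imageBytes);
            ps.setTimestamp(2, new Timestamp(System.currentTimeMillis())); // time of the insert, taken on the client
            ps.executeUpdate();
        }
    }
}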
It is going to vary by database. In MySQL, for example, you can use sysdate() (Oracle has SYSDATE, without the parentheses):
insert into PRODUCE (color, weight, added_at) values ('red', '8 ounces', sysdate());
If you are using MySQL, you can also add DEFAULT CURRENT_TIMESTAMP to your column definition:
create table PRODUCE (
color varchar(20),
weight varchar(20),
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
insert into PRODUCE (color, weight) values ('red', '8 ounces');
In Oracle, you could use DEFAULT SYSDATE for a similar effect, or populate the column from a trigger.

How to programmatically transfer a lot of data between tables?

I have two tables: the first one has 14 million rows and the second one has 1.5 million rows.
So I wonder how I could transfer this data into other tables to be normalized?
And how do I convert one type to another? For example, I have a field called 'year' but its type is varchar, and I want it to be an integer instead; how do I do that?
I thought about doing this with JDBC in a while loop from Java, but I don't think that would be efficient.
-- 1.5 million rows
CREATE TABLE dbo.directorsmovies
(
movieid INT NULL,
directorid INT NULL,
dname VARCHAR (500) NULL,
addition VARCHAR (1000) NULL
)
-- 14 million rows
CREATE TABLE dbo.movies
(
movieid VARCHAR (20) NULL,
title VARCHAR (400) NULL,
mvyear VARCHAR (100) NULL,
actorid VARCHAR (20) NULL,
actorname VARCHAR (250) NULL,
sex CHAR (1) NULL,
as_character VARCHAR (1500) NULL,
languages VARCHAR (1500) NULL,
genres VARCHAR (100) NULL
)
And these are my new tables:
DROP TABLE actor
CREATE TABLE actor (
id INT PRIMARY KEY IDENTITY,
name VARCHAR(200) NOT NULL,
sex VARCHAR(1) NOT NULL
)
DROP TABLE actor_character
CREATE TABLE actor_character(
id INT PRIMARY KEY IDENTITY,
character VARCHAR(100)
)
DROP TABLE director
CREATE TABLE director(
id INT PRIMARY KEY IDENTITY,
name VARCHAR(200) NOT NULL,
addition VARCHAR(150)
)
DROP TABLE movie
CREATE TABLE movie(
id INT PRIMARY KEY IDENTITY,
title VARCHAR(200) NOT NULL,
year INT
)
DROP TABLE language
CREATE TABLE language(
id INT PRIMARY KEY IDENTITY,
language VARCHAR (100) NOT NULL
)
DROP TABLE genre
CREATE TABLE genre(
id INT PRIMARY KEY IDENTITY,
genre VARCHAR(100) NOT NULL
)
DROP TABLE director_movie
CREATE TABLE director_movie(
idDirector INT,
idMovie INT,
CONSTRAINT fk_director_movie_1 FOREIGN KEY (idDirector) REFERENCES director(id),
CONSTRAINT fk_director_movie_2 FOREIGN KEY (idMovie) REFERENCES movie(id),
CONSTRAINT pk_director_movie PRIMARY KEY(idDirector,idMovie)
)
DROP TABLE genre_movie
CREATE TABLE genre_movie(
idGenre INT,
idMovie INT,
CONSTRAINT fk_genre_movie_1 FOREIGN KEY (idMovie) REFERENCES movie(id),
CONSTRAINT fk_genre_movie_2 FOREIGN KEY (idGenre) REFERENCES genre(id),
CONSTRAINT pk_genre_movie PRIMARY KEY (idMovie, idGenre)
)
DROP TABLE language_movie
CREATE TABLE language_movie(
idLanguage INT,
idMovie INT,
CONSTRAINT fk_language_movie_1 FOREIGN KEY (idLanguage) REFERENCES language(id),
CONSTRAINT fk_language_movie_2 FOREIGN KEY (idMovie) REFERENCES movie(id),
CONSTRAINT pk_language_movie PRIMARY KEY (idLanguage, idMovie)
)
DROP TABLE movie_actor
CREATE TABLE movie_actor(
idMovie INT,
idActor INT,
CONSTRAINT fk_movie_actor_1 FOREIGN KEY (idMovie) REFERENCES movie(id),
CONSTRAINT fk_movie_actor_2 FOREIGN KEY (idActor) REFERENCES actor(id),
CONSTRAINT pk_movie_actor PRIMARY KEY (idMovie,idActor)
)
UPDATE:
I'm using SQL Server 2008.
Sorry guys, I forgot to mention that they are different databases:
The non-normalized one is called disciplinedb and my normalized one is called imdb.
Best regards,
Valter Henrique.
If both tables are in the same database, then the most efficient transfer is to do it all within the database, preferably by sending a SQL statement to be executed there.
Any movement of data from the database server to somewhere else and then back to the database server should be avoided unless there is a reason it can only be transformed off-server. If the destination is a different server, then this is much less of an issue.
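For example, from Java you can still keep all of the work on the server by sending one INSERT ... SELECT per target table; here is a rough sketch (assuming both databases live on the same SQL Server instance, the Microsoft JDBC driver is used, and every mvyear value is actually numeric):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch only: the connection URL is a placeholder, and CAST will fail
// if any mvyear value is not numeric, so clean those rows up first.
public class MovieLoader {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://localhost;databaseName=imdb;user=sa;password=secret";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement()) {
            int rows = stmt.executeUpdate(
                "INSERT INTO imdb.dbo.movie (title, year) " +
                "SELECT DISTINCT title, CAST(mvyear AS INT) " +
                "FROM disciplinedb.dbo.movies");
            System.out.println(rows + " movies copied"); // the data never leaves the server
        }
    }
}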
Though my tables were dwarfs compared to yours, I got over this kind of problem once with stored procedures. For MySQL, below is a simplified (and untested) essence of my script, but something similar should work with all major SQL databases.
First you should just add a new integer year column (int_year in example) and then iterate over all rows using the procedure below:
DROP PROCEDURE IF EXISTS move_data;
CREATE PROCEDURE move_data()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE orig_id INT DEFAULT 0;
  DECLARE orig_year VARCHAR(100) DEFAULT '';
  DECLARE cur1 CURSOR FOR SELECT id, year FROM table1;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN cur1;
  PREPARE stmt FROM "UPDATE table1 SET int_year = ? WHERE id = ?";
  read_loop: LOOP
    FETCH cur1 INTO orig_id, orig_year;
    IF done THEN
      LEAVE read_loop;
    END IF;
    SET @year = orig_year;
    SET @id = orig_id;
    EXECUTE stmt USING @year, @id;
  END LOOP;
  CLOSE cur1;
END;
And to start the procedure, just CALL move_data().
The above SQL has two major ideas to speed it up:
Use CURSORS to iterate over a large table
Use PREPARED statement to quickly execute pre-known commands
PS: in my case this sped things up from ages to seconds, though in your case it may still take a considerable amount of time, so it would probably be best to execute it from the command line rather than from a web interface (e.g. phpMyAdmin).
I just recently did this for ~150 GB of data. I used a pair of MERGE statements for each table. The first MERGE statement said "if it's not in the destination table, copy it there", and the second said "if it's in the destination table, delete it from the source". I put both in a while loop and only did 10000 rows per operation. Keeping it on the server (and not transferring it through a client) is going to be a huge boon for performance. Give it a shot!
