PostgreSQL thread safety for temporary tables

This is the syntax I use for creating a temporary table:
create temp table tmpTable (id bigint not null, primary key (id)) on commit drop;
I know this means that at the end of each transaction, this table will be dropped.
My question is, if two or more threads on the same session create and insert values into a temporary table, will they each get their own instance or is the temporary instance shared across the session? If it's shared, is there a way to make it local per thread?
Thanks
Netta

Temporary tables are visible to all operations in the same session. So you cannot create a temporary table of the same name in the same session before you drop the one that exists (commit the transaction in your case).
You may want to use:
CREATE TEMP TABLE IF NOT EXISTS tmptbl ...
More about CREATE TABLE in the manual.
Unique temp tables
To make the temp table local per "thread" (in the same session) you need to use unique table names. One way would be to use an unbound SEQUENCE and dynamic SQL, either in a procedural language like plpgsql or in a DO statement (which is basically the same without storing a function).
Run once:
CREATE SEQUENCE myseq;
Use:
DO $$
BEGIN
EXECUTE 'CREATE TEMP TABLE tmp' || nextval('myseq') || ' (id int)';
END;
$$;
To know the latest table name:
SELECT 'tmp' || currval('myseq');
Or put it all into a plpgsql function and return the table or reuse the table name.
All further SQL commands have to be executed dynamically, though, as plain SQL statements operate with hard-coded identifiers. So it is probably best to put it all into a plpgsql function.
Unique ID to use same temp table
Another possible solution could be to use the same temp table for all threads in the same session and add a column thread_id to the table. Be sure to index the column, if you make heavy use of the feature. Then use a unique thread_id per thread (in the same session).
Once only:
CREATE SEQUENCE myseq;
Once per thread:
CREATE TEMP TABLE IF NOT EXISTS tmptbl(thread_id int, col1 int);
my_id := nextval('myseq'); -- in plpgsql
-- else find another way to assign unique id per thread
SQL:
INSERT INTO tmptbl(thread_id, col1) VALUES
(my_id, 2), (my_id, 3), (my_id, 4);
SELECT * FROM tmptbl WHERE thread_id = my_id;
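The shared-table idea can be tried end to end without a PostgreSQL server. The following is only a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (SQLite temp tables are also private to one connection, but this is not PostgreSQL). The counter next_thread_id plays the role of the sequence, and the helper names insert_values and rows_for are hypothetical.

```python
import itertools
import sqlite3

# One connection stands in for one database session shared by several workers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TEMP TABLE IF NOT EXISTS tmptbl (thread_id INTEGER, col1 INTEGER)"
)

# Stand-in for the sequence: hands out a unique id per logical thread.
next_thread_id = itertools.count(1)


def insert_values(thread_id, values):
    """Store values tagged with the id of the worker that owns them."""
    conn.executemany(
        "INSERT INTO tmptbl (thread_id, col1) VALUES (?, ?)",
        [(thread_id, v) for v in values],
    )


def rows_for(thread_id):
    """Return only the rows that belong to the given worker."""
    cur = conn.execute(
        "SELECT col1 FROM tmptbl WHERE thread_id = ? ORDER BY col1",
        (thread_id,),
    )
    return [row[0] for row in cur]


id_a = next(next_thread_id)
id_b = next(next_thread_id)
insert_values(id_a, [2, 3, 4])
insert_values(id_b, [10, 20])
```

Each worker sees only its own rows as long as every statement filters on its own thread_id.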

Related

How to implement Lookup Tables?

I am unable to grasp the concept of the lookup table.
I am currently working on a project wherein I am using two tables.
The first table consists of two columns- name(varchar) and value(varchar).
The second table also has two columns: Result(varchar) and value(varchar).
Result is used to store the values which are obtained from a Java code. Whenever the Result of the Java code matches the name in the first table, I need to update the second table with the corresponding value in the first table.
Does using a lookup table help in any way? If it does, can it be explained with an example? If not, is there any other way?

Just imagine a table person with a column GenderIsMale BIT. You can set this value to 1 (yes, it is a boy) or to 0 (no, a girl). This was easy in earlier days.
Now we have more categories. According to this link facebook offers more than 50 differing categories...
This is where the lookup table comes into play: you create a table which has, as a minimum, a unique key and a value. In most cases this is an ID INT IDENTITY and a Content VARCHAR(100) NOT NULL. You can add more columns like Abbreviation or any other additional content directly bound to this value (e.g. other languages, or codes of external code systems; read about mapping tables as well).
The next step is to remove the GenderIsMale column and replace it with:
GenderID INT NOT NULL
CONSTRAINT FK_Person_GenderID FOREIGN KEY REFERENCES GenderLookUpTable(GenderID)
The person table will store the GenderID only, the related values are stored in the side table and can be looked up.
The simple lookup table is the basic building block of a relational database model in at least 3NF or BCNF (which should be a minimum requirement for professional database design).
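As a rough illustration of the lookup pattern, here is a minimal sketch using Python's built-in sqlite3 module (an assumption made so the example runs anywhere; the answer itself uses SQL Server syntax). The snake_case table and column names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE gender_lookup (
    gender_id INTEGER PRIMARY KEY,
    content   TEXT NOT NULL UNIQUE
);
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    gender_id INTEGER NOT NULL REFERENCES gender_lookup (gender_id)
);
INSERT INTO gender_lookup (gender_id, content)
VALUES (1, 'female'), (2, 'male'), (3, 'non-binary');
INSERT INTO person (name, gender_id) VALUES ('Alex', 3), ('Sam', 1);
""")


def people_with_gender():
    """Resolve each stored gender_id to its text through the lookup table."""
    cur = conn.execute("""
        SELECT p.name, g.content
        FROM person p
        JOIN gender_lookup g ON g.gender_id = p.gender_id
        ORDER BY p.name
    """)
    return cur.fetchall()


def insert_unknown_gender_fails():
    """The foreign key rejects ids that are missing from the lookup table."""
    try:
        conn.execute("INSERT INTO person (name, gender_id) VALUES ('Kim', 99)")
    except sqlite3.IntegrityError:
        return True
    return False
```

The person table stores only the key; adding a new category is a single INSERT into the lookup table rather than a schema change.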

Whenever the Result of the Java code matches the name in the first
table, I need to update the second table with the corresponding value
in the first table.
That's a perfect use case for a database trigger, which can perform various actions when a change (insert, update, delete) happens in a table.
Assuming you're inserting the value of your Java calculations to your (result, value) table (let's call it foo, and the other table is bar), you can write a trigger that replaces the value being written with the value from the other table. Example given for Postgres, if using another db refer to your particular RDBMS manual to see the syntax.
CREATE FUNCTION get_value_from_lookup_table() RETURNS trigger AS $$
DECLARE
looked_up bar.value%TYPE;
BEGIN
-- Look up the value for this result; NEW stays unchanged if there is no match.
SELECT value INTO looked_up FROM bar WHERE name = NEW.result;
IF FOUND THEN
NEW.value := looked_up;
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
-- INSTEAD OF triggers are only allowed on views, so use a BEFORE trigger on the table.
CREATE TRIGGER lookup_value
BEFORE INSERT ON foo
FOR EACH ROW
EXECUTE PROCEDURE get_value_from_lookup_table();
Every time an INSERT is done on foo, the trigger checks whether a row exists in bar where name = result. If so, the value from that row replaces the value being inserted; otherwise the insert goes on normally. That's the basic gist of it. The actual solution depends on table constraints, whether you need to handle inserts and updates, etc.
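For readers who want to experiment with the trigger idea without a PostgreSQL server, here is a small sketch using Python's built-in sqlite3 module. It is not the PostgreSQL trigger: SQLite cannot assign to NEW, so this variant swaps in an AFTER INSERT trigger that patches the freshly inserted row. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bar (name TEXT PRIMARY KEY, value TEXT);
CREATE TABLE foo (result TEXT, value TEXT);

-- SQLite cannot assign to NEW, so the row is corrected right after the insert.
CREATE TRIGGER lookup_value AFTER INSERT ON foo
WHEN EXISTS (SELECT 1 FROM bar WHERE name = NEW.result)
BEGIN
    UPDATE foo
    SET value = (SELECT value FROM bar WHERE name = NEW.result)
    WHERE rowid = NEW.rowid;
END;

INSERT INTO bar (name, value) VALUES ('alpha', 'from-lookup');
""")

# 'alpha' has a match in bar, 'beta' does not.
conn.execute("INSERT INTO foo (result, value) VALUES ('alpha', 'computed')")
conn.execute("INSERT INTO foo (result, value) VALUES ('beta', 'computed')")
stored = conn.execute(
    "SELECT result, value FROM foo ORDER BY result"
).fetchall()
```

Only the row whose result matches a name in bar ends up with the looked-up value; the other row keeps what the application wrote.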

Automatically connect PK and FK from a list

I'm developing a Java application that saves information in a DB using procedures. I will give an example to show my doubt, because I'm a bit lost.
Let's pretend that I have these two classes:
public class Seg{
//variables
....
public class Dur{
//variables
private List<Seg> list; // Let's pretend that Dur1 has 3 Seg, and Dur1's PK = 1
....
And I want to save this information in the DB. As Dur1 has 3 Seg and its PK = 1, I will have 3 inserts into seg, each with FK = 1 = Dur's PK.
My question is: how can I automatically, using a procedure, set the FK in the three seg inserts, assuming that (in Java) I know all the matches between Seg and Dur (I have the list that connects them)?
//Note: The PK is an attribute defined in the procedures with a sequence
I fear that some may not understand the question, but in fact I'm a little confused.
Thanks all

Your example (and the focus on the FK) does not make clear whether you are trying to define a plain PL/SQL layer to handle elementary CRUD (also called TAPI in PL/SQL) or whether you intend to encapsulate some kind of business logic.
In the former case you may want to rethink your approach and have a look at some kind of ORM.
Don't get me wrong, I'm not trying to answer your question with "do something else". My point is that there is a lot of experience with your situation (database-assigned keys) in ORMs, so a simple search for similar material should give you ideas to adapt to your PL/SQL solution.
In my opinion, the procedure that stores the parent class needs an output parameter that returns the sequence-assigned PK, and you pass this value to the procedure that stores the child classes.
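The flow of "store the parent, get its generated key back, pass it to the child inserts" can be sketched as follows. This is only an illustration with Python's built-in sqlite3 module standing in for the Oracle procedures: cursor.lastrowid plays the role of the output parameter, and the table, column and function names (dur, seg, save_dur, save_segs) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dur (dur_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE seg (
    seg_id INTEGER PRIMARY KEY,
    dur_id INTEGER NOT NULL REFERENCES dur (dur_id),
    label  TEXT NOT NULL
);
""")


def save_dur(name):
    """Insert the parent row and return its database-assigned key."""
    cur = conn.execute("INSERT INTO dur (name) VALUES (?)", (name,))
    return cur.lastrowid


def save_segs(dur_id, labels):
    """Insert one child row per label, all pointing at the same parent key."""
    conn.executemany(
        "INSERT INTO seg (dur_id, label) VALUES (?, ?)",
        [(dur_id, label) for label in labels],
    )


# Commit parent and children together, or roll both back on error.
with conn:
    dur_id = save_dur("Dur1")
    save_segs(dur_id, ["seg-a", "seg-b", "seg-c"])

child_keys = [row[0] for row in conn.execute("SELECT dur_id FROM seg")]
```

The application never invents the key itself; it only forwards the value the database handed back.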

Hey, I tried to illustrate your scenario with an example. Hope it helps. Please pardon any syntax errors, since I don't have a workspace at the moment.
--Drop any existing object with same name
DROP TABLE AIK;
DROP SEQUENCE A1PK_seq;
-- Seq creation
CREATE SEQUENCE A1PK_seq START WITH 1 INCREMENT BY 1;
-- Provding req privileges
GRANT SELECT ON INFRA_OWNER.A1PK_seq TO PUBLIC;
--Root table creation
CREATE TABLE AIK
(PK_ID NUMBER PRIMARY KEY,
PK_NAME VARCHAR2(100));
--Drop existing object
DROP TABLE FK1;
--Create Child table
CREATE TABLE FK1
(
PK_ID NUMBER,
PK_ADD1 VARCHAR2(100),
PK_ADD2 VARCHAR2(100)
);
--Drop any existing constraints if any with same name
ALTER TABLE FK1
DROP CONSTRAINT FK_PK;
--Adding foreign key for child table
ALTER TABLE FK1
ADD CONSTRAINT FK_PK FOREIGN KEY(PK_ID) REFERENCES AIK(PK_ID);
CREATE OR REPLACE PROCEDURE insert_into_child_tables
(p_seg1 IN VARCHAR2,
p_seg2 IN VARCHAR2,
p_seg3 IN VARCHAR2,
p_root_val IN VARCHAR2)
AS
lv_long LONG;
lv_seq PLS_INTEGER;
BEGIN
SELECT INFRA_OWNER.A1PK_SEQ.NEXTVAL
INTO lv_seq
FROM DUAL;
INSERT
INTO INFRA_OWNER.AIK VALUES
(
lv_seq,
p_root_val
);
FOR I IN
(SELECT a1.OWNER,
a1.CONSTRAINT_NAME,
a1.TABLE_NAME
FROM ALL_CONSTRAINTS a1
WHERE A1.R_CONSTRAINT_NAME IN
(SELECT a2.CONSTRAINT_NAME
FROM ALL_CONSTRAINTS a2
WHERE a2.TABLE_NAME = 'AIK'
AND a2.constraint_type = 'P'
)
ORDER BY A1.TABLE_NAME
)
LOOP
-- One child row per segment parameter; bind variables avoid manual quoting.
EXECUTE IMMEDIATE 'INSERT INTO '||I.OWNER||'.'||I.TABLE_NAME||' VALUES (:1, :2, NULL)' USING lv_seq, p_seg1;
EXECUTE IMMEDIATE 'INSERT INTO '||I.OWNER||'.'||I.TABLE_NAME||' VALUES (:1, :2, NULL)' USING lv_seq, p_seg2;
EXECUTE IMMEDIATE 'INSERT INTO '||I.OWNER||'.'||I.TABLE_NAME||' VALUES (:1, :2, NULL)' USING lv_seq, p_seg3;
END LOOP;
END;

How to PreparedStatement sql with ON DUPLICATE KEY UPDATE? [duplicate]

Several months ago I learned from an answer on Stack Overflow how to perform multiple updates at once in MySQL using the following syntax:
INSERT INTO table (id, field, field2) VALUES (1, A, X), (2, B, Y), (3, C, Z)
ON DUPLICATE KEY UPDATE field=VALUES(Col1), field2=VALUES(Col2);
I've now switched over to PostgreSQL and apparently this is not correct. It's referring to all the correct tables so I assume it's a matter of different keywords being used but I'm not sure where in the PostgreSQL documentation this is covered.
To clarify, I want to insert several things and if they already exist to update them.

Since version 9.5, PostgreSQL has UPSERT support through the ON CONFLICT clause, with the following syntax (similar to MySQL):
INSERT INTO the_table (id, column_1, column_2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (3, 'C', 'Z')
ON CONFLICT (id) DO UPDATE
SET column_1 = excluded.column_1,
column_2 = excluded.column_2;
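From application code, an upsert like this is normally sent as one prepared statement with bound parameters rather than with literal values. The sketch below shows that shape with Python's built-in sqlite3 module, used here only as a stand-in (SQLite 3.24 or newer accepts the same ON CONFLICT ... DO UPDATE form; with PostgreSQL you would go through a PostgreSQL driver instead). The pre-existing row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table (id INTEGER PRIMARY KEY, column_1 TEXT, column_2 TEXT)"
)
# A pre-existing row, so that id = 2 takes the DO UPDATE branch.
conn.execute("INSERT INTO the_table VALUES (2, 'old', 'old')")

UPSERT_SQL = """
INSERT INTO the_table (id, column_1, column_2)
VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE
SET column_1 = excluded.column_1,
    column_2 = excluded.column_2
"""

rows = [(1, "A", "X"), (2, "B", "Y"), (3, "C", "Z")]
with conn:
    # One statement text, executed once per parameter tuple.
    conn.executemany(UPSERT_SQL, rows)

result = conn.execute(
    "SELECT id, column_1, column_2 FROM the_table ORDER BY id"
).fetchall()
```

Rows 1 and 3 are inserted, while row 2 is overwritten with the new values.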
Searching postgresql's email group archives for "upsert" leads to finding an example of doing what you possibly want to do, in the manual:
Example 38-2. Exceptions with UPDATE/INSERT
This example uses exception handling to perform either UPDATE or INSERT, as appropriate:
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
LOOP
-- first try to update the key
-- note that "a" must be unique
UPDATE db SET b = data WHERE a = key;
IF found THEN
RETURN;
END IF;
-- not there, so try to insert the key
-- if someone else inserts the same key concurrently,
-- we could get a unique-key failure
BEGIN
INSERT INTO db(a,b) VALUES (key, data);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the UPDATE again
END;
END LOOP;
END;
$$
LANGUAGE plpgsql;
SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');
There's possibly an example of how to do this in bulk, using CTEs in 9.1 and above, in the hackers mailing list:
WITH foos AS (SELECT (UNNEST(%foo[])).*),
updated AS (UPDATE foo SET a = foos.a ... RETURNING foo.id)
INSERT INTO foo SELECT foos.* FROM foos LEFT JOIN updated USING(id)
WHERE updated.id IS NULL;
See a_horse_with_no_name's answer for a clearer example.

Warning: this is not safe if executed from multiple sessions at the same time (see caveats below).
Another clever way to do an "UPSERT" in postgresql is to do two sequential UPDATE/INSERT statements that are each designed to succeed or have no effect.
UPDATE table SET field='C', field2='Z' WHERE id=3;
INSERT INTO table (id, field, field2)
SELECT 3, 'C', 'Z'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE id=3);
The UPDATE will succeed if a row with "id=3" already exists, otherwise it has no effect.
The INSERT will succeed only if row with "id=3" does not already exist.
You can combine these two into a single string and run them both with a single SQL statement execute from your application. Running them together in a single transaction is highly recommended.
This works very well when run in isolation or on a locked table, but is subject to race conditions that mean it might still fail with duplicate key error if a row is inserted concurrently, or might terminate with no row inserted when a row is deleted concurrently. A SERIALIZABLE transaction on PostgreSQL 9.1 or higher will handle it reliably at the cost of a very high serialization failure rate, meaning you'll have to retry a lot. See why is upsert so complicated, which discusses this case in more detail.
This approach is also subject to lost updates in read committed isolation unless the application checks the affected row counts and verifies that either the insert or the update affected a row.

With PostgreSQL 9.1 this can be achieved using a writeable CTE (common table expression):
WITH new_values (id, field1, field2) as (
values
(1, 'A', 'X'),
(2, 'B', 'Y'),
(3, 'C', 'Z')
),
upsert as
(
update mytable m
set field1 = nv.field1,
field2 = nv.field2
FROM new_values nv
WHERE m.id = nv.id
RETURNING m.*
)
INSERT INTO mytable (id, field1, field2)
SELECT id, field1, field2
FROM new_values
WHERE NOT EXISTS (SELECT 1
FROM upsert up
WHERE up.id = new_values.id)
See these blog entries:
Upserting via Writeable CTE
WAITING FOR 9.1 – WRITABLE CTE
WHY IS UPSERT SO COMPLICATED?
Note that this solution does not prevent a unique key violation but it is not vulnerable to lost updates.
See the follow up by Craig Ringer on dba.stackexchange.com

In PostgreSQL 9.5 and newer you can use INSERT ... ON CONFLICT ... DO UPDATE.
See the documentation.
A MySQL INSERT ... ON DUPLICATE KEY UPDATE can be directly rephrased to an ON CONFLICT ... DO UPDATE. Neither is SQL-standard syntax; they're both database-specific extensions. There are good reasons MERGE wasn't used for this; a new syntax wasn't created just for fun. (MySQL's syntax also has issues that mean it wasn't adopted directly.)
e.g. given setup:
CREATE TABLE tablename (a integer primary key, b integer, c integer);
INSERT INTO tablename (a, b, c) values (1, 2, 3);
the MySQL query:
INSERT INTO tablename (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
becomes:
INSERT INTO tablename (a, b, c) values (1, 2, 10)
ON CONFLICT (a) DO UPDATE SET c = tablename.c + 1;
Differences:
You must specify the column name (or unique constraint name) to use for the uniqueness check. That's the ON CONFLICT (columnname) DO
The keyword SET must be used, as if this was a normal UPDATE statement
It has some nice features too:
You can have a WHERE clause on your UPDATE (letting you effectively turn ON CONFLICT ... DO UPDATE into ON CONFLICT ... DO NOTHING for certain values)
The proposed-for-insertion values are available as the row-variable EXCLUDED, which has the same structure as the target table. You can get the original values in the table by using the table name. So in this case EXCLUDED.c will be 10 (because that's what we tried to insert) and tablename.c will be 3 because that's the current value in the table. You can use either or both in the SET expressions and WHERE clause.
For background on upsert see How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?

I was looking for the same thing when I came here, but the lack of a generic "upsert" function bothered me a bit, so I thought you could just pass the update and insert SQL as arguments to that function from the manual.
That would look like this:
CREATE FUNCTION upsert (sql_update TEXT, sql_insert TEXT)
RETURNS VOID
LANGUAGE plpgsql
AS $$
DECLARE
updated_rows INTEGER;
BEGIN
LOOP
-- first try to update
EXECUTE sql_update;
-- EXECUTE does not set FOUND, so read the affected row count instead
GET DIAGNOSTICS updated_rows = ROW_COUNT;
IF updated_rows > 0 THEN
RETURN;
END IF;
-- not found so insert the row
BEGIN
EXECUTE sql_insert;
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing and loop
END;
END LOOP;
END;
$$;
And perhaps, to do what you initially wanted to do (a batch "upsert"), you could use Tcl to split the sql_update and loop over the individual updates. The performance hit will be very small, see http://archives.postgresql.org/pgsql-performance/2006-04/msg00557.php
The highest cost is executing the query from your code; on the database side the execution cost is much smaller.

There is no simple command to do it.
The most correct approach is to use a function, like the one from the docs.
Another solution (although not that safe) is to do an update with RETURNING, check which rows were updated, and insert the rest of them.
Something along the lines of:
update table
set column = x.column
from (values (1,'aa'),(2,'bb'),(3,'cc')) as x (id, column)
where table.id = x.id
returning id;
assuming id:2 was returned:
insert into table (id, column) values (1, 'aa'), (3, 'cc');
Of course it will bail out sooner or later (in a concurrent environment), as there is a clear race condition here, but usually it will work.
Here's a longer and more comprehensive article on the topic.

I use this function merge
CREATE OR REPLACE FUNCTION merge_tabla(key INT, data TEXT)
RETURNS void AS
$BODY$
BEGIN
IF EXISTS(SELECT a FROM tabla WHERE a = key)
THEN
UPDATE tabla SET b = data WHERE a = key;
RETURN;
ELSE
INSERT INTO tabla(a,b) VALUES (key, data);
RETURN;
END IF;
END;
$BODY$
LANGUAGE plpgsql

Personally, I've set up a "rule" attached to the insert statement. Say you had a "dns" table that recorded dns hits per customer on a per-time basis:
CREATE TABLE dns (
"time" timestamp without time zone NOT NULL,
customer_id integer NOT NULL,
hits integer
);
You wanted to be able to re-insert rows with updated values, or create them if they didn't exist already. Keyed on the customer_id and the time. Something like this:
CREATE RULE replace_dns AS
ON INSERT TO dns
WHERE (EXISTS (SELECT 1 FROM dns WHERE ((dns."time" = new."time")
AND (dns.customer_id = new.customer_id))))
DO INSTEAD UPDATE dns
SET hits = new.hits
WHERE ((dns."time" = new."time") AND (dns.customer_id = new.customer_id));
Update: This has the potential to fail if simultaneous inserts are happening, as it will generate unique_violation exceptions. However, the non-terminated transaction will continue and succeed, and you just need to repeat the terminated transaction.
However, if there are tons of inserts happening all the time, you will want to put a table lock around the insert statements: SHARE ROW EXCLUSIVE locking will prevent any operations that could insert, delete or update rows in your target table. However, updates that do not update the unique key are safe, so if no operation will do this, use advisory locks instead.
Also, the COPY command does not use RULES, so if you're inserting with COPY, you'll need to use triggers instead.

Similar to most-liked answer, but works slightly faster:
WITH upsert AS (UPDATE spider_count SET tally=1 WHERE date='today' RETURNING *)
INSERT INTO spider_count (spider, tally) SELECT 'Googlebot', 1 WHERE NOT EXISTS (SELECT * FROM upsert)
(source: http://www.the-art-of-web.com/sql/upsert/)

I customized the "upsert" function above for the case where you want to INSERT first and, on a key conflict, REPLACE:
CREATE OR REPLACE FUNCTION upsert(sql_insert text, sql_update text)
RETURNS void AS
$BODY$
BEGIN
-- first try to insert and after to update. Note : insert has pk and update not...
EXECUTE sql_insert;
RETURN;
EXCEPTION WHEN unique_violation THEN
EXECUTE sql_update;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION upsert(text, text)
OWNER TO postgres;
To execute it, do something like this:
SELECT upsert($$INSERT INTO ...$$,$$UPDATE... $$)
It is important to wrap each statement in dollar quotes ($$) to avoid quoting errors.
check the speed...

According to the PostgreSQL documentation of the INSERT statement, handling the ON DUPLICATE KEY case is not supported. That part of the syntax is a proprietary MySQL extension.

I have the same issue for managing account settings as name value pairs.
The design criteria is that different clients could have different settings sets.
My solution, similar to JWP is to bulk erase and replace, generating the merge record within your application.
This is pretty bulletproof, platform independent and since there are never more than about 20 settings per client, this is only 3 fairly low load db calls - probably the fastest method.
The alternative of updating individual rows, checking for exceptions and then inserting (or some combination of these) is hideous code, slow, and often breaks because (as mentioned above) non-standard SQL exception handling changes from DB to DB, or even from release to release.
#This is pseudo-code - within the application:
BEGIN TRANSACTION - get transaction lock
SELECT all current name value pairs where id = $id into a hash record
create a merge record from the current and update record
(set intersection where shared keys in new win, and empty values in new are deleted).
DELETE all name value pairs where id = $id
COPY/INSERT merged records
END TRANSACTION
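The pseudo-code above can be sketched in application code roughly like this. It is a minimal illustration using Python's built-in sqlite3 module as the database; the settings table layout, the function name replace_settings and the sample data are assumptions, and the explicit "transaction lock" step is reduced to a plain transaction around the writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE settings (id INTEGER, name TEXT, value TEXT, PRIMARY KEY (id, name))"
)
conn.executemany(
    "INSERT INTO settings VALUES (?, ?, ?)",
    [(7, "theme", "dark"), (7, "lang", "en"), (7, "beta", "on")],
)
conn.commit()


def replace_settings(client_id, updates):
    """Merge updates into the stored pairs; an empty value deletes the key."""
    # The with-block commits the DELETE and INSERTs together,
    # or rolls them back if anything raises.
    with conn:
        current = dict(
            conn.execute(
                "SELECT name, value FROM settings WHERE id = ?", (client_id,)
            )
        )
        merged = {**current, **updates}  # new values win on shared keys
        merged = {k: v for k, v in merged.items() if v not in ("", None)}
        conn.execute("DELETE FROM settings WHERE id = ?", (client_id,))
        conn.executemany(
            "INSERT INTO settings (id, name, value) VALUES (?, ?, ?)",
            [(client_id, k, v) for k, v in merged.items()],
        )
    return merged


merged = replace_settings(7, {"theme": "light", "beta": "", "tz": "UTC"})
stored = dict(conn.execute("SELECT name, value FROM settings WHERE id = 7"))
```

Because the whole set is rewritten, the same code handles inserts, updates and deletions of individual settings.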

CREATE OR REPLACE FUNCTION save_user(_id integer, _name character varying)
RETURNS boolean AS
$BODY$
BEGIN
UPDATE users SET name = _name WHERE id = _id;
IF FOUND THEN
RETURN true;
END IF;
BEGIN
INSERT INTO users (id, name) VALUES (_id, _name);
EXCEPTION WHEN OTHERS THEN
UPDATE users SET name = _name WHERE id = _id;
END;
RETURN TRUE;
END;
$BODY$
LANGUAGE plpgsql VOLATILE STRICT

For merging small sets, using the above function is fine. However, if you are merging large amounts of data, I'd suggest looking into http://mbk.projects.postgresql.org
The current best practice that I'm aware of is:
COPY new/updated data into temp table (sure, or you can do INSERT if the cost is ok)
Acquire Lock [optional] (advisory is preferable to table locks, IMO)
Merge. (the fun part)

UPDATE will return the number of modified rows. If you use JDBC (Java), you can then check this value against 0 and, if no rows have been affected, fire an INSERT instead. If you use some other programming language, the number of modified rows can probably still be obtained; check its documentation.
This may not be as elegant, but you have much simpler SQL that is easier to use from the calling code. By contrast, if you write the ten-line script in PL/pgSQL, you should probably have a unit test of one kind or another just for it.
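The same check-the-row-count idea looks like this outside JDBC. It is a minimal sketch in Python with the built-in sqlite3 module, where cursor.rowcount plays the role of the JDBC update count; the table db(a, b) is borrowed from the manual example quoted earlier and the function name merge_db is reused only for illustration. As other answers point out, this pattern is not safe against concurrent writers without extra locking or a retry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (a INTEGER PRIMARY KEY, b TEXT)")


def merge_db(key, data):
    """Try UPDATE first; INSERT only when the UPDATE touched no rows."""
    with conn:
        cur = conn.execute("UPDATE db SET b = ? WHERE a = ?", (data, key))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO db (a, b) VALUES (?, ?)", (key, data))
            return "inserted"
        return "updated"


first = merge_db(1, "david")
second = merge_db(1, "dennis")
stored = conn.execute("SELECT a, b FROM db").fetchall()
```

The first call finds nothing to update and inserts; the second call updates the row it finds.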

Edit: This does not work as expected. Unlike the accepted answer, this produces unique key violations when two processes repeatedly call upsert_foo concurrently.
Eureka! I figured out a way to do it in one query: use UPDATE ... RETURNING to test if any rows were affected:
CREATE TABLE foo (k INT PRIMARY KEY, v TEXT);
CREATE FUNCTION update_foo(k INT, v TEXT)
RETURNS SETOF INT AS $$
UPDATE foo SET v = $2 WHERE k = $1 RETURNING $1
$$ LANGUAGE sql;
CREATE FUNCTION upsert_foo(k INT, v TEXT)
RETURNS VOID AS $$
INSERT INTO foo
SELECT $1, $2
WHERE NOT EXISTS (SELECT update_foo($1, $2))
$$ LANGUAGE sql;
The UPDATE has to be done in a separate procedure because, unfortunately, this is a syntax error:
... WHERE NOT EXISTS (UPDATE ...)
Now it works as desired:
SELECT upsert_foo(1, 'hi');
SELECT upsert_foo(1, 'bye');
SELECT upsert_foo(3, 'hi');
SELECT upsert_foo(3, 'bye');

PostgreSQL >= v15
Big news on this topic: as of PostgreSQL v15, it is possible to use the MERGE command. In fact, this long-awaited feature was listed first among the improvements of the v15 release.
This is similar to INSERT ... ON CONFLICT but more batch-oriented. It has a powerful WHEN MATCHED vs WHEN NOT MATCHED structure that gives the ability to INSERT, UPDATE or DELETE on such conditions.
It not only eases bulk changes, but also gives more control than the traditional UPSERT and INSERT ... ON CONFLICT.
Take a look at this very complete sample from the official page:
MERGE INTO wines w
USING wine_stock_changes s
ON s.winename = w.winename
WHEN NOT MATCHED AND s.stock_delta > 0 THEN
INSERT VALUES(s.winename, s.stock_delta)
WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
UPDATE SET stock = w.stock + s.stock_delta
WHEN MATCHED THEN
DELETE;
PostgreSQL v9.5, v10, v11, v12, v13, v14
If the version is below v15 and at least v9.5, the best choice is probably the UPSERT syntax with the ON CONFLICT clause.

Here is an example of how to do an upsert with parameters and without special SQL constructs,
in case you have a special condition (sometimes you can't use 'on conflict' because you can't create the constraint):
WITH upd AS
(
update view_layer set metadata=:metadata where layer_id = :layer_id and view_id = :view_id returning id
)
insert into view_layer (layer_id, view_id, metadata)
(select :layer_id layer_id, :view_id view_id, :metadata metadata FROM view_layer l
where NOT EXISTS(select id FROM upd WHERE id IS NOT NULL) limit 1)
returning id
maybe it will be helpful

Audit history of multiple tables in the database

I have 3-4 tables in my database which I want to track the changes for.
I am mainly concerned about updates.
Whenever updates happen, I want to store previous entry (value or complete row) in audit table.
Basic columns I was thinking of are as following:
AuditId, TableName, PK1, PK2, PK3, PKVal1, PKVal2, PKVal3, UpdateType, PrevEntryJSON
JSON will be of format: Key:Value and I preferred to go with it as columns keep on changing and I want to keep all values even if they don't change.
The other option is to replace the JSON with hundreds of columns named after the columns of all the audited tables combined.
I wanted to hear people's views on this. How could I improve on it and what issues could I face?
Going through triggers might not be preferable way but I am open to it.
Thanks,

I have seen a very effective implementation of this which goes as follows:
TABLE audit_entry (
audit_entry_id INTEGER PRIMARY KEY,
audit_entry_type VARCHAR2(10) NOT NULL,
-- ^^ stores 'INSERT' / 'UPDATE' / 'DELETE'
table_name VARCHAR2(30) NOT NULL,
-- ^^ stores the name of the table that is changed
column_name VARCHAR2(30) NOT NULL,
-- ^^ stores the name of the column that is changed
primary_key_id INTEGER NOT NULL,
-- ^^ Primary key ID to identify the row that is changed
-- Below are the actual values that are changed.
-- If the changed column is a foreign key ID then
-- below columns tell you which is new and which is old
old_id INTEGER,
new_id INTEGER,
-- If the changed column is of any other numeric type,
-- store the old and new values here.
-- Modify the precision and scale of NUMBER as per your
-- choice.
old_number NUMBER(18,2),
new_number NUMBER(18,2),
-- If the changed column is of date type, with or without
-- time information, store it here.
old_ts TIMESTAMP,
new_ts TIMESTAMP,
-- If the changed column is of VARCHAR2 type,
-- store it here.
old_varchar VARCHAR2(2000),
new_varchar VARCHAR2(2000),
...
... -- Any other columns to store data of other types,
... -- e.g., blob, xmldata, etc.
...
)
And we create a simple sequence to give us new incremental integer value for audit_entry_id:
CREATE SEQUENCE audit_entry_id_seq;
The beauty of a table like audit_entry is that you can store information about all types of DMLs- INSERT, UPDATE and DELETE in the same place.
E.g., for inserts, keep the old_* columns null and populate the new_* columns with your values.
For updates, populate both old_* and new_* columns whenever they are changed.
For delete, just populate the old_* columns and keep the new_* null.
And of course, enter the appropriate value for audit_entry_type. ;0)
Then, for example, you have a table like follows:
TABLE emp (
empno INTEGER,
ename VARCHAR2(100) NOT NULL,
date_of_birth DATE,
salary NUMBER(18,2) NOT NULL,
deptno INTEGER -- FOREIGN KEY to, say, department
...
... -- Any other columns that you may fancy.
...
)
Just create a trigger on this table as follows:
CREATE OR REPLACE TRIGGER emp_rbiud
-- rbiud means Row level, Before Insert, Update, Delete
BEFORE INSERT OR UPDATE OR DELETE
ON emp
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
DECLARE
-- any variable declarations that deem fit.
BEGIN
IF INSERTING THEN
-- Of course, you will insert empno.
-- Let's populate other columns.
-- As emp.ename is a not null column,
-- let's insert the audit entry value directly.
INSERT INTO audit_entry(audit_entry_id,
audit_entry_type,
table_name,
column_name,
primary_key_id,
new_varchar)
VALUES(audit_entry_id_seq.nextval,
'INSERT',
'EMP',
'ENAME',
:new.empno,
:new.ename);
-- Now, as date_of_birth may contain null, we do:
IF :new.date_of_birth IS NOT NULL THEN
INSERT INTO audit_entry(audit_entry_id,
audit_entry_type,
table_name,
column_name,
primary_key_id,
new_ts)
VALUES(audit_entry_id_seq.nextval,
'INSERT',
'EMP',
'DATE_OF_BIRTH',
:new.empno,
:new.date_of_birth);
END IF;
-- Similarly, code DML statements for auditing other values
-- as per your requirements.
ELSIF UPDATING THEN
-- This is a tricky one.
-- You must check which columns have been updated before you
-- hurry into auditing their information.
IF :old.ename != :new.ename THEN
INSERT INTO audit_entry(audit_entry_id,
audit_entry_type,
table_name,
column_name,
primary_key_id,
old_varchar,
new_varchar)
VALUES(audit_entry_id_seq.nextval,
'UPDATE',
'EMP',
'ENAME',
:new.empno,
:old.ename,
:new.ename);
END IF;
-- Code further DML statements in similar fashion for other
-- columns as per your requirement.
ELSIF DELETING THEN
-- By now you must have got the idea about how to go about this.
-- ;0)
NULL;
END IF;
END;
/
Just one word of caution: be selective about which tables and columns you choose to audit, because this table will have a huge number of rows anyway. SELECT statements on this table will be slower than you may expect.
I would really love to see any other sort of implementation here, as it would be a good learning experience. Hope your question gets more answers, as this is the best implementation of an audit table that I have seen and I'm still looking for ways to make it better.
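For a quick experiment with this audit layout outside Oracle, here is a minimal sketch using Python's built-in sqlite3 module. It is a reduced, assumed variant and not the Oracle trigger: only the ename column is audited, the audit id comes from SQLite's rowid instead of a sequence, and the trigger name emp_ename_au and the sample data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (
    empno  INTEGER PRIMARY KEY,
    ename  TEXT NOT NULL,
    salary REAL NOT NULL
);
CREATE TABLE audit_entry (
    audit_entry_id   INTEGER PRIMARY KEY,
    audit_entry_type TEXT NOT NULL,
    table_name       TEXT NOT NULL,
    column_name      TEXT NOT NULL,
    primary_key_id   INTEGER NOT NULL,
    old_varchar      TEXT,
    new_varchar      TEXT
);

-- Fires only for updates of ename, and WHEN skips unchanged values.
CREATE TRIGGER emp_ename_au AFTER UPDATE OF ename ON emp
WHEN OLD.ename <> NEW.ename
BEGIN
    INSERT INTO audit_entry
        (audit_entry_type, table_name, column_name,
         primary_key_id, old_varchar, new_varchar)
    VALUES ('UPDATE', 'EMP', 'ENAME', NEW.empno, OLD.ename, NEW.ename);
END;

INSERT INTO emp (empno, ename, salary) VALUES (1, 'Smith', 1000);
UPDATE emp SET ename = 'Smythe' WHERE empno = 1;
UPDATE emp SET salary = 1200 WHERE empno = 1;
""")

audit_rows = conn.execute(
    "SELECT audit_entry_type, column_name, primary_key_id, old_varchar, new_varchar"
    " FROM audit_entry"
).fetchall()
```

Only the name change produces an audit row; the salary update is ignored because no trigger covers that column in this sketch.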

Create new table entry with new id

The problem
I have a table for some data that has an ID column of type integer (which is also the primary key).
When a new data entry is added to the table, it should get a new ID whereas the ID is not known by the application that inserts the object but it should be given by the database. For example, the IDs should be assigned like 0, 1, 2, ...
Assume that I have all other data for the new entry, how would I do the insert? Normally:
insert into T values(123, 'data');
But now I don't know what to put instead of 123
- would you create some kind of global variable NEXTID in the database that provides the IDs and query/update this value each time before inserting into T?
The questions
How to handle this kind of problem? A solution that is concurrency-safe is preferable.
How to achieve this with Java/myBatis? I Have a Java class that corresponds to the table structure and a new object should be added to the database, getting a new ID automatically.
Update
What I searched for was auto-increment.
Is there a standard SQL way (database independent) of declaring a column as auto-increment? I am using Apache Derby and GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) is suggested here.
What does an insert into a table that contains auto-increment columns look like?
What is the best way to get the created auto-increment value after an insert when simultaneous access to the database is possible?
I'll accept an answer that includes explanation and SQL instructions for declaration and insertion :)

If you are using SQL Server, making the column an identity column will serve the purpose, something like this:
ALTER TABLE [dbo].[T] ADD [Column1] INT identity (1, 1)
For others like Oracle you can use a simple database sequence.

In MySQL you can use
ALTER TABLE table_name ADD id INT AUTO_INCREMENT PRIMARY KEY;
This auto-increments the id column, so you don't have to supply it in the INSERT.
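To show what the insert and the read-back of the generated id look like in practice, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for the databases named in the answers, so the column declaration differs from Derby, SQL Server or MySQL; the function name add_entry is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T (id INTEGER PRIMARY KEY AUTOINCREMENT, data TEXT NOT NULL)"
)


def add_entry(data):
    """Insert without naming the id column and return the generated id."""
    with conn:
        cur = conn.execute("INSERT INTO T (data) VALUES (?)", (data,))
        # lastrowid belongs to the cursor that ran this INSERT.
        return cur.lastrowid


ids = [add_entry("first"), add_entry("second"), add_entry("third")]
```

The application never picks the id: it leaves the column out of the INSERT and reads the generated value back from the cursor that ran the statement.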
