How to delete multiple rows from multiple tables using Where clause? - java

Using an Oracle DB, I need to select all the IDs from a table where a condition exists, then delete the rows from multiple tables where that ID exists. The pseudocode would be something like:
SELECT ID FROM TABLE1 WHERE AGE > ?
DELETE FROM TABLE1 WHERE ID = <all IDs received from SELECT>
DELETE FROM TABLE2 WHERE ID = <all IDs received from SELECT>
DELETE FROM TABLE3 WHERE ID = <all IDs received from SELECT>
What is the best and most efficient way to do this?
I was thinking something like the following, but wanted to know if there was a better way.
PreparedStatement selectStmt = conn.prepareStatement("SELECT ID FROM TABLE1 WHERE AGE > ?");
selectStmt.setInt(1, age);
ResultSet rs = selectStmt.executeQuery();
PreparedStatement delStmt1 = conn.prepareStatement("DELETE FROM TABLE1 WHERE ID = ?");
PreparedStatement delStmt2 = conn.prepareStatement("DELETE FROM TABLE2 WHERE ID = ?");
PreparedStatement delStmt3 = conn.prepareStatement("DELETE FROM TABLE3 WHERE ID = ?");
while(rs.next())
{
String id = rs.getString("ID");
delStmt1.setString(1, id);
delStmt1.addBatch();
delStmt2.setString(1, id);
delStmt2.addBatch();
delStmt3.setString(1, id);
delStmt3.addBatch();
}
delStmt1.executeBatch();
delStmt2.executeBatch();
delStmt3.executeBatch();
Is there a better/more efficient way?

You could do it with one DELETE statement if two of your three tables (for example "table2" and "table3") are child tables of the parent table (for example "table1") and have an "ON DELETE CASCADE" option.
This means that each of the two child tables has a column (for example column "id" of "table2" and "table3") with a foreign key constraint, declared with the "ON DELETE CASCADE" option, that references the primary key column of the parent table (for example column "id" of "table1"). This way, deleting from the parent table alone automatically deletes the associated rows in the child tables.
For more detail, see: http://www.techonthenet.com/oracle/foreign_keys/foreign_delete.php

If you delete only a few records from a large table, make sure an index is defined on the ID column.
To delete the records from TABLE2 and TABLE3, the best strategy is the cascading delete proposed by @ivanzg. If that is not possible, see below.
To delete from TABLE1, a far better option than a row-by-row batch delete is a single DELETE using the age-based predicate:
PreparedStatement stmt = con.prepareStatement("DELETE FROM TABLE1 WHERE age > ?");
stmt.setInt(1, 60);
int rowCount = stmt.executeUpdate();
If you can't use a cascading delete, apply the same approach to TABLE2 and TABLE3 with the following statement. Run it before the delete on TABLE1, because the subquery reads the TABLE1 rows:
DELETE FROM TABLE2/*or 3*/ WHERE ID in (SELECT ID FROM TABLE1 WHERE age > ?)
As a general best practice, keep the logic in the client to a minimum and let the database server do the work. The database should be able to come up with a reasonable execution plan (see the index note above).
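A minimal JDBC sketch of the subquery-based approach above, for the case where ON DELETE CASCADE is not available. The class name `AgeBasedDelete` and the decision to wrap the three statements in one transaction are assumptions made for illustration; the table and column names follow the question. The child tables are handled first, since their subquery still needs the matching TABLE1 rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper; table and column names are taken from the question.
final class AgeBasedDelete {

    // Child tables come first: their subquery reads TABLE1, so TABLE1 goes last.
    static final List<String> STATEMENTS = List.of(
        "DELETE FROM TABLE2 WHERE ID IN (SELECT ID FROM TABLE1 WHERE AGE > ?)",
        "DELETE FROM TABLE3 WHERE ID IN (SELECT ID FROM TABLE1 WHERE AGE > ?)",
        "DELETE FROM TABLE1 WHERE AGE > ?");

    // Runs the three deletes in one transaction and returns the total row count.
    static int deleteOlderThan(Connection conn, int age) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try {
            int total = 0;
            for (String sql : STATEMENTS) {
                try (PreparedStatement stmt = conn.prepareStatement(sql)) {
                    stmt.setInt(1, age);
                    total += stmt.executeUpdate();
                }
            }
            conn.commit();
            return total;
        } catch (SQLException e) {
            conn.rollback(); // leave all three tables untouched on failure
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

A caller would use it as `int deleted = AgeBasedDelete.deleteOlderThan(conn, 60);`. Each statement makes a single round trip, so no IDs travel to the client.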

A DELETE statement operates on one table per statement. However, the main implementations support triggers or other mechanisms that perform subordinate modifications, for example Oracle's CREATE TRIGGER.
The downside is that developers might end up having to figure out what the database is doing behind their backs. (When/Why to use Cascading in SQL Server?)
Alternatively, if you need to use an intermediate result in your delete statements, you might use a temporary table in your batch (as proposed here).
As a side note, I see no transaction control (setAutoCommit(false) ... commit()) in your example code. I guess that might be for the sake of simplicity.
Also, you are executing three different delete batches (one for each table) instead of one. That might negate the benefit of using PreparedStatement.

Related

How can I access a value when inserting into a table?

I'm trying to write a SQL query from Java; the simplified table would be table(name, version) with a unique constraint on (name, version).
I'm trying to insert a row into my database with a conditional statement, meaning that when an entry with the same name exists, it should insert a row with the same name and its version increased by 1.
I have tried with the following:
INSERT INTO table(name,version)
VALUES(?, CASE WHEN EXISTS(SELECT name from table where name=?)
THEN (SELECT MAX(version) FROM table WHERE name = ?) +1
ELSE 1 END)
The values are sent by the user.
My question is, how can I access the 'name' inside the values so I could compare them?
If you want to write this as a single query:
INSERT INTO table (name, version)
SELECT ?, COALESCE(MAX(t2.version) + 1, 1)
FROM table t2
WHERE t2.name = ?;
That said, this is dangerous. Two threads could execute this query "at the same time" and possibly create the same version number. A unique index/constraint on (name, version), such as the one you already have, prevents this.
With the unique index/constraint, one of the inserts will fail if there is a conflict.
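When one of the concurrent inserts fails, the caller still has to react to it. One possible reaction, sketched here under assumptions, is to re-run the insert so that MAX(version) + 1 is computed again. The names `VersionInsertRetry`, `SqlAction`, and `maxAttempts` are hypothetical. The sketch detects the conflict through SQLSTATE class 23 (integrity constraint violation) and assumes each attempt runs in autocommit mode or after a rollback.

```java
import java.sql.SQLException;

// Hypothetical retry helper for the single-statement versioned insert.
final class VersionInsertRetry {

    // The insert to run, e.g. a lambda that executes the INSERT ... SELECT.
    @FunctionalInterface
    interface SqlAction {
        void run() throws SQLException;
    }

    // SQLSTATE values starting with "23" denote integrity constraint violations.
    static boolean isConstraintViolation(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }

    // Re-runs the insert after a constraint violation; returns the attempts used.
    static int runWithRetry(SqlAction insert, int maxAttempts) throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                insert.run();
                return attempt;
            } catch (SQLException e) {
                // Give up on unrelated errors or once the attempts are exhausted.
                if (!isConstraintViolation(e) || attempt >= maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

A call might look like `VersionInsertRetry.runWithRetry(() -> stmt.executeUpdate(), 3);`, where `stmt` is the prepared INSERT with both name parameters bound.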
I see at least two approaches:
1. For each pair of name and version you first query the max version:
SELECT MAX(VERSION) as MAX FROM <table> WHERE NAME = <name>
And then you insert the result + 1 with a corresponding insert query:
INSERT INTO <table>(NAME,VERSION) VALUES (<name>,result+1)
This approach is straightforward and easy to read and implement, but it is not very performant because of the number of queries required.
2. You can achieve the same with SQL alone, using analytic (window) functions, e.g.:
SELECT NAME, ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY NAME) AS VERSION FROM <table>
You can then save the result of this query as a table using CREATE TABLE as SELECT...
(The assumption here is that the first version is 1, if it is not the case, then one could slightly rework the query). This solution would be very performant even for large datasets.
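To illustrate what the window function in the second approach computes, here is a small in-memory Java sketch of the same numbering: each name gets its rows counted from 1. The class `RowNumberByName` and the "name:version" output format are made up for this example; it does not touch a database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory analogue of ROW_NUMBER() OVER (PARTITION BY NAME).
final class RowNumberByName {

    // Returns "name:version" strings, numbering the rows of each name from 1.
    static List<String> number(List<String> names) {
        Map<String, Integer> counters = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String name : names) {
            // merge() stores 1 for a new name, otherwise adds 1 to its counter.
            int version = counters.merge(name, 1, Integer::sum);
            out.add(name + ":" + version);
        }
        return out;
    }
}
```

For the input `["a", "b", "a", "a"]` this yields `["a:1", "b:1", "a:2", "a:3"]`. The sketch numbers rows in input order, whereas the SQL query leaves the order inside a partition unspecified because every row in it ties on NAME.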
You should look up the name before the insert. Otherwise, if something went wrong, how would you know about it? So fetch the name before running the insert query.
I'm not sure, but you could try this:
declare int version;
if exists(SELECT name from table where name=?)
then
version = SELECT MAX(version) FROM table WHERE name = ?
version += 1
else
version = 1
end
Regards.
This is actually a bad plan: you might be changing the data the user specified. That is likely not what is desired; maybe they're not trying to create a new version but are just unaware that the one they want already exists. However, you can create a function, called from your Java code, that inserts the requested version, or max+1 if the requested version already exists, and returns the values actually inserted.
-- create table
create table nv( name text
, version integer
, constraint nv_uk unique (name, version)
);
-- function to create version or 1+max if requested exists
create or replace function new_version
( name_in text
, version_in integer
)
returns record
language plpgsql strict
as $$
declare
violated_constraint text;
return_name_version record;
begin
insert into nv(name,version)
values (name_in,version_in)
returning (name, version) into return_name_version;
return return_name_version;
exception
when unique_violation
then
GET STACKED DIAGNOSTICS violated_constraint = CONSTRAINT_NAME;
if violated_constraint like '%nv\_uk%'
then
insert into nv(name,version)
select name_in, 1+max(version)
from nv
where name = name_in
group by name_in
returning (name, version) into return_name_version;
return return_name_version;
end if;
end;
$$;
-- create some data
insert into nv(name,version)
select 'n1', gn
from generate_series( 1,3) gn ;
-- test insert existing
select new_version('n2',1);
select new_version('n1',1);
select *
from nv
order by name, version;

Java sql - delete half rows in DB table

I want to delete a bunch of rows from a DB file that I have in a folder. Connecting and counting the amount of rows in the db file works but when I try to delete a specific amount of rows I get stuck.
Input:
sql = "SELECT COUNT(*) AS id FROM wifi_probe_requests";
...
sql = "DELETE FROM wifi_probe_requests LIMIT " + rowcount/2;
PreparedStatement pstmt = conn.prepareStatement(sql);
pstmt.executeUpdate();
Output:
54943
[SQLITE_ERROR] SQL error or missing database (near "LIMIT": syntax error)
Not using a limit works fine and I can delete the entire db table but what I want is to delete half the db rows as seen by the rowcount/2 I made.
UPDATE:
So far I have solved the problem by finding the id which is located at the n-rows/2 and then getting the value of it (ex. 264352). Then using that number to indicate what id rows are going to be deleted (ex. id.value < 264352).
sql = "SELECT COUNT(*) AS id FROM wifi_probe_requests";
int rowcount = COUNT(*);
sql = "DELETE FROM wifi_probe_requests WHERE id < (SELECT id FROM wifi_probe_requests ORDER BY id ASC LIMIT "+ rowcount/2 + ",1)";
rowcount = 50000
Delete valueof.id < valueof.id.50000/2
So all values of id below the value of an id at position 25000 will be deleted.
You can't. Some databases don't allow LIMIT in UPDATE or DELETE queries.
It seems that with SQLite it's possible to work around that, by compiling your own version, but if you're not willing to do that, you need to rewrite your query in a different way. For example if you have an autoincrement id in the table, you can calculate the "middle" id and use WHERE id < [middle id] as an alternative to LIMIT.
As stated by @Kayaman, this is not possible using SQLite.
You can bypass this with a query such as:
DELETE FROM wifi_probe_requests WHERE id IN (SELECT id FROM wifi_probe_requests LIMIT 10)
One more thing: if rowcount is a Java int, rowcount/2 is integer division, so an odd number of rows is simply rounded down. If it is a floating-point value, you will have to round it down or up yourself before building the query.
How fancy do you want to make this? A simple solution would be something like:
SELECT COUNT(*) FROM mytable;
"SELECT id FROM mytable order by id LIMIT 1 OFFSET " + round(rowcount/2)
DELETE FROM mytable WHERE id < ?
If you go that route, you should be able to delete the first half of your rows by keyspace. If you only want roughly half of your rows deleted (and don't really care how many), you could probably find a way to use RANDOM() to do this. Probably something like (WARNING: TOTALLY UNTESTED):
DELETE FROM mytable WHERE random() < 0.5;
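The count, middle-id, delete sequence described in this thread could be sketched in JDBC roughly as follows. The class and method names are hypothetical; the sketch assumes `id` is a unique numeric column and that the SQLite driver in use accepts a bound parameter for OFFSET.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper for deleting the lower half of wifi_probe_requests by id.
final class HalfDelete {

    // Integer division already rounds down for an odd row count.
    static int halfOffset(int rowCount) {
        return rowCount / 2;
    }

    // Deletes all rows whose id is below the id found at the middle position.
    static int deleteFirstHalf(Connection conn) throws SQLException {
        int rowCount;
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM wifi_probe_requests")) {
            rs.next();
            rowCount = rs.getInt(1);
        }
        long middleId;
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT id FROM wifi_probe_requests ORDER BY id LIMIT 1 OFFSET ?")) {
            ps.setInt(1, halfOffset(rowCount));
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return 0; // empty table, nothing to delete
                }
                middleId = rs.getLong(1);
            }
        }
        try (PreparedStatement ps = conn.prepareStatement(
                "DELETE FROM wifi_probe_requests WHERE id < ?")) {
            ps.setLong(1, middleId);
            return ps.executeUpdate();
        }
    }
}
```

With the 54943 rows from the question, `halfOffset` gives 27471, so the 27471 rows with the smallest ids are removed.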

JDBC - PostgreSQL - batch insert + unique index

I have a table with unique constraint on some field. I need to insert a large number of records in this table. To make it faster I'm using batch update with JDBC (driver version is 8.3-603).
Is there a way to do the following:
every batch execute I need to write into the table all the records from the batch that don't violate the unique index;
every batch execute I need to receive the records from the batch that were not inserted into DB, so I could save "wrong" records
?
The most efficient way of doing this would be something like this:
create a staging table with the same structure as the target table but without the unique constraint
batch insert all rows into that staging table. The most efficient way is to use copy or the CopyManager (although I don't know if that is already supported in your ancient driver version).
Once that is done you copy the valid rows into the target table:
insert into target_table(id, col_1, col_2)
select id, col_1, col_2
from staging_table
where not exists (select *
from target_table
where target_table.id = staging_table.id);
Note that the above is not concurrency safe! If other processes do the same thing you might still get unique key violations. To prevent that you need to lock the target table.
If you want to remove the copied rows, you could do that using a writeable CTE:
with inserted as (
insert into target_table(id, col_1, col_2)
select id, col_1, col_2
from staging_table
where not exists (select *
from target_table
where target_table.id = staging_table.id)
returning id
)
delete from staging_table
where id in (select id from inserted);
A (non-unique) index on the staging_table.id should help for the performance.

Get current sequence Id to store in other tables

We have multiple tables, all related through the first table's primary key (example: id). The id is populated from a sequence, and while inserting data into the first table we use sequence.nextval in the insert query.
Now, while inserting data into the other tables, how do we get the current sequence value or current id?
We have tried the options below:
1. sequence.currval, directly in the insert statement
2. select sequence.currval from dual
Both options throw an error when used with getJdbcTemplate().update().
Could anyone please suggest how to get the current sequence value to pass to the other tables after inserting data into the first table?
If you want to insert the same id (which comes from a sequence) into different tables, simply get it from the first insert and use it in the other inserts.
// With Oracle's driver, Statement.RETURN_GENERATED_KEYS yields the ROWID, so the key column is named explicitly.
PreparedStatement stmt1 = conn.prepareStatement("INSERT INTO TABLE1 (id) VALUES(yoursequence.nextval)", new String[] {"ID"});
stmt1.executeUpdate();
ResultSet rs = stmt1.getGeneratedKeys();
rs.next();
long id = rs.getLong(1);
PreparedStatement stmt2 = conn.prepareStatement("INSERT INTO TABLE2 (id) VALUES(?)");
stmt2.setLong(1,id);
stmt2.executeUpdate();

Complex INSERT query

I'm pretty new to MySQL. I have two related tables, quite common case: Klients(KID, name, surname) and Visits(VID, VKID, dateOfVisit) - VKID is the Klient ID. I have a problem with suitable INSERT query, this is what I want to do:
1. Check if Klient with specific name and surname exists (let's assume that there are no people with the same surnames)
2. If yes, get the ID and do the INSERT to Visits table
3. If no, INSERT new Klient, get the ID and INSERT to Visits.
Is it possible to do in one query?
You would need to use IF EXISTS / NOT EXISTS with a subquery to check the table. See the reference below:
http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html
HTH
The INSERT statement allows only one single target table.
So the query you're looking for is just impossible unless you use triggers or stored procedures.
But such a problem is commonly solved using the following small algorithm:
1) insert a record in table [Visits] assuming the parent record does exist in table [Klients]
INSERT INTO Visits (VKID, dateOfVisit)
SELECT KID, NOW()
FROM Klients
WHERE (name=@name) AND (surname=@surname)
2) check the number of inserted records after query (1)
3) if no record has been inserted, then add a new record to table [Klients], and then run (1) again.
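The three steps above can be sketched in Java as follows. The `Store` interface is a hypothetical stand-in for the database access; a JDBC implementation of `insertVisitForClient` would run the INSERT ... SELECT from step 1 and return the row count from `executeUpdate()`.

```java
// Hypothetical sketch of the "insert visit, check count, create client, retry" algorithm.
final class VisitRecorder {

    // Stand-in for the database calls; both methods are assumptions for this example.
    interface Store {
        // Step 1: INSERT INTO Visits ... SELECT ... FROM Klients; returns rows inserted.
        int insertVisitForClient(String name, String surname);

        // Adds a new row to Klients.
        void insertClient(String name, String surname);
    }

    // Returns true if the client had to be created, false if it already existed.
    static boolean recordVisit(Store store, String name, String surname) {
        // Steps 1 and 2: try the visit and check how many rows were inserted.
        if (store.insertVisitForClient(name, surname) > 0) {
            return false;
        }
        // Step 3: no matching client yet, so create it and run step 1 again.
        store.insertClient(name, surname);
        store.insertVisitForClient(name, surname);
        return true;
    }
}
```

In a real implementation both inserts would normally share one transaction, so that a client row is never left behind without its visit.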
Try something like this:
IF (SELECT * FROM `sometable` WHERE name = 'somename' AND surname = 'somesurname') IS NULL THEN
INSERT INTO Table1(name,surname) VALUES ('somename', 'somesurname');
ELSE INSERT INTO visits(kid,name,surname)
SELECT kid, name, surname FROM Table1 WHERE name = 'somename' AND surname = 'somesurname';
END IF;
There is no need to specify 'VALUES' on the second insert.
I have not tested it, but this is the general idea of what you are trying to accomplish.
These should be two queries in a transaction:
INSERT INTO Klients (name, surname)
VALUES ('John', 'Doe')
ON DUPLICATE KEY UPDATE
KID = LAST_INSERT_ID(KID);
INSERT INTO Visits (VKID, dateOfVisit)
VALUES (LAST_INSERT_ID(), NOW());
The first statement is an upsert whose update part uses a little-known feature of LAST_INSERT_ID() that is intended for exactly this purpose: a value passed to it explicitly is stored so that it can be retrieved afterwards.
UPD: I forgot to mention that you would need to add a unique constraint on (surname, name).
