I need suggestions on an approach. There is an UPDATE query in the application like the one below:
UPDATE TABLE
SET FLAG = CASE
WHEN FLAG = 'IP' THEN 'P'
WHEN FLAG = 'IH' THEN 'H'
WHEN FLAG = 'IM' THEN 'M'
END
WHERE ADJUSTMENT_ID IN (SELECT Query )
This update is executed from a Java function which returns void.
Now I have a requirement to also get details of the updated records (a few columns from TABLE) and return a List from the function instead of void.
Running a SELECT first and then updating the records in a loop is not an option for performance reasons; the records are updated with a single UPDATE statement because it is supposed to run faster.
What options do I have that keep comparable performance? Should I go with a stored procedure?
SELECT ... FROM FINAL TABLE (UPDATE ....)
will do the job. As it is a single SQL statement, the performance will also be good.
See also
http://www.idug.org/p/bl/et/blogid=278&blogaid=422
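For completeness, here is a minimal JDBC sketch of that approach. The table and column names (ADJUSTMENTS, ADJUSTMENT_ID, FLAG) and the inner sub-select are assumptions standing in for the question's placeholders, and java.sql/java.util imports are assumed:

public List<String> updateFlagsAndReturnIds(Connection conn) throws SQLException {
    // Single round trip: the UPDATE runs and the SELECT reads the rows it changed.
    String sql =
        "SELECT ADJUSTMENT_ID, FLAG FROM FINAL TABLE ("
      + "  UPDATE ADJUSTMENTS"
      + "  SET FLAG = CASE WHEN FLAG = 'IP' THEN 'P'"
      + "                  WHEN FLAG = 'IH' THEN 'H'"
      + "                  WHEN FLAG = 'IM' THEN 'M' END"
      + "  WHERE ADJUSTMENT_ID IN (SELECT ADJUSTMENT_ID FROM SOME_SOURCE)" // the original sub-select goes here
      + ")";
    List<String> updated = new ArrayList<>();
    try (PreparedStatement ps = conn.prepareStatement(sql);
         ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            updated.add(rs.getString("ADJUSTMENT_ID")); // collect whichever columns you need
        }
    }
    return updated;
}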
Related
I have a requirement like this.
protected Integer[] updateFullTable(final Class clazz){
    final ProjectionList projectionList = Projections.projectionList()
            .add(Projections.property("id"), "id");
    final Criteria criteria = session.createCriteria(clazz)
            .add(Restrictions.eq("typeOfOperation", 1))
            .add(Restrictions.eq("performUpdate", true));
    criteria.setProjection(projectionList);
    final List idsList = criteria.list();
    final Integer[] ids = transformObjectArrayIntoIntegerArray(idsList);
    // NOW WE UPDATE THE ROWS BY THEIR IDS.
    final Query query = session.createQuery("update " + clazz.getName()
            + " set activeRegister=true, updateTime=:updateTime where id in (:ids)")
            .setParameter("updateTime", new Date())
            .setParameterList("ids", ids);
    query.executeUpdate();
    return ids;
}
As you can see, I need to update all rows in a table. Sometimes I query all the row ids and later apply the update to those ids in a separate query, but the tables have a lot of records, so this takes between 30 seconds and 10 minutes depending on the table.
I have changed this code to a single update, like this:
final Query query = session.createQuery("update " + clazz.getName() + " set activeRegister=true, updateTime=:updateTime where typeOfOperation=1 and performUpdate=true");
With that single query I avoid the first SELECT, but I can no longer return the affected ids. Later the requirement changed and a
final StringBuilder logRevert;
parameter was added, which needs to store the updated ids so a direct reverse update can be applied in the DB if required.
But with my single update I can no longer get the ids. My question is: how can I get or return the affected ids, using a stored procedure or some workaround in the DB or in Hibernate? In other words, how do I get the original behaviour with only one query, or with enhanced code?
Any tips?
I have tried:
Using Criteria
Using HQL
Using a named query
Using SQLQuery
Not using a transformer, which returns me a raw Object[]
But the times are still fairly high.
I want something like
query.executeUpdate(); // RETURNS THE COUNT OF THE AFFECTED ROWS
But I need the affected ids, not just the count.
Sorry if the question is simple.
UPDATE
With @dmitry-senkovich's help I could do it using raw SQL, but not with Hibernate; a separate question was asked here:
https://stackoverflow.com/questions/44641851/java-hibernate-org-hibernate-exception-sqlgrammarexception-could-not-extract-re
What about the following solution?
SET #ids = NULL;
UPDATE SOME_TABLE
SET activeRegister = true, updateTime = :updateTime
WHERE typeOfOperation = 1 and performUpdate = true
AND (SELECT #ids := CONCAT_WS(',', id, #ids));
SELECT #ids;
If updateTime is a datetime, you can select all affected record ids with a select like this:
Date updateTime = new Date(); // time from update
select id from clazz.getName() where updateTime=:updateTime and activeRegister=true and typeOfOperation=1 and performUpdate=true
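In Hibernate terms that would look roughly like the sketch below, built on the question's own HQL; it assumes updateTime has enough precision that no other update shares the same value:

Date updateTime = new Date();                    // the exact timestamp used in the update

session.createQuery("update " + clazz.getName()
        + " set activeRegister=true, updateTime=:updateTime"
        + " where typeOfOperation=1 and performUpdate=true")
       .setParameter("updateTime", updateTime)
       .executeUpdate();

List affectedIds = session.createQuery("select id from " + clazz.getName()
        + " where updateTime=:updateTime and activeRegister=true"
        + " and typeOfOperation=1 and performUpdate=true")
       .setParameter("updateTime", updateTime)
       .list();                                  // the ids touched by the update above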
Updating a large number of rows in a table is a slow operation. This is due to needing to capture the 'old' value of each row in case of a ROLLBACK (due to an explicit ROLLBACK, failure of the UPDATE, failure of a subsequent query in the same transaction, or a power failure before the UPDATE finishes).
The usual fix is to rethink the application design that necessitated the large UPDATE.
On the other hand, there is a possible fix to the schema. Please provide SHOW CREATE TABLE so I don't have to do as much 'hand waving' in the following paragraph...
It might be better to move the column(s) that need to be updated into a separate, parallel, table ("vertical partitioning"). This might be beneficial if
The original table has lots of wide columns (TEXT, BLOB, etc) -- by not having to make bulky copies.
The original table is being updated simultaneously -- by the updates not blocking each other.
There are SELECTs hitting the non-updated columns -- by avoiding certain other blockings.
You can still get the original set of columns -- by JOINing the two tables together.
I have an application that logs a lot of data to a MySQL database. The in-production version already runs insert statements in batches to improve performance. We're changing the db schema a bit so that some of the extraneous data is sent to a different table that we can join on lookup.
However, I'm trying to properly design the queries to work with our batch system. I wanted to use MySQL's LAST_INSERT_ID() so I wouldn't have to worry about getting the generated keys and matching them up (which seems like a very difficult task).
However, I can't seem to find a way to add different insert statements to a batch, so how can I resolve this? I assume I need to build a second batch and add all detail queries to that, but that means LAST_INSERT_ID() loses meaning.
s = conn.prepareStatement("INSERT INTO mytable (stuff) VALUES (?)");
while (!queue.isEmpty()){
    s.setLong(1, System.currentTimeMillis() / 1000L);
    // ... set other data
    s.addBatch();
    // Add insert query for extra data if needed
    if( a.getData() != null && !a.getData().isEmpty() ){
        s = conn.prepareStatement("INSERT INTO mytable_details (stuff_id,morestuff) VALUES (LAST_INSERT_ID(),?)");
        s.setString(1, a.getData());
        s.addBatch();
    }
}
This is not how batching works. Batching only works within one Statement, and for a PreparedStatement that means that you can only add batches of parameters for one and the same statement. Your code also neglects to execute the statements.
For what you want to do, you should use setAutoCommit(false), execute both statements and then commit() (or rollback if an error occurred).
Also I'd suggest you look into the JDBC standard method of retrieving generated keys, as that will make your code less MySQL specific. See also Retrieving AUTO_INCREMENT Column Values through JDBC.
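Putting those two suggestions together, a sketch could look like this. It keeps the question's table names; MyItem and queue.poll() are stand-ins for whatever the queue actually holds, and the assumption is that the driver returns the generated keys in batch order:

conn.setAutoCommit(false);
try (PreparedStatement parent = conn.prepareStatement(
         "INSERT INTO mytable (stuff) VALUES (?)",
         Statement.RETURN_GENERATED_KEYS);
     PreparedStatement detail = conn.prepareStatement(
         "INSERT INTO mytable_details (stuff_id, morestuff) VALUES (?, ?)")) {

    List<String> extraData = new ArrayList<>();       // detail payload per parent row (null/empty = none)
    while (!queue.isEmpty()) {
        MyItem a = queue.poll();                      // MyItem stands in for your queue's element type
        parent.setLong(1, System.currentTimeMillis() / 1000L);
        // ... set other data
        parent.addBatch();
        extraData.add(a.getData());
    }
    parent.executeBatch();

    try (ResultSet keys = parent.getGeneratedKeys()) {
        int i = 0;
        while (keys.next()) {
            String data = extraData.get(i++);
            if (data != null && !data.isEmpty()) {
                detail.setLong(1, keys.getLong(1));   // generated id of the matching parent row
                detail.setString(2, data);
                detail.addBatch();
            }
        }
    }
    detail.executeBatch();
    conn.commit();
} catch (SQLException e) {
    conn.rollback();
    throw e;
}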
I've fixed it for now, though I wish there was a better way. I built an ArrayList of extra data values that I can associate with the generated keys returned from the batch inserts. After the first query batch executes, I build a second batch with the right ids/data.
I had to clean up a database (a few tables with a given condition, where the columns used in the conditions are always the same), e.g.
delete from table1 where date < given_date1 and id = given_id
delete from table2 where date < given_date2 and id = given_id
where the relation between given_id and given_date varies both from table to table and from id to id.
The actual delete condition is not always where date < given_date; I just wrote that as an example. Say one id has 300 days of data and another has 500 days of data; the where condition is allowed to delete the oldest 10 days of data, where 10 is a variable based on user input. So in one iteration all nodes are processed, deleting their oldest 10 days of data, and thus the query changes for each id, but it always runs against the same set of tables.
Earlier this was written as a SQL script and it did its job, but it was taking time. Now I have implemented a multithreaded Java application where the new code looks like:
for(i = 0; i < idcount; i++)
{
    // launch new thread and against that thread call
    delete(date, currentid);
}

function delete(date, id)
{
    delete from table1 where date < given_date and id = given_id
    delete from table2 where date < given_date and id = given_id
}
After implementing this I found deadlocks on the SQL tables, which were solved by indexing the tables, but it is still not as fast as it is supposed to be. If I have 500 threads they are all launched one after the other, and they are obviously running on the same set of tables; is SQL not actually executing in parallel on each table?
When I monitor my java.exe and sqlserver.exe, they are not busy at all, though I expect them to be.
Could anyone tell me the best approach to implementing a multithreaded delete on the same set of tables, so that I can bump up the thread count, do the deletions in parallel, and consume the available resources?
If all the actions are deletes on a given id, then I would just do one delete on each table, covering all the ids at once.
e.g.
delete from table1 where date < given_date and id in (given_id1, given_id2 ..... )
If there are lots of given_ids, then first insert them into a temporary table, and then execute each delete by joining the table you're deleting from with the temporary table.
Also, if you try to use multiple threads, an improvement is really only expected if each thread acts on a different table, so there is no contention in the database.
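As a rough JDBC sketch of the one-delete-per-table idea (ids as a List<Integer> of given_ids and givenDate as a java.sql.Date are assumptions for illustration):

String placeholders = String.join(",", Collections.nCopies(ids.size(), "?"));

for (String table : new String[] {"table1", "table2"}) {
    String sql = "delete from " + table + " where date < ? and id in (" + placeholders + ")";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        ps.setDate(1, givenDate);
        int idx = 2;
        for (Integer id : ids) {
            ps.setInt(idx++, id);
        }
        ps.executeUpdate();               // one set-based delete per table, no per-id threads
    }
}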
Ignoring the problem you created...
Why not use the IN statement?
delete from table1 where date < given_date and id IN (id1, id2, id3, ...)
Update based on clarification:
Based on the explanation in the comments, my guess is that you don't have good indexes and every delete statement is resulting in a table scan. Each table scan locks the table and thus the database can only process one statement at a time. Index the date and id columns along with any other column used in the where clause of your delete statement.
In my personal experience, I make a class to manage my queries and the communication with the database. I use a thread pool to manage my threads and simply have the threads make calls to my static database manager. The manager should have a synchronized method that acquires a lock on the database connection. The threads will then be able to access the database, and their actions won't conflict with each other.
If you don't care about having all commands in one transaction unit, then put each delete in its own (small) transaction.
I have a table with fields: name|...|start_date|end_date
My code now is:
select .... 'check for period intersection
insert .... 'if check successful insert new row
This code in one transaction.
When two users try to insert a new record at the same time with the same fields (and the periods intersect), two records get inserted.
But I want to avoid that. The first user must insert; the other user must get a conflict.
How can I do it ?
P.S. I use IBM DB2
Use an INSERT query which gets its data from a SELECT. The values selected are the data that need to be inserted. The WHERE clause can check for the condition and should return null if the check fails. So if I want to insert only if id 5 is not in the table, then:
Insert into test1(val) select "test" from (select case when id = 5 then null else 5 end '1' from sysP where id =5) aa
This query will insert 'test' into table test1 if id = 5 is not there in the sysP table.
You could use a unique key (UK) or SELECT ... FOR UPDATE:
select .... 'check for period intersection FOR UPDATE WITH RS USE AND KEEP UPDATE LOCKS
Update:
Try locking the whole table before the select with:
LOCK TABLE TABLE_NAME IN EXCLUSIVE MODE
This way, the second transaction waits for the previous one to commit before its select. EXCLUSIVE MODE blocks select statements too, not only updates and inserts.
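A rough JDBC sketch of that lock-then-check-then-insert flow in one transaction (the table periods and its columns are illustrative; the lock is released at commit or rollback):

conn.setAutoCommit(false);
try (Statement st = conn.createStatement()) {
    st.execute("LOCK TABLE periods IN EXCLUSIVE MODE");

    boolean overlaps;
    try (PreparedStatement check = conn.prepareStatement(
             "SELECT 1 FROM periods WHERE name = ? AND start_date <= ? AND end_date >= ?")) {
        check.setString(1, name);
        check.setDate(2, newEnd);       // an existing period starts before the new one ends
        check.setDate(3, newStart);     // ... and ends after the new one starts
        try (ResultSet rs = check.executeQuery()) {
            overlaps = rs.next();
        }
    }

    if (overlaps) {
        conn.rollback();                // report the conflict to the second user
    } else {
        try (PreparedStatement ins = conn.prepareStatement(
                 "INSERT INTO periods (name, start_date, end_date) VALUES (?, ?, ?)")) {
            ins.setString(1, name);
            ins.setDate(2, newStart);
            ins.setDate(3, newEnd);
            ins.executeUpdate();
        }
        conn.commit();
    }
}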
Update 2:
If "check for period intersection" uses only columns from the same table as the one you're inserting into, then instead of the select add a check constraint to your table. See http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=%2Fcom.ibm.db2.udb.admin.doc%2Fdoc%2Ft0004984.htm
Sounds like MERGE is exactly what you want, when combined with some error raising. I'm assuming you're using DB2 on Linux/Unix/Windows, but MERGE has been on the Mainframe DB2 since v9.1 as well.
MERGE INTO YOUR_TABLE YT
USING (
    VALUES ('val1', 'val2', 'val3')
) MG(v1, v2, v3)
ON (YT.v1 = MG.v1)
WHEN MATCHED THEN
    SIGNAL SQLSTATE '70001'
        SET MESSAGE_TEXT = 'Record already exists!'
WHEN NOT MATCHED THEN
    INSERT (v1, v2, v3)
    VALUES (MG.v1, MG.v2, MG.v3)
ELSE IGNORE;
The USING clause can be used with provided values (like I have here), or it could be a sub-select. There are other examples on the Merge page on the Information Center that I linked above.
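From the Java side, a sketch of running that MERGE and treating SQLSTATE 70001 as "already exists" might look like this (table and column names follow the example above; val1, val2, val3 are your values):

String merge =
    "MERGE INTO YOUR_TABLE YT " +
    "USING (VALUES (?, ?, ?)) MG(v1, v2, v3) " +
    "ON (YT.v1 = MG.v1) " +
    "WHEN MATCHED THEN " +
    "  SIGNAL SQLSTATE '70001' SET MESSAGE_TEXT = 'Record already exists!' " +
    "WHEN NOT MATCHED THEN " +
    "  INSERT (v1, v2, v3) VALUES (MG.v1, MG.v2, MG.v3) " +
    "ELSE IGNORE";

try (PreparedStatement ps = conn.prepareStatement(merge)) {
    ps.setString(1, val1);
    ps.setString(2, val2);
    ps.setString(3, val3);
    ps.executeUpdate();                 // inserts, or raises SQLSTATE 70001 on a match
} catch (SQLException e) {
    if ("70001".equals(e.getSQLState())) {
        // the row already exists: surface the conflict to the caller
    } else {
        throw e;
    }
}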
I have a couple instances of a J2EE app running in a single WebLogic cluster.
At some point, these apps do a MERGE to insert or update a record into the back-end Oracle database. The MERGE checks to see if a row with a specified primary key is there or not. If it's there, update. If not, insert.
Now suppose two app instances want to insert or update a row with primary key = 100. Suppose the row doesn't exist. During the "check" stage of the merge, they both see that the row's not there, so both of them attempt to insert. Then I get a unique key constraint violation.
My question is this: Is there an atomic MERGE in Oracle? I'm looking for something that has a similar effect to INSERT ... FOR UPDATE in PL/SQL except that I can only execute SQL from my apps.
EDIT: I was unclear. I AM using the MERGE statement while this error still occurs. The thing is, only the "modifying" part is atomic, not the whole merge.
This is not a problem with MERGE as such. Rather the issue lies in your application. Consider this stored procedure:
create or replace procedure upsert_t23
    ( p_id   in t23.id%type
    , p_name in t23.name%type )
is
    cursor c is
        select null
        from   t23
        where  id = p_id;
    dummy varchar2(1);
begin
    open c;
    fetch c into dummy;
    if c%notfound then
        insert into t23
        values (p_id, p_name);
    else
        update t23
        set    name = p_name
        where  id = p_id;
    end if;
    close c;
end;
So, this is the PL/SQL equivalent of a MERGE on T23. What happens if two sessions call it simultaneously?
SSN1> exec upsert_t23(100, 'FOX IN SOCKS')
SSN2> exec upsert_t23(100, 'MR KNOX')
SSN1 gets there first, finds no matching record and inserts a record. SSN2 gets there second but before SSN1 commits, finds no record, inserts a record and hangs because SSN1 has a lock on the unique index node for 100. When SSN1 commits SSN2 will hurl a DUP_VAL_ON_INDEX violation.
The MERGE statement works in exactly the same way. Both sessions will check on (t23.id = 100), not find it and go down the INSERT branch. The first session will succeed and the second will hurl ORA-00001.
One way to handle this is to introduce pessimistic locking. At the start of the UPSERT_T23 procedure we lock the table:
...
lock table t23 in row shared mode nowait;
open c;
...
Now, SSN1 arrives, grabs the lock and proceeds as before. When SSN2 arrives it can't get the lock, so it fails immediately. Which is frustrating for the second user but at least they are not hanging, plus they know someone else is working on the same record.
There is no syntax for INSERT which is equivalent to SELECT ... FOR UPDATE, because there is nothing to select. And so there is no such syntax for MERGE either. What you need to do is include the LOCK TABLE statement in the program unit which issues the MERGE. Whether this is possible for you depends on the framework you're using.
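Issued over JDBC it might look roughly like the sketch below. t23 follows the answer's example; EXCLUSIVE mode and NOWAIT are chosen here for illustration so the second session fails fast with ORA-00054 instead of blocking (pick the least restrictive mode that still serializes your writers):

conn.setAutoCommit(false);
try (Statement st = conn.createStatement()) {
    st.execute("LOCK TABLE t23 IN EXCLUSIVE MODE NOWAIT");

    try (PreparedStatement merge = conn.prepareStatement(
             "MERGE INTO t23 t USING (SELECT ? AS id, ? AS name FROM dual) s " +
             "ON (t.id = s.id) " +
             "WHEN MATCHED THEN UPDATE SET t.name = s.name " +
             "WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name)")) {
        merge.setInt(1, 100);
        merge.setString(2, "FOX IN SOCKS");
        merge.executeUpdate();
    }
    conn.commit();                      // releases the table lock
} catch (SQLException e) {
    conn.rollback();                    // ORA-00054 if another session already holds the lock
    throw e;
}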
The MERGE statement in the second session can not "see" the insert that the first session did until that session commits. If you reduce the size of the transactions the probability that this will occur will be reduced.
Or, can you sort or partition your data so that all records of a given primary key will be given to the same session. A simple function like "primary key mod N" should distribute evenly to N sessions.
btw, if two records have the same primary key, the second will overwrite the first. Sounds a little odd.
Yes, and it's called.... MERGE
EDIT: The only way to get this watertight is to insert, catch the dup_val_on_index exception and handle it appropriately (update, or perhaps insert another record). This can easily be done with PL/SQL, but you can't use that.
You're also looking for workarounds. Can you catch the dup_val_on_index in Java and issue an extra UPDATE instead?
In pseudo-code:
try {
// MERGE
}
catch (dup_val_on_index) {
// UPDATE
}
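In JDBC that pseudo-code could be fleshed out roughly like this (MERGE_SQL, the table and its columns are placeholders; ORA-00001 surfaces as vendor error code 1):

try (PreparedStatement merge = conn.prepareStatement(MERGE_SQL)) {
    // ... bind the merge parameters
    merge.executeUpdate();
} catch (SQLException e) {
    if (e.getErrorCode() == 1) {        // ORA-00001: unique constraint violated
        try (PreparedStatement update = conn.prepareStatement(
                 "UPDATE your_table SET name = ? WHERE id = ?")) {
            update.setString(1, name);
            update.setInt(2, id);
            update.executeUpdate();     // the row exists after all, so just update it
        }
    } else {
        throw e;
    }
}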
I am surprised that MERGE would behave the way you describe, but I haven't used it sufficiently to say whether it should or not.
In any case, you might have the transactions that wish to execute the merge set their isolation level to SERIALIZABLE. I think that may solve your issue.
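A sketch of that in JDBC (MERGE_SQL is a placeholder for your statement):

conn.setAutoCommit(false);
conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
try (PreparedStatement merge = conn.prepareStatement(MERGE_SQL)) {
    merge.executeUpdate();
    conn.commit();
} catch (SQLException e) {
    conn.rollback();                    // e.g. ORA-08177 "can't serialize access": retry the transaction
    throw e;
}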