I've got a weird situation here. I've used triggers and sequences to implement auto-increment. I insert the data into my tables from my web app, which uses Hibernate. I test the web app on my machine (Netbeans) as well as on my office network (the web app is also deployed on our server with Wildfly).
It has always worked fine, until I started getting exceptions due to the unique constraint (primary key). Then I discovered that the problem was the sequence that generates values for the ids. For example, for my table xtable, the sequence's last_number is 78400, the max id in xtable is 78308, but the sequence's nextval is 78304. I have no idea how that happens because I created the sequence with the following:
CREATE SEQUENCE XTABLE_SEQUENCE INCREMENT BY 1 START WITH 1;
I tried the following to update the sequence and make its NEXTVAL greater than the max(id) in the table, but I'm still getting the same result after n inserts
declare
maxval number(10);
begin
select max(ID) into maxval from XTABLE;
maxval := maxval+1;
execute immediate 'DROP SEQUENCE XTABLE_SEQUENCE';
execute immediate 'CREATE SEQUENCE XTABLE_SEQUENCE START WITH ' || (maxval + 50) || ' INCREMENT BY 1';
end;
Here is the trigger statement:
create or replace TRIGGER xtable_sequence_tr
BEFORE INSERT ON xtable FOR EACH ROW
WHEN (NEW.id IS NULL)
BEGIN
SELECT xtable_sequence.NEXTVAL INTO :NEW.id FROM DUAL;
END;
What is the proper way to implement auto-increment in Oracle so that I can avoid this issue? At some point I start getting unique constraint violations on the primary key because, for a reason I don't understand, the max id in the table ends up being greater than the sequence.nextval used in the trigger. What is causing that, and how can I fix it?
To be honest this post is quite confusing on its own.
You state that,
"For my table xtable, Its sequence's last_number is 78400, the max id in xtable is 78308, but the sequence's nextval is 78304."
What this tells me is that with a last_number of 78400, 100 sequence values were cached in memory, and that cache would have started at 78300. Those cached values are handed out to subsequent inserts as long as the server is not restarted; last_number shows the end of the cached range, not how many values have actually been used. If the database is restarted, the unused cached values are lost. By the way, the sequence cache is shared among different sessions.
"but the sequence's nextval didn't change" Again, you are assuming that the sequence's last value is the same as sequence.nextval, and it is not. When you query the dba_sequences view, the LAST_NUMBER column represents the last value CACHED, not the last value generated by sequence.nextval or used in the table.
To be honest, resolving this shouldn't take much effort.
A. Make sure that every insert gets its id from the sequence; don't mix sequence-based inserts with procedures or triggers that assign ids some other way. (Remember that one drawback of using the sequence directly is that gapless order is not guaranteed: the ids could be 1, 2, 3 and then jump to 10, for example because the server was restarted and the unused cached values were lost. If you really need gapless ordering, don't use a sequence; use a procedure or some other mechanism.)
B. Instead of first querying the max id in the table, then dropping the sequence and recreating it:
Drop the sequence first, then get the max value from the table, and then create the sequence starting from that point. This protects you from losing track of values that may already have been handed out by transactions in other sessions and committed right while you were querying the max id on the table... but it is still not completely safe.
For better results I would create the new sequence starting just above the value shown by the query below, run right before dropping the sequence.
select LAST_NUMBER from dba_sequences where sequence_name='YOUR_SEQUENCE_NAME'
Basically, to be safe, create the new sequence with a value greater than the one currently cached.
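Putting that together, a rough sketch of the recreate step (sequence name taken from the question, access to dba_sequences assumed) could look like this:

declare
v_start number;
begin
select last_number + 1 into v_start
from dba_sequences
where sequence_name = 'XTABLE_SEQUENCE';
execute immediate 'DROP SEQUENCE XTABLE_SEQUENCE';
execute immediate 'CREATE SEQUENCE XTABLE_SEQUENCE START WITH ' || v_start || ' INCREMENT BY 1';
end;
/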
I figured out the condition in which I was getting that problem. While I was loading tens of thousands of records, for example executing a file containing 250000 insert queries, someone would try to insert records (through my webapp) at the same time. So the problem probably occurred when two inserts were executed at the same time.
Related
I have got a table with an auto-increment primary key. This table is meant to store millions of records, and I don't need to delete anything for now. The problem is that when new rows are inserted, because of some error, the auto-increment key leaves gaps in the ids. For example, after 5, the next id is 8, leaving a gap of 6 and 7. As a result, when I count the rows I get 28000, but the max id is 58000. What can be the reason? I am not deleting anything. How can I fix this issue?
P.S. I am using insert ignore while inserting records so that it doesn't give an error when I try to insert a duplicate entry into a unique column.
This is by design and will always happen.
Why?
Let's take 2 overlapping transactions that are doing INSERTs:
Transaction 1 does an INSERT, gets the value (let's say 42), does more work
Transaction 2 does an INSERT, gets the value 43, does more work
Then
Transaction 1 fails. Rolls back. 42 stays unused
Transaction 2 completes with 43
If consecutive values were guaranteed, every transaction would have to happen one after the other. Not very scalable.
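A rough way to reproduce this in MySQL yourself, assuming a table mytable with an AUTO_INCREMENT id and a val column:

-- Session 1
START TRANSACTION;
INSERT INTO mytable (val) VALUES ('a');   -- reserves id 42 (say)

-- Session 2
START TRANSACTION;
INSERT INTO mytable (val) VALUES ('b');   -- reserves id 43
COMMIT;                                   -- the row with id 43 is kept

-- Session 1
ROLLBACK;                                 -- id 42 is never reused, leaving a gap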
Also see Do Inserted Records Always Receive Contiguous Identity Values (SQL Server, but the same principle applies).
You can create a trigger to handle the auto-increment as follows:
DELIMITER $$
CREATE DEFINER=`root`@`localhost` TRIGGER `mytable_before_insert` BEFORE INSERT ON `mytable` FOR EACH ROW
BEGIN
SET NEW.id = (SELECT IFNULL(MAX(id), 0) + 1 FROM mytable);
END$$
DELIMITER ;
This comes from InnoDB, the default storage engine of MySQL.
It really isn't a problem, though: the docs on “AUTO_INCREMENT Handling in InnoDB” explain that InnoDB initializes the auto-increment counter in memory at server startup.
And the query it uses is something like
SELECT MAX(ai_col) FROM t FOR UPDATE;
This improves concurrency without really having an effect on your data.
To avoid this behaviour, use MyISAM instead of InnoDB as the storage engine.
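For example (table name assumed), the engine can be switched with:

ALTER TABLE mytable ENGINE = MyISAM;
-- note: this rewrites the table and changes its locking and transaction behaviour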
Perhaps (I haven't tested this) a solution is to set innodb_autoinc_lock_mode to 0.
According to http://dev.mysql.com/doc/refman/5.7/en/innodb-auto-increment-handling.html this might make things a bit slower (if you perform inserts of multiple rows in a single query) but should remove gaps.
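A minimal sketch of checking and changing it (the variable is not dynamic, so changing it requires a server restart):

-- 0 = traditional, 1 = consecutive, 2 = interleaved
SHOW VARIABLES LIKE 'innodb_autoinc_lock_mode';
-- to change it, set innodb_autoinc_lock_mode = 0 under [mysqld] in my.cnf and restart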
You can try an insert like:
insert ignore into mytable select (select max(id)+1 from mytable), 'value1', 'value2';
This will try to:
insert the new data with the last unused id (not auto-increment);
if a duplicate entry is found on a unique field, ignore it;
else insert the new data normally.
(But this method does not support updating fields when a duplicate entry is found.)
I have an application that uses an h2 database to store records of data. Each record is assigned a unique ID, which I generate using h2's auto-increment feature. I want the lowest number to always be 1, or at least to fill in the numbers that are freed when a record is deleted. What I mean is, if there are 5 records numbered 1-5 and I delete the third record, I want the next record added to be numbered 3 instead of 6. How should I go about achieving this?
So far, I've tried
ALTER TABLE <table_name> ALTER COLUMN <id_column> RESTART WITH 1
Which doesn't have the intended effect that I wanted.
Edit: I'm an idiot, I wrote the SQL query without actually executing it. The id does indeed restart from 1, but an exception is thrown whenever the incremented value is one that already exists. How should I fix this?
We have a "audit" table that we create lots of rows in. Our persistence layer queries the audit table sequence to create a new row in the audit table. With millions of rows being created daily the select statement to get the next value from the sequence is one of our top ten most executed queries. We would like to reduce the number of database roundtrips just to get the sequence next value (primary key) before inserting a new row in the audit table. We know you can't batch select statements from JDBC. Are there any common techniques for reducing database roundtrips to get a sequence next value?
Get a batch (e.g. 1000) of sequence values in advance with a single select:
select your_sequence.nextval
from dual
connect by level <= 1000
Cache the obtained values and use them for the next 1000 audit inserts.
Repeat this when you have run out of cached sequence values.
Skip the select statement for the sequence and generate the sequence value in the insert statement itself:
insert into your_table (ID, ..) values (my_sequence.nextval, ..)
No need for an extra select. If you need the sequence value, get it by adding a returning clause:
insert into your_table (ID, ..) values (my_sequence.nextval, ..) returning ID into ..
Save some extra time by specifying a cache value for the sequence.
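For example (sequence name and cache size assumed):

ALTER SEQUENCE my_sequence CACHE 1000;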
I suggest you change the "INCREMENT BY" option of the sequence and set it to a number like 100 (you have to decide what step size suits your sequence; 100 is just an example).
Then implement a class called SequenceGenerator. This class has a property that holds the next value, and only every 100 inserts does it call sequence.nextval to keep the database sequence up to date.
This way you only go to the database for the sequence nextval once every 100 inserts.
Every time the application starts, you have to initialize the SequenceGenerator class with sequence.nextval.
The only downside of this approach is that if your application stops for any reason, you will lose some of the sequence values and there will be gaps in your ids. But that should not be a logical problem as long as you don't have any business logic attached to the id values.
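The database side of this idea is just a larger increment; a minimal sketch, assuming a sequence named audit_id_seq:

CREATE SEQUENCE audit_id_seq START WITH 1 INCREMENT BY 100;
-- one NEXTVAL call now reserves a block of 100 ids;
-- the application hands out NEXTVAL, NEXTVAL + 1, ..., NEXTVAL + 99 from memory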
This article says:
Unlike identity, the next number for the column value will be retrieved from memory rather than from the disk – this makes Sequence significantly faster than Identity
Does it mean that ID comes from disk in case of identity? If yes, then which disk and how?
Using sequence, I can see in the log an extra select query to the DB when inserting a new record. But I didn't find that extra select query in the log in the case of identity.
Then how does sequence become faster than identity?
Strategy used by sequence:
Before inserting a new row, ask the database for the next sequence value, then insert this row with the returned sequence value as ID.
Strategy used by identity:
Insert a row without specifying a value for the ID. After inserting the row, ask the database for the last generated ID.
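In SQL terms, and with table and sequence names assumed (the exact statements Hibernate issues vary by dialect), the two flows look roughly like this:

-- Sequence strategy: the id is known before the row is written
select my_sequence.nextval from dual;
insert into my_table (id, name) values (?, ?);

-- Identity strategy: the id only exists after the row is written
insert into my_table (name) values (?);
-- the driver then fetches the generated key (e.g. via JDBC getGeneratedKeys)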
The number of queries is thus the same in both cases. But Hibernate uses by default a strategy that is more efficient for the sequence generator. In fact, when it asks for the next sequence value, it keeps the 50 (that's the default, IIRC, and it's configurable) next values in memory, and uses these 50 values for the next 50 inserts. Only after 50 inserts does it go to the database to get the next 50 values. This tremendously reduces the number of SQL queries needed for automatic ID generation.
The identity strategy doesn't allow for such an optimization.
The IDENTITY generator will always require a database hit for fetching the primary key value without waiting for the flush to synchronize the current entity state transitions with the database.
So the IDENTITY generator doesn't play well with Hibernate write-behind first level cache strategy, therefore JDBC batching is disabled for the IDENTITY generator.
The sequence generator can benefit from database value preallocation and you can even employ a hi/lo optimization strategy.
In my opinion, the best generators are the pooled and pooled-lo sequence generators. These generators combine the batch-friendly sequence generator with a client-side value generation optimization that's compatible with other DB clients that may insert rows without knowing anything about our generation strategy.
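With a pooled or pooled-lo optimizer, the sequence itself is typically created with an increment equal to the allocation size, so other clients calling NEXTVAL directly still get non-conflicting ids. A sketch, with name and size assumed:

CREATE SEQUENCE order_id_seq START WITH 1 INCREMENT BY 50;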
Anyway, you should never choose the TABLE generator because it performs really badly.
Though I'm personally new to Hibernate, from what I can recall, using identity basically means that Hibernate will check what the next possible id value from your DB is and keep a value for it.
For sequence, you basically tell Hibernate to generate the next value based on a particular sequence you provide. So it has to actually calculate the next id by looking at the next possible value of that sequence, and hence the extra query is fired.
Maybe this will answer your question:
Unlike identity column values, which are generated when rows are inserted, an application can obtain the next sequence number before inserting the row by calling the NEXT VALUE FOR function. The sequence number is allocated when NEXT VALUE FOR is called even if the number is never inserted into a table. The NEXT VALUE FOR function can be used as the default value for a column in a table definition. Use sp_sequence_get_range to get a range of multiple sequence numbers at once.
You can find the details here.
Identity doesn't need that extra select query because identity is table dependent while a sequence is independent of the table. Because of this, we can get the sequence value even before creating a row (when you do session.save(T entity), the sequence value is generated even before you commit the transaction).
Sequence:
you create or update entities -> each time you save an entity -> Hibernate gets the next sequence value -> your program finishes its work without an exception or rollback -> you commit the transaction -> Hibernate inserts all the complete entities.
Identity: when the transaction is committed, the incomplete entity is inserted (the id must be fetched from the identity column). So the insert flow with a sequence is definitely slower, but the advantage is that if you cancel the insert, the counter doesn't increase.
I have a couple instances of a J2EE app running in a single WebLogic cluster.
At some point, these apps do a MERGE to insert or update a record into the back-end Oracle database. The MERGE checks to see if a row with a specified primary key is there or not. If it's there, update. If not, insert.
Now suppose two app instances want to insert or update a row with primary key = 100, and suppose the row doesn't exist. During the "check" stage of the merge, they both see that the row is not there, so both of them attempt to insert. Then I get a unique key constraint violation.
My question is this: Is there an atomic MERGE in Oracle? I'm looking for something that has a similar effect to INSERT ... FOR UPDATE in PL/SQL except that I can only execute SQL from my apps.
EDIT: I was unclear. I AM using the MERGE statement while this error still occurs. The thing is, only the "modifying" part is atomic, not the whole merge.
This is not a problem with MERGE as such. Rather the issue lies in your application. Consider this stored procedure:
create or replace procedure upsert_t23
( p_id in t23.id%type
, p_name in t23.name%type )
is
cursor c is
select null
from t23
where id = p_id;
dummy varchar2(1);
begin
open c;
fetch c into dummy;
if c%notfound then
insert into t23
values (p_id, p_name);
else
update t23
set name = p_name
where id = p_id;
end if;
end;
So, this is the PL/SQL equivalent of a MERGE on T23. What happens if two sessions call it simultaneously?
SSN1> exec upsert_t23(100, 'FOX IN SOCKS')
SSN2> exec upsert_t23(100, 'MR KNOX')
SSN1 gets there first, finds no matching record and inserts a record. SSN2 gets there second but before SSN1 commits, finds no record, inserts a record and hangs because SSN1 has a lock on the unique index node for 100. When SSN1 commits SSN2 will hurl a DUP_VAL_ON_INDEX violation.
The MERGE statement works in exactly the same way. Both sessions will check on (t23.id = 100), not find it and go down the INSERT branch. The first session will succeed and the second will hurl ORA-00001.
One way to handle this is to introduce pessimistic locking. At the start of the UPSERT_T23 procedure we lock the table:
...
lock table t23 in row share mode nowait;
open c;
...
Now, SSN1 arrives, grabs the lock and proceeds as before. When SSN2 arrives it can't get the lock, so it fails immediately. Which is frustrating for the second user but at least they are not hanging, plus they know someone else is working on the same record.
There is no syntax for INSERT which is equivalent to SELECT ... FOR UPDATE, because there is nothing to select. And so there is no such syntax for MERGE either. What you need to do is include the LOCK TABLE statement in the program unit which issues the MERGE. Whether this is possible for you depends on the framework you're using.
The MERGE statement in the second session cannot "see" the insert that the first session did until that session commits. If you reduce the size of the transactions, the probability that this will occur is reduced.
Alternatively, can you sort or partition your data so that all records with a given primary key are handled by the same session? A simple function like "primary key mod N" should distribute them evenly across N sessions.
By the way, if two records have the same primary key, the second will overwrite the first. That sounds a little odd.
Yes, and it's called.... MERGE
EDIT: The only way to get this watertight is to insert, catch the dup_val_on_index exception, and handle it appropriately (update, or perhaps insert another record). This can easily be done with PL/SQL, but you can't use that.
You're left looking for workarounds. Can you catch the dup_val_on_index equivalent in Java and issue an extra UPDATE?
In pseudo-code:
try {
// MERGE
}
catch (dup_val_on_index) {
// UPDATE
}
I am surprised that MERGE would behave the way you describe, but I haven't used it sufficiently to say whether it should or not.
In any case, you might have the transactions that wish to execute the merge set their isolation level to SERIALIZABLE. I think that may solve your issue.
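In Oracle that can be set per session or per transaction, for example:

ALTER SESSION SET ISOLATION_LEVEL = SERIALIZABLE;
-- or, for a single transaction:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;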