I created a sequence called markush_seq on a PostgreSQL 10.7 database.
I read from the sequence using
select nextval('markush_seq')
from a Java web service:
When I run the web service in Eclipse (using Java 1.8.161) or call the sequence directly from SQL Developer, it works fine and the sequence increments by 1 each time, e.g.:
http://localhost:8086/wipdbws/read-markush-seq
21767823690
21767823691
21767823692
However, when I run the web service on AWS (which uses Java 1.8.252) and read from the sequence using:
https://aws-location/wipdbws/read-markush-seq
I get the sequence numbers returned as, e.g.:
21767823692
21767823702
21767823693
21767823703
21767823694
21767823704
The sequence in AWS appears to interleave two incrementing series, 10 apart.
It's the same Java code; the only things that have changed are:
The location of the web service:
a. AWS - US West
b. Eclipse - London
The Java version:
a. 1.8.161 in London
b. 1.8.252 in US West
The sequence details are:
SELECT * FROM information_schema.sequences
where sequence_name='markush_seq';
select * from pg_sequences where sequencename='markush_seq';
Any suggestions appreciated.
This is likely due to multiple sessions accessing the sequence, combined with the sequence's cache setting.
The PostgreSQL documentation says:
although multiple sessions are guaranteed to allocate distinct
sequence values, the values might be generated out of sequence when
all the sessions are considered. For example, with a cache setting of
10, session A might reserve values 1..10 and return nextval=1, then
session B might reserve values 11..20 and return nextval=11 before
session A has generated nextval=2. Thus, with a cache setting of one
it is safe to assume that nextval values are generated sequentially;
with a cache setting greater than one you should only assume that the
nextval values are all distinct, not that they are generated purely
sequentially. Also, last_value will reflect the latest value reserved
by any session, whether or not it has yet been returned by nextval.
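The two interleaved series 10 apart are consistent with two pooled sessions on AWS each holding a cache of 10 values. If you need values handed out strictly in order across sessions (at the cost of one sequence round trip per value), you can reduce the cache to 1. A minimal JDBC sketch - the connection URL and credentials are placeholders for your environment:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MarkushSeqCache {
    public static void main(String[] args) throws Exception {
        // URL and credentials are placeholders
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/yourdb", "user", "secret");
             Statement st = conn.createStatement()) {

            // Inspect the current cache setting (pg_sequences exists in PostgreSQL 10+)
            try (ResultSet rs = st.executeQuery(
                    "SELECT cache_size FROM pg_sequences WHERE sequencename = 'markush_seq'")) {
                while (rs.next()) {
                    System.out.println("cache_size = " + rs.getLong("cache_size"));
                }
            }

            // Hand out values one at a time so sessions don't pre-allocate blocks
            st.execute("ALTER SEQUENCE markush_seq CACHE 1");
        }
    }
}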
In my Spring Boot application, I noticed a strange issue when inserting new rows.
My ids are generated by a sequence, but after I restart the application the ids start from 21.
Example:
First launch: I insert 3 rows; the ids generated by the sequence are 1, 2, 3.
Second launch (after a restart): I insert 3 rows and the ids are generated from 21, so the ids are 21, 22, ...
Every restart increases the id by 20 - the jump is always 20.
See my database table (ids 1, 2, then 21 after the restart).
My JPA entity:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
I tried some Stack Overflow solutions, but they don't work.
I tried this, which also doesn't work:
spring.jpa.properties.hibernate.id.new_generator_mappings=false
I want to insert rows with sequential ids like 1, 2, 3, 4 - not like 1, 2, 21, 22. How can I resolve this problem?
Although I think the question comments already provide all the information necessary to understand the problem, please let me try to explain some things and fix some inaccuracies.
According to your source code you are using the IDENTITY id generation strategy:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
You are using an Oracle database, and this is very relevant information for the question.
Support for IDENTITY columns was introduced in Oracle 12c, probably Release 1, and in Hibernate - I would say in version 5.1, although here on SO it is indicated that you need at least 5.3.
Either way, IDENTITY columns in Oracle are implemented with database SEQUENCEs: i.e., for every IDENTITY column a corresponding sequence is created. As you can read in the Oracle documentation, this explains why, among other things, all the options for creating sequences can be applied to the IDENTITY column definition, like min and max ranges, cache size, etc.
By default a sequence in Oracle has a cache size of 20, as indicated in a tiny note in the aforementioned Oracle documentation:
Note: When you create an identity column, Oracle recommends that you
specify the CACHE clause with a value higher than the default of 20 to
enhance performance.
And this default cache size is what explains the non-consecutive numbers in your id values.
This behavior is not exclusive to Hibernate: just issue a simple JDBC insert statement or SQL commands with any suitable tool and you will experience the same thing.
To solve the issue, create your table indicating NOCACHE for your IDENTITY column:
CREATE TABLE your_table (
id NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE,
--...
);
Note that you need to use NOCACHE, not CACHE 0, as was indicated in the question comments and in a previous version of other answers; the latter is an error, because when the CACHE option is specified its value must be at least 2.
You could probably also modify the column without recreating the whole table:
ALTER TABLE your_table MODIFY (ID GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE);
Having said all that, please be aware that the cache mechanism is in fact an optimization, not a drawback: in the end (and this is just my opinion) these ids are only non-natural, assigned IDs, and in the general use case the cache's benefits outweigh its drawbacks.
Please consider reading this great article about IDENTITY columns in Oracle.
The answer suggesting the hilo optimizer could be right, but it requires explicitly using that optimizer in your id field declaration, which does not seem to be the case here.
It is related to the Hi/Lo algorithm that Hibernate uses for incrementing the sequence value. Read more in this example: https://www.baeldung.com/hi-lo-algorithm-hibernate.
This is an optimization used by Hibernate: it consumes a range of values from the DB sequence into a pool (in the Java runtime) and uses them while executing INSERT statements on the table. If this optimization is turned off by setting allocationSize=1, then the desired behavior (no gaps in ids) is possible (to a certain degree, not always), but at the price of two round trips to the DB for each INSERT.
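For reference, a minimal sketch of such a mapping (the generator name my_seq_gen and sequence name MY_SEQ are assumptions, not names from the question):

@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "my_seq_gen")
// allocationSize = 1 disables block allocation: one sequence round trip per id
@SequenceGenerator(name = "my_seq_gen", sequenceName = "MY_SEQ", allocationSize = 1)
@Column(unique = true, nullable = false)
private Long id;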
The examples below give an idea of what is going on at the upper level of abstraction (the internal implementation is more complex, but we don't care about that here).
Scenario: user makes 21 inserts during some period of time
Example 1 (current behavior, allocationSize=20)
#1 insert: // first cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-1
- execute INSERT // call DB
#2-#20 insert:
- need next MY_SEQ value,
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-i
- execute INSERT // call DB
#21 insert: // new cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=19
- execute INSERT // call DB
Example 2 (behavior with allocationSize=1)
#1-21 insert:
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 1 value from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=0
- execute INSERT // call DB
Example 1: 23 total calls to the DB
Example 2: 42 total calls to the DB
Manually declaring the sequence in the database will not help in this case, because, for instance, in this statement
CREATE SEQUENCE ABC START WITH 1 INCREMENT BY 1 CYCLE NOCACHE;
we only control the cache used inside the DB runtime, which is not visible to Hibernate. That cache affects sequence gaps when the DB is stopped and started again, which is not the case here.
When Hibernate consumes values from the sequence, the state of the sequence changes on the DB side. We can think of it as booking hotel rooms: a company (Hibernate) books 20 rooms for a conference in a hotel (the DB), but only 2 participants arrive. The remaining 18 rooms stay empty and cannot be used by other guests; in this case the "booking period" is forever.
More details on how to configure Hibernate to work with sequences are here:
https://ntsim.uk/posts/how-to-use-hibernate-identifier-sequence-generators-properly
Here is a short answer for an older version of Hibernate; it still has relevant ideas:
https://stackoverflow.com/a/5346701/2774914
We have a use case where we keep a monotonically increasing key signifying the current version of a customer's data. The system uses it to figure out the most recent version and to resolve conflicts for third-party callers: if there is a delay between reading the data and processing it, they may see multiple versions.
CUSTOMER_RESOURCE_ID CURRENT_VERSION
132323 1234
If something changes for this resource, we increment the version from 1234 to 1235 (it is totally fine even if we increase it to 1300, as long as it never goes down). For this, we need to first read the value and then update it.
The other alternative is to use the DB's timestamp and keep updating the version with it, since it is always increasing. Since this is a single system, clock skew can only happen when we change the DB. Also, we are not overly concerned about multiple threads updating the data within the timestamp's smallest granularity, because we have a separate lock under which only one thread updates a resource at a time.
I was wondering if we could use the database's system timestamp to replace select-and-increment with a plain update.
Is there any concern with this approach? I assume it puts less load on the database, but I don't know how much we save here.
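To make the "just-update" idea concrete, here is a sketch in plain JDBC - Oracle syntax is assumed, and the epoch-seconds encoding is just one possible choice (note its one-second granularity):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Bump the version in a single statement using the database clock,
// avoiding the read-then-increment round trip (sketch; table/column
// names are taken from the example above)
static void bumpVersion(Connection conn, long resourceId) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE CUSTOMER_RESOURCE " +
            "SET CURRENT_VERSION = ROUND((SYSDATE - DATE '1970-01-01') * 86400) " +
            "WHERE CUSTOMER_RESOURCE_ID = ?")) {
        ps.setLong(1, resourceId);
        ps.executeUpdate();
    }
}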
Many approaches can be discussed. In my opinion:
-- You can use an incrementing sequence (even as a default column value) instead of the database timestamp; millisecond granularity may be a problem for you.
-- In addition, if your table is not very big and your resources are sufficient, you can add an is_last_version column that is updated on every insert, so you can query a customer's last version efficiently (i.e. select * from customers where customer_id = 123 and is_last_version = 1, getting rid of the cost of ordering). This way you know which row is the latest version, but each insert will take longer; you should test it.
Example:
CUSTOMER_RESOURCE:
CUSTOMER_RESOURCE_ID   CURRENT_VERSION_ID   IS_LAST_VERSION   INSERT_TIME
                       (default seq)                          (default sysdate, optional)
132323                 1                    1
132324                 2                    1
132325                 3                    1
132326                 4                    0
132327                 5                    0
132328                 6                    1
132326                 7                    0
132329                 8                    1
132326                 9                    1
132327                 10                   1
When inserting a new version (e.g. for customer 132326):
update CUSTOMER_RESOURCE
SET IS_LAST_VERSION = 0
where CUSTOMER_RESOURCE_ID = 132326;
insert into CUSTOMER_RESOURCE(CUSTOMER_RESOURCE_ID,IS_LAST_VERSION) VALUES (132326,1);
COMMIT;
When selecting the last version of a customer:
select * from CUSTOMER_RESOURCE where CUSTOMER_RESOURCE_ID = 132326 and IS_LAST_VERSION = 1;
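For completeness, here is the insert path above as one transaction in plain JDBC - a sketch, assuming a java.sql.Connection and the table from the example:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Clear the old flag and insert the new version atomically:
// both statements commit (or roll back) together
static void insertNewVersion(Connection conn, long resourceId) throws SQLException {
    conn.setAutoCommit(false);
    try (PreparedStatement clear = conn.prepareStatement(
             "UPDATE CUSTOMER_RESOURCE SET IS_LAST_VERSION = 0 " +
             "WHERE CUSTOMER_RESOURCE_ID = ?");
         PreparedStatement insert = conn.prepareStatement(
             "INSERT INTO CUSTOMER_RESOURCE (CUSTOMER_RESOURCE_ID, IS_LAST_VERSION) " +
             "VALUES (?, 1)")) {
        clear.setLong(1, resourceId);
        clear.executeUpdate();
        insert.setLong(1, resourceId);
        insert.executeUpdate();
        conn.commit();
    } catch (SQLException e) {
        conn.rollback();
        throw e;
    }
}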
We use batch statements when inserting as follows:
BatchBindStep batch = create.batch(create
.insertInto(PERSON, ID, NAME)
.values((Integer) null, null));
for (Person p : peopleToInsert) {
batch.bind(p.getId(), p.getName());
}
batch.execute();
This has worked well in the past when inserting several thousand objects. However, it raises a few questions:
Is there an upper limit to the number of .bind() calls for a batch?
If so, what does the limit depend on?
It seems to be possible to call .bind() again after having executed .execute(). Will .execute() clear previously bound values?
To clarify the last question: after the following code has executed...
BatchBindStep batch = create.batch(create
.insertInto(PERSON, ID, NAME)
.values((Integer) null, null));
batch.bind(1, "A");
batch.bind(2, "B");
batch.execute();
batch.bind(3, "C");
batch.bind(4, "D");
batch.execute();
which result should I expect?
a)          b)
ID NAME     ID NAME
-------     -------
1  A        1  A
2  B        2  B
3  C        1  A
4  D        2  B
            3  C
            4  D
Unfortunately, neither the Javadoc nor the manual discusses this particular usage pattern.
(I am asking because if I .execute() every 1000 binds or so to stay under said limit, I need to know whether I can reuse the batch object across several .execute() calls.)
This answer is valid as of jOOQ 3.7
Is there an upper limit to the number of .bind() calls for a batch?
Not in jOOQ, but your JDBC driver / database server might have such limits.
If so, what does the limit depend on?
Several things:
jOOQ keeps an intermediate buffer for all of the bound variables and binds them to a JDBC batch statement all at once. So, your client memory might also impose an upper limit. But jOOQ doesn't have any limits per se.
Your JDBC driver might impose such limits (see also this article on how jOOQ handles limits in non-batch statements). Known limits include:
SQLite: 999 bind variables per statement
Ingres 10.1.0: 1024 bind variables per statement
Sybase ASE 15.5: 2000 bind variables per statement
SQL Server 2008: 2100 bind variables per statement
I'm not aware of any such limits in Oracle, but there probably are.
Batch size is not the only thing you should tune when inserting large amounts of data. There are also:
Bulk size, i.e. the number of rows inserted per statement
Batch size, i.e. the number of statements per batch sent to the server
Commit size, i.e. the number of batches committed in a single transaction
Tuning your insertion boils down to tuning all of the above. jOOQ ships with a dedicated importing API where you can tune each of these: http://www.jooq.org/doc/latest/manual/sql-execution/importing
You should also consider bypassing SQL and inserting into a loader table, e.g. using Oracle's SQL*Loader. Once you've inserted all the data, you can move it to the "real table" using PL/SQL's FORALL statement, which is PL/SQL's version of JDBC's batch statement. This approach will outperform anything you do with JDBC.
It seems to be possible to call .bind() again after having executed .execute(). Will .execute() clear previously bound values?
Currently, execute() will not clear the bind values. You'll need to create a new statement instead. This is unlikely to change, as future jOOQ versions will favour immutability in its API design.
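So, for the chunking use case from your last paragraph, the safe pattern is to build a fresh batch per chunk. A sketch, reusing the names from your question (the chunk size of 1000 is an arbitrary assumption, and peopleToInsert is assumed to be a List):

// Create a new BatchBindStep per chunk, since execute() does not
// clear previously bound values
final int CHUNK = 1000;
for (int from = 0; from < peopleToInsert.size(); from += CHUNK) {
    int to = Math.min(from + CHUNK, peopleToInsert.size());
    BatchBindStep batch = create.batch(create
        .insertInto(PERSON, ID, NAME)
        .values((Integer) null, null));
    for (Person p : peopleToInsert.subList(from, to)) {
        batch.bind(p.getId(), p.getName());
    }
    batch.execute();
}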
Say there is a Java EE application sitting on this Oracle database and, due to concurrency concerns (there may be more than one instance working on the same DB), we have to keep the Oracle sequence incrementing by 1. However, there are situations where 5000+ records are inserted into the same table, and I wonder what the performance difference would be between:
jdbcTemplate.update("INSERT INTO DATA_TABLE(ID, DATA...) VALUES(global_seq.nextval, data...)");
-- for each insertion
and
letting Java allocate 100 ids (matching a sequence increment of 100), so that:
while (some_jpa_seq_variable < 100) {
    jdbcTemplate.update("INSERT INTO DATA_TABLE(ID, DATA...) VALUES(" + ++some_jpa_seq_variable + ", " + data + "...)");
}
In the first case the Java EE application relies on Oracle to generate the next id from the sequence, incrementing by 1 (as defined in global_seq); in the second case Java rolls the sequence forward on behalf of Oracle.
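For what it's worth, a sketch of the second approach - valid only under the assumption that global_seq were redefined with INCREMENT BY 100, so each fetched value reserves the block it starts (DataRow and rows are hypothetical names):

// Reserve a block of 100 ids with one round trip; ids blockStart..blockStart+99
// belong to this instance, so concurrent instances cannot collide
long blockStart = jdbcTemplate.queryForObject(
        "SELECT global_seq.NEXTVAL FROM dual", Long.class);
int offset = 0;
for (DataRow row : rows) {                     // DataRow/rows are hypothetical
    jdbcTemplate.update(
            "INSERT INTO DATA_TABLE (ID, DATA) VALUES (?, ?)",
            blockStart + offset++, row.getData());
    if (offset == 100) {                       // block exhausted: reserve the next one
        blockStart = jdbcTemplate.queryForObject(
                "SELECT global_seq.NEXTVAL FROM dual", Long.class);
        offset = 0;
    }
}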
For a particular business scenario I need to set a field on an entity (not the PK) to a number from a sequence (the value has to be between a min and a max).
I defined the sequence like this :
CREATE SEQUENCE MySequence
MINVALUE 65536
MAXVALUE 4294967296
START WITH 65536
INCREMENT BY 1
CYCLE
NOCACHE
ORDER;
In Java code I retrieve the number from the sequence like this:
select mySequence.nextval from dual
My question is:
If I call "select mySequence.nextval from dual" in one transaction and at the same time the same call is made in another transaction (parallel requests), is it guaranteed that the values returned by the sequence are different?
Isn't it possible to read the uncommitted value from the first transaction?
Because, say I had not used a sequence but a plain table in which I incremented the value myself: then transaction 2 would have been able to read the same value under the default READ COMMITTED isolation.
The answer is no, that cannot happen.
Oracle guarantees that the numbers generated by a sequence are distinct, even with parallel requests, in a RAC environment, or with rollbacks and commits mixed in.
Sequences have nothing to do with transactions.
See the docs:
Use the CREATE SEQUENCE statement to create a sequence, which is a
database object from which multiple users may generate unique
integers. You can use sequences to automatically generate primary key
values.
When a sequence number is generated, the sequence is incremented,
independent of the transaction committing or rolling back. If two
users concurrently increment the same sequence, then the sequence
numbers each user acquires may have gaps, because sequence numbers are
being generated by the other user. One user can never acquire the
sequence number generated by another user. After a sequence value is
generated by one user, that user can continue to access that value
regardless of whether the sequence is incremented by another user.
Sequence numbers are generated independently of tables, so the same
sequence can be used for one or for multiple tables. It is possible
that individual sequence numbers will appear to be skipped, because
they were generated and used in a transaction that ultimately rolled
back. Additionally, a single user may not realize that other users are
drawing from the same sequence.
Oracle guarantees sequence numbers will be different. Even if your transaction is rolled back, the sequence value is 'used' and is not reissued to another query.
Edit: adding additional information after requirements around "no gaps" were stated in the comments by Cris.
If your requirements call for a sequence of numbers without gaps, then Oracle sequences will probably not be a suitable solution: there will be gaps when transactions roll back, when the database restarts, and in any number of other scenarios.
Sequences are primarily intended as a high performance generation tool for unique numbers (e.g. primary keys) without regard to gaps and transaction context constraints.
If your design / business / audit requirements need to account for every number, then you would instead need to design a solution that uses a predetermined number within the transaction context. This can be tricky and prone to performance / locking issues in a multi-threaded environment. It would be better to try to redefine your requirement so that gaps don't matter.
sequence.nextval never returns the same value to concurrent requests (until the sequence cycles). Perhaps you should check the following URL:
http://docs.oracle.com/cd/B19306_01/server.102/b14220/schema.htm#sthref883
Unfortunately you have to build your own wheel here: a transactional sequence. It is rather simple. Create a table with columns like sequence_name varchar2, value number, min_value number, max_value number, need_cycle char. Then issue 'select value from your sequence table for update wait' (or nowait, depending on your scenario), then issue 'update ... set value = <value from the previous step> + 1 where sequence_name = <the name of your sequence>', and finally issue the commit from the client side. That's it.
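A JDBC sketch of that recipe - the table name my_sequences and its columns are hypothetical, following the description above, and the caller is expected to commit together with its business work, which is what keeps the sequence gap-free:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Returns the next value, leaving the row locked until the caller
// commits or rolls back (assumes conn.setAutoCommit(false) was called)
static long nextTransactionalValue(Connection conn, String seqName) throws SQLException {
    long current;
    try (PreparedStatement lock = conn.prepareStatement(
             "SELECT value FROM my_sequences WHERE sequence_name = ? FOR UPDATE WAIT 5")) {
        lock.setString(1, seqName);
        try (ResultSet rs = lock.executeQuery()) {
            rs.next();
            current = rs.getLong(1);      // row is now locked against other sessions
        }
    }
    try (PreparedStatement bump = conn.prepareStatement(
             "UPDATE my_sequences SET value = value + 1 WHERE sequence_name = ?")) {
        bump.setString(1, seqName);
        bump.executeUpdate();             // a caller rollback undoes this: no gap
    }
    return current;
}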