Example:
suppose that entity E has its id generated by sequence e_seq
suppose that the sequence value is initially 0 on the database, and the increment is configured to be 50
when Hibernate starts, it fetches the next value of the sequence (i.e. 0 + 50 = 50) and keeps an internal cache of the available values (i.e. those in the interval 0-50)
as long as the cache has available values, no further requests are made to the DBMS for the next sequence value
only after you create 50 instances of entity E are the 50 ids consumed, and Hibernate asks the DBMS for the next value.
suppose that the Hibernate cache still has 50 ids available
suppose that a low-level procedure (such as a data migration) inserts, say, 100 entities of type E into the database using plain SQL statements (not the Hibernate APIs), with ids from 1 to 100, and then resets the sequence value to 100
if the application then tries to create a new entity through its APIs, it will use an id taken from the Hibernate cache which has already been used by the low-level procedure, causing a duplicate id exception
I therefore need to find a way to tell Hibernate to "reset its id cache", or in other words to "force Hibernate to contact the DBMS again to get the current sequence value".
a low-level procedure [...] inserts let's say 100 entities [...] with ids from 1 to 100
Why is that low-level procedure generating the IDs on its own? Why is it NOT using the sequence?
The whole point of Hibernate's pooled and pooled-lo ID-generating mechanisms, which you appear to be using (and definitely should be, if you're not), is to be able to safely cache IDs even in the face of external processes making use of the sequence outside of Hibernate's control.
If that external process used the sequence too, your problem would disappear, since none of Hibernate's cached values would get used; and the next batch of cached values would start from whatever sequence value was last generated by the external process, avoiding conflicts:
Hibernate caches values 0-49. sequence.NEXTVAL would be 50.
External process inserts 100 rows. sequence.NEXTVAL would be 5050.
Hibernate ends up using all cached values, and asks for the next sequence value.
Hibernate caches values 5050-5099. sequence.NEXTVAL would be 5100.
Etc.
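That walkthrough can be condensed into a toy simulation (plain Java, no Hibernate; the class and method names are invented for illustration, and the pooled-lo bookkeeping is deliberately simplified):

```java
// Toy model: a sequence with INCREMENT BY 50, shared by Hibernate's pooled-lo
// style cache and an external process. No id collisions occur.
class PooledLoDemo {
    static final int INCREMENT = 50;
    static long seq = 0;                  // value sequence.NEXTVAL will return next

    static long nextval() {               // the database side of sequence.NEXTVAL
        long v = seq;
        seq += INCREMENT;
        return v;
    }

    // Hibernate side: the cached block is [lo, lo + INCREMENT - 1]
    static long lo;
    static int used = INCREMENT;          // start with an exhausted pool

    static long hibernateNextId() {
        if (used == INCREMENT) {          // pool exhausted: one round trip to the DB
            lo = nextval();
            used = 0;
        }
        return lo + used++;
    }

    // an external process that also calls NEXTVAL for every row it inserts
    static long externalInsert() {
        return nextval();
    }
}
```

Draining the 50 cached ids (0-49), letting the external process insert 100 rows, and then asking Hibernate for the next id yields 5050, exactly as in the walkthrough above.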
The solution to your issue, assuming you're using Hibernate's pooled(-lo) ID strategy, is not to disable or reset Hibernate's cache and hinder your application performance; the solution is to make any external processes use NEXTVAL() too to generate the appropriate IDs for the entities when inserting data into that table, instead of providing their own values.
Concerns:
"But then I would end up with gaps in my IDs!"
So what?
There's no problem whatsoever with your ID column having gaps. Your goal here is avoiding ID conflicts and ensuring that your application does not make two trips to the DB (one for the sequence, one for the actual insert) every time you create an entity. If not having a neat, perfectly sequential set of IDs is the price to pay for that, so be it! Quite a deal, if you ask me ;)
"But then entities that were created later using Hibernate's cached values would have a lower ID than those created by the external process before!"
So what?
The primary goal of having an ID column is to be able to uniquely identify a row via a single value. Discerning order of creation should not be a factor in how you manage your ID values; a timestamp column is better suited for that.
"But the ID value would grow up too fast! I just inserted 50 rows and it's already by the thousands! I'll run out of numbers!"
Ok, legitimate concern here. But if you're using sequences, chances are you're using either Oracle or PostgreSQL, maybe SQL Server. Am I right?
Well, PostgreSQL's MAXVALUE for a bigint sequence is 9223372036854775807, and the same goes for SQL Server. If your process inserted a new row every millisecond non-stop, it would still take roughly 292 million years to reach the end of the sequence. Oracle's MAXVALUE for a sequence is 999999999999999999999999999, several orders of magnitude greater still.
So... As long as the datatype of your ID column and sequence is aptly chosen, you're safe in that regard.
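As a quick order-of-magnitude check of that claim (assuming a constant rate of one insert per millisecond against the bigint maximum; the class name is invented):

```java
// Back-of-the-envelope: how many years until a bigint sequence is exhausted
// at one insert per millisecond?
class SequenceExhaustion {
    static final long BIGINT_MAX = 9223372036854775807L; // PostgreSQL / SQL Server

    static long yearsAtOneInsertPerMillisecond() {
        long insertsPerYear = 1000L * 60 * 60 * 24 * 365; // ms in a 365-day year
        return BIGINT_MAX / insertsPerYear;               // ~292 million years
    }
}
```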
Have you tried clearing the current session and creating a new one?
This forces Hibernate to re-query the database for the current sequence value.
In other words, you can use the methods Session.flush() and Session.clear():
Session session = sessionFactory.openSession();
Transaction transaction = session.beginTransaction();
// Perform some operations that use the id cache
session.flush();
session.clear();
// Perform some more operations that use the id cache
transaction.commit();
session.close();
Or you could use EntityManager.refresh(), which will refresh the state of the instance from the database and, in the process, update the internal cache with the current sequence value:
EntityManager em = entityManagerFactory.createEntityManager();
em.getTransaction().begin();
// Perform some operations that use the id cache
em.refresh(entity);
// Perform some more operations that use the id cache
em.getTransaction().commit();
em.close();
Maybe this link will help:
https://www.baeldung.com/hibernate-identifiers#3-sequence-generation
@Entity
public class User {

    @Id
    @GeneratedValue(generator = "sequence-generator")
    @GenericGenerator(
        name = "sequence-generator",
        strategy = "org.hibernate.id.enhanced.SequenceStyleGenerator",
        parameters = {
            @Parameter(name = "sequence_name", value = "user_sequence"),
            @Parameter(name = "initial_value", value = "4"),
            @Parameter(name = "increment_size", value = "1")
        }
    )
    private long userId;
    // ...
}
Related
In my Spring Boot application, I noticed a strange issue when inserting new rows.
My ids are generated by a sequence, but after I restart the application the generation starts from 21.
Example:
First launch, I insert 3 rows - the ids generated by the sequence are 1, 2, 3
After a restart (second launch), I insert 3 rows and the ids are generated from 21, so they are 21, 22, ...
On every restart the starting value increases by 20 - the pattern is always 20.
Refer to my database table (1, 2, then 21 after restart).
My JPA entity
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
I tried some Stack Overflow solutions, but they are not working.
I tried this, and it did not work either:
spring.jpa.properties.hibernate.id.new_generator_mappings=false
I want to insert rows with sequential ids like 1, 2, 3, 4 - not like 1, 2, 21, 22. How can I resolve this problem?
Although I think the question comments already provide all the information necessary to understand the problem, please let me try to explain some things and fix some inaccuracies.
According to your source code you are using the IDENTITY id generation strategy:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
You are using an Oracle database, and this is very relevant information for the question.
Support for IDENTITY columns was introduced in Oracle 12c (probably Release 1) and, on the Hibernate side, in version 5.1 I would say - although here on SO it is indicated that you need at least 5.3.
Either way, IDENTITY columns in Oracle are supported by the use of database SEQUENCEs: i.e., for every IDENTITY column a corresponding sequence is created. As you can read in the Oracle documentation, this explains why, among other things, all the options for creating sequences can be applied to the IDENTITY column definition, like min and max ranges, cache size, etc.
By default a sequence in Oracle has a cache size of 20 as indicated in a tiny note in the aforementioned Oracle documentation:
Note: When you create an identity column, Oracle recommends that you
specify the CACHE clause with a value higher than the default of 20 to
enhance performance.
And this default cache size is the reason you are getting these non-consecutive numbers in your id values.
This behavior is not exclusive to Hibernate: just issue simple JDBC insert statements or SQL commands with any suitable tool and you will observe the same behavior.
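The observed pattern (ids 1, 2, 3, then 21 after a restart) can be reproduced with a toy model of a cached sequence (invented names; not Oracle's actual implementation, just the bookkeeping idea):

```java
// Toy model: a sequence with CACHE 20. Cached values live in memory; only the
// high-water mark survives a restart, so unused cached values become gaps.
class IdentityCacheDemo {
    static final int CACHE = 20;
    static long lastAllocated = 0;  // high-water mark persisted on disk
    static long next = 1;           // next in-memory cached value
    static long cachedUpTo = 0;     // top of the in-memory cache

    static long nextval() {
        if (next > cachedUpTo) {               // cache exhausted: allocate a new batch
            next = lastAllocated + 1;
            cachedUpTo = lastAllocated + CACHE;
            lastAllocated = cachedUpTo;        // new high-water mark hits the disk
        }
        return next++;
    }

    static void restartDatabase() {            // unused cached values are lost
        next = lastAllocated + 1;
        cachedUpTo = lastAllocated;
    }
}
```

Three calls to nextval() return 1, 2, 3; after restartDatabase() the next call returns 21, matching the question.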
To solve the issue create your table indicating NOCACHE for your IDENTITY column:
CREATE TABLE your_table (
id NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE,
--...
)
Note that you need to use NOCACHE, not CACHE 0 (as indicated in the question comments and in a previous version of other answers): CACHE 0 is an error, because the value of the CACHE option must be at least 2.
You could probably also modify your column without recreating the whole table:
ALTER TABLE your_table MODIFY (ID GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE);
Having said all that, please be aware that the cache mechanism is in fact an optimization, not a drawback: in the end (and this is just my opinion) these ids are just non-natural assigned identifiers and, in the general use case, the cache's benefits outweigh its drawbacks.
Please consider reading this great article about IDENTITY columns in Oracle.
The provided answer related to the use of the hilo optimizer could be right, but it requires explicitly configuring that optimizer in your id field declaration, which does not seem to be the case here.
It is related to Hi/Lo algorithm that Hibernate uses for incrementing the sequence value. Read more in this example: https://www.baeldung.com/hi-lo-algorithm-hibernate.
This is an optimization used by Hibernate: it consumes some values from the DB sequence into a pool (in the Java runtime) and uses them while executing the appropriate INSERT statements on the table. If this optimization is turned off by setting allocationSize=1, then the desired behavior (no gaps in ids) is possible (with a certain precision, not always), but at the price of making two requests to the DB for each INSERT.
The examples below give an idea of what is going on at the upper level of abstraction.
(The internal implementation is more complex, but we don't care about that here.)
Scenario: the user makes 21 inserts over some period of time
Example 1 (current behavior allocationSize=20)
#1 insert: // first cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-1
- execute INSERT // call DB
#2-#20 insert:
- need next MY_SEQ value,
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-i
- execute INSERT // call DB
#21 insert: // new cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=19
- execute INSERT // call DB
Example 2 (current behavior allocationSize=1)
#1-21 insert:
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 1 value from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=0
- execute INSERT // call DB
Example #1: total calls to DB: 23
Example #2: total calls to DB: 42
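The two examples above can be condensed into a tiny call-counting sketch (plain Java, no Hibernate; the names are invented, following the prefetch-pool model described above):

```java
// Counts DB round trips for a number of inserts given an allocationSize.
class DbCallCounter {
    static int countDbCalls(int inserts, int allocationSize) {
        int calls = 0;
        int pooled = 0;                   // values left in the prefetch pool
        for (int i = 0; i < inserts; i++) {
            if (pooled == 0) {            // pool empty: one SELECT against the sequence
                calls++;
                pooled = allocationSize;
            }
            pooled--;                     // take an id from the pool
            calls++;                      // the INSERT itself
        }
        return calls;
    }
}
```

countDbCalls(21, 20) gives 23 and countDbCalls(21, 1) gives 42, matching the totals above.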
Manually declaring the sequence in the database will not help in this case because, for instance, in the statement
CREATE SEQUENCE ABC START WITH 1 INCREMENT BY 1 CYCLE NOCACHE;
we only control the cache used in the DB's internal runtime, which is not visible to Hibernate. It affects sequence gaps in situations where the DB is stopped and started again, which is not the case here.
When Hibernate consumes values from the sequence, the state of the sequence changes on the DB side. Think of it as hotel room booking: a company (Hibernate) books 20 rooms for a conference in a hotel (the DB), but only 2 participants arrive. The other 18 rooms stay empty and cannot be used by other guests - and here the "booking period" is forever.
More details on how to configure Hibernate to work with sequences are here:
https://ntsim.uk/posts/how-to-use-hibernate-identifier-sequence-generators-properly
Here is a short answer for an older version of Hibernate; it still contains relevant ideas:
https://stackoverflow.com/a/5346701/2774914
I have an entity with the following id configuration:
public class Publication implements Serializable, Identifiable {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequenceGenerator")
    @SequenceGenerator(name = "sequenceGenerator")
    private Long id;
}
with this generator (Liquibase syntax):
<createSequence incrementBy="10" sequenceName="sequence_generator" startValue="1" cacheSize="10"/>
and a Spring Data JPA Repository:
@Repository
public interface PublicationRepository extends JpaRepository<Publication, Long>, JpaSpecificationExecutor<Publication> {
    // ...
}
Now I have a part of my application where I create about 250 new Publication objects without an id and then call publicationRepository.saveAll(). I get the following exception:
Caused by: javax.persistence.EntityExistsException: A different object with the same identifier value was already associated with the session : [mypackage.Publication#144651]
I debugged with breakpoints and found that this always happens with the 50th object, where the assigned ID is suddenly one that is already present in the set of already-saved objects - so the generator seems to return a wrong value. For collections with fewer than 50 objects, it works fine.
What is also strange: the created objects' IDs increase by 1, while if I execute NEXT VALUE FOR sequence_generator on my database I get values in increments of 10.
Am I using the generator wrong?
You need to sync the SequenceGenerator's allocationSize with your sequence's incrementBy. The default value for allocationSize is 50, which means that after every 50th insert Hibernate will issue select nextval('sequence_name') (or something similar, depending on the dialect) to get the next starting value for IDs.
What happens in your case is that:
for the first insert, Hibernate fetches the next value of the sequence, which is 1. By first insert I mean the first insert after the service/application is (re)started.
then it performs 50 inserts (the default allocationSize) without asking the DB for the next sequence value. The generated IDs will be from 1 to 50.
the 51st insert fetches the next value of the sequence, which is 11 (startValue + incrementBy). You previously inserted an entity with ID=11, which is why it fails to insert the new entity (PK constraint violation).
Also, every time you call select nextval on the sequence, it simply does currentValue + incrementBy. For your sequence, that will be 1, 11, 21, 31, etc.
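Applied to the mapping from the question, keeping the Liquibase sequence at incrementBy="10", the synced declaration would look roughly like this (a sketch; any matching pair of values works, this is not the only valid combination):

```java
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "sequenceGenerator")
@SequenceGenerator(name = "sequenceGenerator", sequenceName = "sequence_generator",
        initialValue = 1, allocationSize = 10)
private Long id;
```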
If you enable SQL logs, you'll see the following:
Calling repository.save(entity) the first time would generate
select nextval('sequence_name');
insert into table_name(...) values (...);
Saving a second entity with repository.save(entity) would generate only
insert into table_name(...) values (...);
After allocationSize inserts you would again see:
select nextval('sequence_name');
insert into table_name(...) values (...);
The advantage of using sequences is to minimize the number of times Hibernate needs to talk to the DB to get the next ID. Depending on your use case, you can adjust the allocationSize to get the best results.
Note: one of the comments suggested using allocationSize = 1, which is very bad and will have a huge impact on performance. For Hibernate it would mean issuing select nextval every time it performs an insert; in other words, you'd have 2 SQL statements for every insert.
Note 2: also keep in mind that you need to keep the initialValue of the SequenceGenerator and the startValue of the sequence in sync as well. allocationSize and initialValue are the two values the sequence generator uses to calculate the next value.
Note 3: it is worth mentioning that, depending on the algorithm used to generate values (hi/lo, pooled, pooled-lo, etc.), gaps may occur between service/application restarts.
Useful resources:
Hibernate pooled and pooled-lo identifier generators - in case you wish to change the algorithm the sequence generator uses to calculate the next value. There may be cases (e.g. a concurrent environment where two services use the same DB sequence to generate values) in which the generated values might collide; for cases like that, one strategy is better than the other.
For business logic we need to update a record with the next value from a sequence defined directly in the database, not in Hibernate (because it is not always applied on inserts/updates).
For this purpose we have a sequence defined in PostgreSQL whose DDL is:
CREATE SEQUENCE public.facturaproveedor_numeracionperiodofiscal_seq
INCREMENT 1 MINVALUE 1
MAXVALUE 9223372036854775807 START 1
CACHE 1;
Then in DAO, when some conditions are true, we get the nextVal via:
Query query = sesion.createSQLQuery("SELECT nextval('facturaproveedor_numeracionperiodofiscal_seq')");
Long siguiente = ((BigInteger) query.uniqueResult()).longValue();
But the assigned values aren't consecutive. Looking at the Hibernate output log, we see four fetches of the sequence in the same transaction:
Hibernate: SELECT nextval('facturaproveedor_numeracionperiodofiscal_seq') as num
Hibernate: SELECT nextval('facturaproveedor_numeracionperiodofiscal_seq') as num
Hibernate: SELECT nextval('facturaproveedor_numeracionperiodofiscal_seq') as num
Hibernate: SELECT nextval('facturaproveedor_numeracionperiodofiscal_seq') as num
Why is this happening? Is this for caching purposes? Is there a way to disable it? Or is this workaround simply not correct?
Hibernate won't usually generate standalone nextval calls that look like that, so I wouldn't be too surprised if it's your application doing the multiple fetches. You'll need to collect more tracing information to be sure.
I think you may have a bigger problem though. If you care about sequences skipping values or leaving holes then you're using the wrong tool for the job, you should be using a counter in a table that you UPDATE, probably UPDATE my_id_generator SET id = id + 1 RETURNING id. This locks out concurrent transactions and also ensures that if the transaction rolls back, the update is undone.
Sequences by contrast operate in parallel, which means that it's impossible to roll back a sequence increment when a transaction rolls back (see the PostgreSQL documentation). So they're generally not suitable for accounting purposes like invoice numbering and other things where you require a gapless sequence.
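The difference can be sketched without a database (plain Java with invented names; the real rollback is of course enforced by the DBMS, not by application code):

```java
// Toy contrast: a sequence keeps its increment even when the transaction rolls
// back (leaving a gap), while a counter-table UPDATE is undone by the rollback.
class GaplessDemo {
    static long sequence = 0;       // models a DB sequence
    static long counterTable = 0;   // models UPDATE my_id_generator SET id = id + 1

    static long sequenceNextval() {
        return ++sequence;          // consumed forever, rollback or not
    }

    static long counterNext(boolean rollback) {
        long v = counterTable + 1;
        if (!rollback) {
            counterTable = v;       // commit keeps the increment
        }
        return v;                   // on rollback the table is left untouched
    }
}
```

A rolled-back "transaction" leaves a hole in the sequence (1, then 3) but not in the counter table, which reissues the same value.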
For other readers who don't have the specific requirement to only sometimes generate a value: don't generate the sequence values manually; use a @GeneratedValue annotation.
In Hibernate 3.6 and newer, you should set hibernate.id.new_generator_mappings in your Hibernate properties.
Assuming you're mapping a generated key from a PostgreSQL SERIAL column, use the mapping:
@Id
@SequenceGenerator(name = "mytable_id_seq", sequenceName = "mytable_id_seq", allocationSize = 1)
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "mytable_id_seq")
If you omit the allocationSize then Hibernate assumes its bizarre default of 50 and does not check that the sequence actually increments by 50, so it tries to insert already-used IDs and fails.
Hibernate/JPA isn't able to automatically create a value for your non-id properties. The @GeneratedValue annotation is only used in conjunction with @Id to create auto-numbering.
Hibernate caches query results.
Notice the "as num" - it looks like Hibernate is altering the SQL as well.
You should be able to tell Hibernate not to cache results with
query.setCacheable(false);
I have a JPA entity with a TableGenerator with allocationSize=25. If I manually update the TableGenerator table and give it a new value for the next ID start range, it won't have an effect until the current range has been exhausted.
For example, if the current TableGenerator table value is 10, I start to get entity IDs 250, 251, 252, etc. At 255, I change the TableGenerator table value to 20. However, the next IDs will still be 256, 257, all the way up to 274; only then would the next ID be 500.
This is natural, of course - but I'm wondering: is there a way to tell Hibernate to, for this moment, ignore the current interval and start assigning IDs from whatever is in the TableGenerator table?
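For what it's worth, the numbers in the question fit a simple block arithmetic (a simplified reading of the table-generator math, with an invented helper name; not Hibernate's exact implementation):

```java
// With a TableGenerator, ids are handed out in blocks of allocationSize;
// the value stored in the generator table selects the block.
class TableGeneratorBlocks {
    static long blockStart(long tableValue, int allocationSize) {
        return tableValue * allocationSize;   // first id of the selected block
    }
}
```

blockStart(10, 25) is 250 and blockStart(20, 25) is 500, matching the ids observed in the question.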
So, to answer the big why, in my particular case:
I'm working on a test automation tool for my team's product which, for one thing, is able to set up test data through a running system (using client applications, APIs, etc). The test data configurations (let's call them testdatas) are defined such that several testdatas may be used together for a particular test case/test suite.
Now, once a testdata has been run, the tool will extract the entered data from the db into SQL insert statements, and stow them away into files. There are several reasons for this, but mainly it's about performance - if I want to run 100 test cases with a certain testdata, I'd only really care about the testdata being inserted "manually" once, and then each time I reset, I may take the much faster route of inserting the test data directly into the database.
However, as I said, several testdatas may be used together. What if testdata01 and testdata02 both affect the same tables? The extracted SQL insert statements will not contain the data for only that particular testdata, if another one has already been run beforehand.
A simple solution to this is to reserve an interval of IDs for each testdata. testdata01 has interval [10000, 20000), testdata02 has interval [20000, 30000), etc, for each table. This was easy to implement - before running each testdata, simply update all TableGenerator tables to the lower bound of the testdata's ID interval - then, after running the testdata setup, extract only the rows with ID within the interval.
This works great, and makes sure that there are never any clashes in IDs between testdatas, and that the exported SQL for each testdata only contains data for that particular testdata, regardless of what else may be in the database at the time. However, this one thing where allocationSize is not 1 messes things up - entries may still appear outside of the reserved ID interval for a given entity, even though we've updated that entity's TableGenerator.
So, in short, what I'd like to do is, after updating the TableGenerator tables and before starting to run a testdata setup, I'd like to tell Hibernate that for each entity, next time you generate an ID, disregard whatever next value you would like to generate from that TableGenerator's range, and instead check the TableGenerator table in the database for which range to use next.
So I managed to scratch this one myself. For future reference if anyone should have the same need:
public void moveToNextInterval(Class<?> entity, javax.persistence.EntityManager em)
        throws IllegalAccessException, InstantiationException {
    // find the @TableGenerator annotation on one of the entity's getters
    javax.persistence.TableGenerator tableGenerator = null;
    for (java.lang.reflect.Method method : entity.getMethods()) {
        tableGenerator = method.getAnnotation(javax.persistence.TableGenerator.class);
        if (tableGenerator != null) {
            break;
        }
    }
    if (tableGenerator != null && tableGenerator.allocationSize() > 1) {
        int allocationSize = tableGenerator.allocationSize();
        org.hibernate.impl.SessionImpl session =
                (org.hibernate.impl.SessionImpl) em.unwrap(org.hibernate.Session.class);
        IdentifierGenerator idGenerator =
                session.getFactory().getIdentifierGenerator(entity.getName());
        // burn generated ids until the end of the current allocation interval,
        // so the next id is fetched from the TableGenerator table again
        while ((Long) idGenerator.generate(session, entity.newInstance()) % allocationSize
                != allocationSize - 1);
    }
}
Not being too used to Hibernate or JPA yet, I'm sure there are lots of possible improvements. It shouldn't be too difficult to generalize this for any kind of sequence generator. Also, you might want to create just one instance of the entity and reuse it. In addition, there is the risk of overshooting the intended ID: for example, if allocationSize=25 and the last ID was 24, and we've updated the TableGenerator table and set it to 10, then calling this method will in effect make the next ID 275 instead of 250. Good enough for my purposes, but good to know.
I need, for a particular business scenario, to set a field on an entity (not the PK) to a number from a sequence (the sequence value has to be a number between a min and a max).
I defined the sequence like this :
CREATE SEQUENCE MySequence
MINVALUE 65536
MAXVALUE 4294967296
START WITH 65536
INCREMENT BY 1
CYCLE
NOCACHE
ORDER;
In Java code I retrieve the number from the sequence like this :
select mySequence.nextval from dual
My question is:
If I call this select mySequence.nextval from dual in one transaction, and at the same time the same method is called in another transaction (parallel requests), is it guaranteed that the values returned by the sequence are different?
Is it not possible to read, say, the uncommitted value from the first transaction?
Because, let's say I had not used a sequence but a plain table in which I incremented the value myself - then transaction 2 would have been able to read the same value if the transaction isolation was the default READ COMMITTED.
The answer is NO, that cannot happen.
Oracle guarantees that numbers generated by a sequence are different - even with parallel requests, in a RAC environment, or when rollbacks and commits are mixed.
Sequences have nothing to do with transactions.
See here the docs:
Use the CREATE SEQUENCE statement to create a sequence, which is a
database object from which multiple users may generate unique
integers. You can use sequences to automatically generate primary key
values.
When a sequence number is generated, the sequence is incremented,
independent of the transaction committing or rolling back. If two
users concurrently increment the same sequence, then the sequence
numbers each user acquires may have gaps, because sequence numbers are
being generated by the other user. One user can never acquire the
sequence number generated by another user. After a sequence value is
generated by one user, that user can continue to access that value
regardless of whether the sequence is incremented by another user.
Sequence numbers are generated independently of tables, so the same
sequence can be used for one or for multiple tables. It is possible
that individual sequence numbers will appear to be skipped, because
they were generated and used in a transaction that ultimately rolled
back. Additionally, a single user may not realize that other users are
drawing from the same sequence.
Oracle guarantees sequence numbers will be different. Even if your transaction is rolled back, the sequence is 'used' and not reissued to another query.
Edit: adding additional information after requirements around "no gaps" were stated in the comments by Cris.
If your requirement is a sequence of numbers without gaps, then Oracle sequences will probably not be a suitable solution, as there will be gaps when transactions roll back, when the database restarts, or in any number of other scenarios.
Sequences are primarily intended as a high-performance generation tool for unique numbers (e.g. primary keys), without regard to gaps or transaction-context constraints.
If your design/business/audit requirements need to account for every number, then you would instead need to design a solution that uses a predetermined number within the transaction context. This can be tricky and prone to performance/locking issues in a multi-threaded environment. It would be better to try to redefine your requirement so that gaps don't matter.
sequence.nextval never returns the same value to concurrent requests (until the sequence cycles). Perhaps you should check the following URL:
http://docs.oracle.com/cd/B19306_01/server.102/b14220/schema.htm#sthref883
Unfortunately you have to implement your own wheel: a transactional sequence. It is rather simple - just create a table like (sequence_name varchar2, value number, min_value number, max_value number, need_cycle char), then issue 'select value into variable from your sequence table for update wait' (or nowait - it depends on your scenario). After that, issue 'update ... set value = variable + 1 where sequence_name = the name of your sequence' and finally issue the commit statement from the client side. That's it.