Hibernate GenerationType.IDENTITY vs GenerationType.SEQUENCE

I'm looking for clarification on this question: @GeneratedValue(strategy="IDENTITY") vs. @GeneratedValue(strategy="SEQUENCE"). It is nearly a decade old; has anything changed?
I'm just getting started with JPA, and the following generation types both seem to auto-increment the generated primary keys by 1.
Identity:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
Sequence:
@Id
@SequenceGenerator(
    name = "my_sequence",
    sequenceName = "my_sequence",
    allocationSize = 1
)
@GeneratedValue(
    strategy = GenerationType.SEQUENCE,
    generator = "my_sequence"
)
private Long id;
I understand that the sequence object created by GenerationType.SEQUENCE is not tied directly to a table in the way GenerationType.IDENTITY is. The old question mentions that sequence can be more flexible. Are there any objective pros or cons to choosing one of these strategies over the other? Also anything new considering the age of the question being referenced?

As stated in the Hibernate documentation:
It is important to realize that using IDENTITY columns imposes a runtime behavior where the entity row must be physically inserted prior to the identifier value being known.
This can mess up extended persistence contexts (long conversations). Because of the runtime imposition/inconsistency, Hibernate suggests other forms of identifier value generation be used (e.g. SEQUENCE).
There is yet another important runtime impact of choosing IDENTITY generation: Hibernate will not be able to batch INSERT statements for the entities using the IDENTITY generation.
The importance of this depends on the application-specific use cases. If the application is not usually creating many new instances of a given entity type using the IDENTITY generator, then this limitation will be less important since batching would not have been very helpful anyway.

Related

UUID primary key for JPA Entity: safe approach to use unique values on multiple instances

I'm using SpringBoot, JPA and Hibernate.
I have a doubt.
For my entities I need a UUID as the primary key, and I would like to store this id in readable form (as a string) rather than as binary.
I'm using this code:
@Id
@Column(name = "id")
@Type(type = "uuid-char")
private UUID uuid = UUID.randomUUID();
My doubt is this: Is this approach safe? Can I be sure ids will always be unique?
I understand that, with this code, the UUID is generated on the application side. So what happens if I have multiple instances of my service, all using the same DB?
Is it possible that more instances will generate the same UUID?
UUID uuid = UUID.randomUUID()
My doubt is this: Is this approach safe? Can I be sure ids will always be unique?
Yes, extremely safe.
A UUID Version 4 has 122 bits of randomly generated data. That is a vast range of numbers. Assuming your UUID is being generated with a cryptographically-strong random number generator, you have no practical concerns with using a randomly-generated UUID.
For details, see the Collisions section on Wikipedia.
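To put a number on "vast range", here is a minimal sketch that applies the standard birthday-bound approximation, p ~= n^2 / (2 * 2^122), to version 4 UUIDs. The class and method names are hypothetical, and the formula is only a good estimate while p is small; it also assumes the random bits are uniformly distributed, as discussed above.

```java
import java.util.UUID;

final class UuidCollisionEstimate {

    // Random bits in a version 4 UUID: 128 bits minus 6 fixed version/variant bits.
    static final int RANDOM_BITS = 122;

    // Birthday-bound approximation: p ~= n^2 / (2 * 2^bits), valid while p is small.
    static double collisionProbability(double generatedIds) {
        return (generatedIds * generatedIds) / (2.0 * Math.pow(2.0, RANDOM_BITS));
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        // randomUUID() produces version 4 (random) UUIDs.
        System.out.println("version = " + id.version());
        // Estimated chance of at least one collision among one billion ids.
        System.out.println("p for 1e9 ids ~ " + collisionProbability(1e9));
    }
}
```

For one billion generated ids the estimate is on the order of 1e-19, and it does not depend on how many service instances produce the ids, only on the total number generated.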
If you want to worry, apply your worry to things that are much more likely to happen. Top of my mind: cosmic rays flipping bits in non-ECC memory. (See the valid rant by Linus Torvalds on the issue.)
Personally, I consider the point-in-space-and-time versions such as Version 1 to be even less of a concern for collisions. But others debate this. Either way, Version 1 or Version 4, I would sleep well.
Despite saying the above, you should still ensure that your code is written to be robust in the face of collisions. Not because of collisions from randomly-generated duplicates, but because of collision from the all-too-human possibilities such as a bug in your code that double-posts the record to database, or a DBA who mistakenly loads back-up data twice, and so on.
I'm not a Hibernate specialist, but in general this approach should be OK: the probability of a collision is very low. Still, "what if" there is a collision? You can't be 100% sure that you are guaranteed to avoid one. If your system can deal with a collision (for example by re-creating the UUID, where the performance penalty is negligible because this happens extremely rarely), then there is no problem.
Now the real question is whether there is something else that can be done here?
Well, Hibernate is able to generate the UUID in a way that also uses an IP of the machine as a parameter of generation:
@Entity
public class SampleEntity {
    @Id
    @GeneratedValue(generator = "UUID")
    @GenericGenerator(
        name = "UUID",
        strategy = "org.hibernate.id.UUIDGenerator",
        parameters = {
            @Parameter(
                name = "uuid_gen_strategy_class",
                value = "org.hibernate.id.uuid.CustomVersionOneStrategy"
            )
        }
    )
    @Column(name = "id", updatable = false, nullable = false)
    private UUID id;
    …
}
For more information read here for example
Of course, the best approach would be to let the DB deal with the ID generation. You haven't specified which database you use. For example, PostgreSQL can generate UUID keys with the help of an extension:
Read here for example.
In general, using the UUID is not always a good idea - it's hard to deal with them in day-to-day life, and in general they introduce an overhead that might be significant if there are many rows in the table. So you might consider using an auto-increment sequence or something for the primary key - DB will be able to do it and you won't need to bother.

Exception : The field can't be empty (null) [duplicate]

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
Why are we using these annotations?
I need to know whether this auto-increments my table's id values.
Are there other types besides GenerationType.IDENTITY, and what actually happens when we use this annotation?
public class Author extends Domain
{
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Basic(optional = false)
    @Column(name = "id")
    private Integer id;
    @Basic(optional = false)
    @Column(name = "name")
    private String name;
    @Column(name = "address")
    private String address;
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "authorId")
    private List<Book> bookList;
    public Author()
    {
        setServiceClassName("wawo.tutorial.service.admin.AuthorService");
    }
}
Also, is it necessary to extend the Domain abstract class? What is it for?
First of all, using annotations for configuration is just a convenient alternative to coping with endless XML configuration files.
The @Id annotation comes from javax.persistence.Id and indicates that the member field below it is the primary key of the current entity. Hibernate and the Spring framework, as well as your own code, can then do reflection work based on this annotation. For details, please check the Javadoc for Id.
The @GeneratedValue annotation configures how the value of the specified column (field) is generated. For example, when using MySQL, you may specify auto_increment in the table definition to make the column self-incrementing, and then use
@GeneratedValue(strategy = GenerationType.IDENTITY)
in the Java code to denote that you rely on this database-side strategy. You may also change the value in this annotation to fit different requirements.
1. Define Sequence in database
For instance, Oracle has traditionally relied on sequences for incrementing keys. Say we create a sequence in Oracle:
create sequence oracle_seq;
2. Refer to the database sequence
Now that the sequence exists in the database, we need to establish the relation between Java and the DB by using @SequenceGenerator:
@SequenceGenerator(name="seq",sequenceName="oracle_seq")
sequenceName is the real name of the sequence in Oracle; name is what you want to call it in Java. Specify sequenceName if it differs from name. I usually omit sequenceName to save time, but the default is chosen by the JPA provider, so check what your provider and version do.
3. Use the sequence in Java
Finally, it is time to make use of this sequence in Java. Just add @GeneratedValue:
@GeneratedValue(strategy=GenerationType.SEQUENCE, generator="seq")
The generator attribute refers to the sequence generator you want to use. Notice that it is not the real sequence name in the DB, but the name you specified in the name attribute of @SequenceGenerator.
4. Complete
So the complete version should be like this:
@Entity
public class MyTable
{
    @Id
    @SequenceGenerator(name="seq",sequenceName="oracle_seq")
    @GeneratedValue(strategy=GenerationType.SEQUENCE, generator="seq")
    private Integer pid;
}
Now start using these annotations to make your Java web development easier.
In an object-relational mapping context, every object needs to have a unique identifier. You use the @Id annotation to specify the primary key of an entity.
The @GeneratedValue annotation is used to specify how the primary key should be generated. In your example you are using an Identity strategy which
Indicates that the persistence provider must assign primary keys for
the entity using a database identity column.
There are other strategies, you can see more here.
Simply put, @Id specifies the primary key of the entity.
@GeneratedValue specifies the primary key generation strategy to use, i.e. it instructs the persistence provider to have a value for this field generated automatically. If no strategy is specified, AUTO is used by default.
The GenerationType enum defines four strategies:
GenerationType.TABLE
GenerationType.SEQUENCE
GenerationType.IDENTITY
GenerationType.AUTO
GenerationType.SEQUENCE
With this strategy, the underlying persistence provider must use a database sequence to get the next unique primary key for the entities.
GenerationType.TABLE
With this strategy, the underlying persistence provider must use a database table to generate and keep the next unique primary key for the entities.
GenerationType.IDENTITY
This GenerationType indicates that the persistence provider must assign primary keys for the entity using a database identity column. An IDENTITY column is typically used in SQL Server. This special type of column is populated internally by the table itself without using a separate sequence. If the underlying database doesn't support IDENTITY columns or some similar variant, then the persistence provider can choose an alternative appropriate strategy.
GenerationType.AUTO
This GenerationType indicates that the persistence provider should automatically pick an appropriate strategy for the particular database. This is the default GenerationType, i.e. if we just use the @GeneratedValue annotation then this value of GenerationType will be used.
Reference:- https://www.logicbig.com/tutorials/java-ee-tutorial/jpa/jpa-primary-key.html
Why are we using this annotation?
First, a reminder that annotations such as @Id provide metadata to the persistence layer (I will assume Hibernate). This metadata is most likely stored in the .class file (not in the database) and tells Hibernate how to recognize, interpret, and manage the entity. So why use the annotation? To give your persistence layer the information it needs to manage the entity.
Why use the @Id annotation?
The @Id annotation is one of the two mandatory annotations needed when creating an entity with JPA, the other being @Entity. @Id does two things for us:
1) signifies that this field will be the unique identifier for this class when mapped to a database table
2) the presence of @Id lets the persistence layer know that all other fields within this class are to be mapped to database columns
Why use @GeneratedValue?
By marking the @Id field with @GeneratedValue we enable id generation, which means that the persistence layer will generate an id value for us and handle the auto-incrementing. Our application can choose one of four generation strategies:
1) AUTO
2) TABLE
3) SEQUENCE
4) IDENTITY
If no strategy is specified, AUTO is assumed.
What is strategy = GenerationType.IDENTITY actually doing?
When we specify the generation strategy as GenerationType.IDENTITY, we are telling the persistence provider (Hibernate) to let the database handle the auto-incrementing of the id. If you were to use Postgres as the underlying database and specified the strategy as IDENTITY, Hibernate would generate DDL like this:
create table users (
id bigserial not null,
primary key (id)
)
Notice that the type of the id is bigserial. What is bigserial? As per the Postgres documentation, bigserial is a large autoincrementing integer.
Conclusion
By specifying:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Integer id;
you have told the underlying persistence layer to use the id field as a unique identifier within the database. You have also told the persistence layer to let the database handle the auto-incrementing of the id with GenerationType.IDENTITY.
In very simple words, we want to state which strategy is used to generate primary keys.
Primary keys must be different for every row, so there must be some strategy that determines how to differentiate one row from another.
GenerationType lets us define that strategy.
Here @GeneratedValue(strategy=GenerationType.IDENTITY) says that the primary key comes from an identity column, that is, a column whose value the database auto-increments on insert.

What side effects occur when reusing generator names?

I'm working in a code base of a couple dozen tables. I went to add a new class and, naturally, I'm going to look at what has been written before I got on the project to see how it's done over there. Something about wheel engineering?
Anyway, here's what I find
@Id
@SequenceGenerator(name = "identifier", sequenceName = "LIMIT_REASON_COLL_ID_SEQ", allocationSize = 1)
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "identifier")
@Column(name = "ID")
Of note are the name = "identifier" and generator = "identifier". I didn't replace anything; that's actually what it says. And it's called "identifier" in every class managed by Hibernate.
Now, the system has been stable for years, so clearly it doesn't appear to impact whatever we're doing (that we can observe). But are there any side effects to reusing the generator name in this way? Is it advised, and if it isn't why not?
The SequenceGenerator's name and the GeneratedValue's generator can be whatever you choose to name them.
It's the GeneratedValue's strategy that references one of the Hibernate identifier generators:
increment
identity
sequence
hilo
seqhilo
uuid
uuid2
guid
assigned
select
foreign
There is no association between the generator's name and the strategy. So you can call your sequence generator by the "identity" name.
I think you will face a problem related to duplicate key violations, or something like it. Some days ago I searched for the same thing and found this post: JPA 2 @SequenceGenerator @GeneratedValue producing unique constraint violation. Please go through the question and its answers.

What's the reason behind the jumping GeneratedValue(strategy=GenerationType.TABLE) when not specifying a #TableGenerator?

Why do I need to add allocationSize=1 when using the @TableGenerator to ensure that the id wouldn't jump from 1, 2,... to 32,xxx, 65,xxx,... after a JVM restart?
Is there a design reason for the need to specify the allocationSize?
This snippet would produce the jumping ids
@Id
@GeneratedValue(strategy = GenerationType.TABLE)
private Long id;
Here's the modified snippet that produces the properly sequenced ids
@Id
@GeneratedValue(strategy = GenerationType.TABLE, generator = "account_generator")
@TableGenerator(name = "account_generator", initialValue = 1, allocationSize = 1)
private Long id;
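Hibernate caches a block of ids for performance reasons. It reserves several ids from the database at once and keeps them in memory; when these run out, it reserves another block (thus advancing the stored value). Ids that are still unused in memory when the JVM stops are lost, which is why the values jump after a restart.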
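The effect of block caching on a restart can be illustrated with a toy model. This is only a sketch of the general idea, not Hibernate's actual generator: StoredCounter stands in for the value kept in the database, BlockAllocator stands in for the in-memory generator of one JVM, and both names (as well as the block size of 50) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

final class BlockAllocationDemo {

    // Stand-in for the value persisted in the database; it survives "restarts".
    static final class StoredCounter {
        private final AtomicLong next = new AtomicLong(1);

        // Reserve blockSize ids and return the first id of the reserved block.
        long reserve(int blockSize) {
            return next.getAndAdd(blockSize);
        }
    }

    // Stand-in for the in-memory generator living inside one JVM.
    static final class BlockAllocator {
        private final StoredCounter store;
        private final int blockSize;
        private long nextId;
        private long blockEnd; // exclusive upper bound of the current block

        BlockAllocator(StoredCounter store, int blockSize) {
            this.store = store;
            this.blockSize = blockSize;
        }

        synchronized long nextId() {
            if (nextId == blockEnd) { // block exhausted, or none reserved yet
                nextId = store.reserve(blockSize);
                blockEnd = nextId + blockSize;
            }
            return nextId++;
        }
    }

    public static void main(String[] args) {
        StoredCounter table = new StoredCounter();

        BlockAllocator beforeRestart = new BlockAllocator(table, 50);
        System.out.println(beforeRestart.nextId()); // 1
        System.out.println(beforeRestart.nextId()); // 2

        // A "restart" discards the in-memory block; ids 3..50 are never handed out.
        BlockAllocator afterRestart = new BlockAllocator(table, 50);
        System.out.println(afterRestart.nextId()); // 51
    }
}
```

With a block size of 1 the same model yields 1, 2, 3 across the restart, which matches the behaviour reported for allocationSize = 1, at the cost of one database round trip per id.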
I'm not claiming it is the case but this might be a bug in the underlying generator used by Hibernate. See for example this post on Hibernate's forums that describes a weird behavior, the issues mentioned in the comments of the New (3.2.3) Hibernate identifier generators or existing issues in Jira.
My suggestion would be to identify the generator used in your case and to search for an existing issue or to open a new one.

Hibernate annotation for PostgreSQL serial type

I have a PostgreSQL table in which I have a column inv_seq declared as serial.
I have a Hibernate bean class to map the table. All the other columns are read properly except this column. Here is the declaration in the Hibernate bean class:
....
....
@GeneratedValue(strategy=javax.persistence.GenerationType.AUTO)
@Column(name = "inv_seq")
public Integer getInvoiceSeq() {
return invoiceSeq;
}
public void setInvoiceSeq(Integer invoiceSeq) {
this.invoiceSeq = invoiceSeq;
}
....
....
Is the declaration correct?
I am able to see the sequential numbers generated by the column in the database, but I am not able to access them in the java class.
Please help.
Danger: Your question implies that you may be making a design mistake - you are trying to use a database sequence for a "business" value that is presented to users, in this case invoice numbers.
Don't use a sequence if you need to do anything more than test the value for equality. It has no order. It has no "distance" from another value. It's just equal, or not equal.
Rollback:
Sequences are not generally appropriate for such uses because changes to sequences aren't rolled back with transaction ROLLBACK. See the footers on functions-sequence and CREATE SEQUENCE.
Rollbacks are expected and normal. They occur due to:
deadlocks caused by conflicting update order or other locks between two transactions;
optimistic locking rollbacks in Hibernate;
transient client errors;
server maintenance by the DBA;
serialization conflicts in SERIALIZABLE or snapshot isolation transactions
... and more.
Your application will have "holes" in the invoice numbering where those rollbacks occur. Additionally, there is no ordering guarantee, so it's entirely possible that a transaction with a later sequence number will commit earlier (sometimes much earlier) than one with an earlier number.
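How a rollback leaves a hole can be shown with a small in-memory model. This is a sketch only: the AtomicLong plays the role of the database sequence, runTransaction is a hypothetical stand-in for a transaction that draws a number and then commits or rolls back, and no real database is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class SequenceGapDemo {

    // Toy sequence: drawing a value is never undone, even if the caller rolls back.
    private final AtomicLong sequence = new AtomicLong(0);

    // Invoice numbers that were actually committed.
    private final List<Long> committed = new ArrayList<>();

    // Simulates one transaction that draws a number and then commits or rolls back.
    void runTransaction(boolean commit) {
        long number = sequence.incrementAndGet(); // consumed regardless of outcome
        if (commit) {
            committed.add(number);
        }
        // On rollback the number is simply lost; the sequence does not go back.
    }

    List<Long> committedNumbers() {
        return new ArrayList<>(committed);
    }

    public static void main(String[] args) {
        SequenceGapDemo demo = new SequenceGapDemo();
        demo.runTransaction(true);  // invoice 1 committed
        demo.runTransaction(false); // number 2 drawn, then rolled back
        demo.runTransaction(true);  // invoice 3 committed
        System.out.println(demo.committedNumbers()); // prints [1, 3]
    }
}
```

The committed invoices are numbered 1 and 3; number 2 was consumed by the rolled-back transaction and is never reused.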
Chunking:
It's also normal for some applications, including Hibernate, to grab more than one value from a sequence at a time and hand them out to transactions internally. That's permissible because you are not supposed to expect sequence-generated values to have any meaningful order or be comparable in any way except for equality. For invoice numbering, you want ordering too, so you won't be at all happy if Hibernate grabs values 5900-5999 and starts handing them out from 5999 counting down or alternately up-then-down, so your invoice numbers go: n, n+1, n+49, n+2, n+48, ... n+50, n+99, n+51, n+98, [n+52 lost to rollback], n+97, .... Yes, the high-then-low allocator exists in Hibernate.
It doesn't help that unless you define individual @SequenceGenerators in your mappings, Hibernate likes to share a single sequence for every generated ID, too. Ugly.
Correct use:
A sequence is only appropriate if you only require the numbering to be unique. If you also need it to be monotonic and ordinal, you should think about using an ordinary table with a counter field via UPDATE ... RETURNING or SELECT ... FOR UPDATE ("pessimistic locking" in Hibernate) or via Hibernate optimistic locking. That way you can guarantee gapless increments without holes or out-of-order entries.
What to do instead:
Create a table just for a counter. Have a single row in it, and update it as you read it. That'll lock it, preventing other transactions from getting an ID until yours commits.
Because it forces all your transactions to operate serially, try to keep transactions that generate invoice IDs short and avoid doing more work in them than you need to.
CREATE TABLE invoice_number (
last_invoice_number integer primary key
);
-- PostgreSQL specific hack you can use to make
-- really sure only one row ever exists
CREATE UNIQUE INDEX there_can_be_only_one
ON invoice_number( (1) );
-- Start the sequence so the first returned value is 1
INSERT INTO invoice_number(last_invoice_number) VALUES (0);
-- To get a number; PostgreSQL specific but cleaner.
-- Use as a native query from Hibernate.
UPDATE invoice_number
SET last_invoice_number = last_invoice_number + 1
RETURNING last_invoice_number;
Alternately, you can:
Define an entity for invoice_number, add a @Version column, and let optimistic locking take care of conflicts;
Define an entity for invoice_number and use explicit pessimistic locking in Hibernate to do a select ... for update then an update.
All these options will serialize your transactions, either by rolling back conflicts using @Version, or by blocking them (locking) until the lock holder commits. Either way, gapless sequences will really slow that area of your application down, so only use gapless sequences when you have to.
GenerationType.TABLE: It's tempting to use GenerationType.TABLE with a @TableGenerator(initialValue=1, ...). Unfortunately, while GenerationType.TABLE lets you specify an allocation size via @TableGenerator, it doesn't provide any guarantees about ordering or rollback behaviour. See the JPA 2.0 spec, sections 11.1.46 and 11.1.17. In particular: "This specification does not define the exact behavior of these strategies." and footnote 102: "Portable applications should not use the GeneratedValue annotation on other persistent fields or properties [than @Id primary keys]". So it is unsafe to use GenerationType.TABLE for numbering that you require to be gapless, or for numbering that isn't on a primary key property, unless your JPA provider makes more guarantees than the standard.
If you're stuck with a sequence:
The poster notes that they have existing apps using the DB that use a sequence already, so they're stuck with it.
The JPA standard doesn't guarantee that you can use generated columns except on @Id. You can (a) ignore that and go ahead so long as your provider lets you, or (b) do the insert with a default value and re-read from the database. The latter is safer:
@Column(name = "inv_seq", insertable=false, updatable=false)
public Integer getInvoiceSeq() {
return invoiceSeq;
}
Because of insertable=false the provider won't try to specify a value for the column. You can now set a suitable DEFAULT in the database, like nextval('some_sequence') and it'll be honoured. You might have to re-read the entity from the database with EntityManager.refresh() after persisting it - I'm not sure if the persistence provider will do that for you and I haven't checked the spec or written a demo program.
The only downside is that it seems the column can't be made @NotNull or nullable=false, as the provider doesn't understand that the database has a default for the column. It can still be NOT NULL in the database.
If you're lucky your other apps will also use the standard approach of either omitting the sequence column from the INSERT's column list or explicitly specifying the keyword DEFAULT as the value, instead of calling nextval. It won't be hard to find that out by enabling log_statement = 'all' in postgresql.conf and searching the logs. If they do, then you can actually switch everything to gapless if you decide you need to by replacing your DEFAULT with a BEFORE INSERT ... FOR EACH ROW trigger function that sets NEW.invoice_number from the counter table.
I have found that hibernate 3.6 tries to use a single sequence for all entities when you set it to AUTO so in my application I use IDENTITY as the generation strategy.
@Id
@Column(name="Id")
@GeneratedValue(strategy=GenerationType.IDENTITY)
private Integer id;
@Craig had some very good points about invoice numbers needing to increment if you are presenting them to users, and he suggested using a table for that. If you do end up using a table to store the next id, you might be able to use a mapping similar to this one.
@Column(name="Id")
@GeneratedValue(strategy=GenerationType.TABLE,generator="user_table_generator")
@TableGenerator(
    name="user_table_generator",
    table="keys",
    schema="primarykeys",
    pkColumnName="key_name",
    pkColumnValue="xxx",
    valueColumnName="key_value",
    initialValue=1,
    allocationSize=1)
private Integer id;
Correct syntax is as follows:
@Column(name="idClass", unique=true, nullable=false, columnDefinition = "serial")
@Generated(GenerationTime.INSERT)
private Integer idClass;
Depending on your situation, this may not work. There is a bug opened against Hibernate that documents this behavior.
http://opensource.atlassian.com/projects/hibernate/browse/HHH-4159
If you are open to using a mapping file instead of annotations, I have been able to recreate the issue (NULL's in SERIAL columns that are not part of the primary key). Using the "generated" attribute of the property element causes Hibernate to re-read the row after an insert to pick up the generated column value:
<class name="Ticket" table="ticket_t">
<id name="id" column="ticket_id">
<generator class="identity"/>
</id>
<property name="customerName" column="customer_name"/>
<property name="sequence" column="sequence" generated="always"/>
</class>
I use Postgres + Hibernate in my projects and this is what I do:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY, generator = "hibernate_sequence")
@SequenceGenerator(name = "hibernate_sequence", sequenceName = "hibernate_sequence")
@Column(name = "id", unique = true, nullable = false)
protected Long id;
public Long getId() {
return id;
}
public void setId(Long id) {
this.id = id;
}
It works just fine for me.
