increment or sequence instead of table generation JPA - java

I am developing an application that supports multiple databases, and Hibernate fulfils that requirement.
Now the issue is primary key auto-generation. Some databases support auto increment and some require a sequence to increment the identity. To solve this issue, I use the following strategy:
strategy = GenerationType.TABLE (javax.persistence)
This is fulfilling my requirement.
In this post, a user commented that
it's always better to use increment or sequence instead of table generation if you need the ids to be in sequence
If I use auto increment or a sequence, I have to change the annotations whenever I move from one database to another (extra burden).
Please tell me: is it really better to use increment or sequence instead of table generation, or is it just a statement?

Auto increment drawbacks: You don't know the id until the row has actually been inserted, which can be as late as the commit (this can be a problem in JPA, since some EntityManager operations rely on ids). Not all databases support auto increment fields.
Sequence drawbacks: Not all databases have sequences.
Table drawbacks: Ids are not necessarily consecutive.
Since it is very unlikely that you will run out of ids, table generation remains a good option. You can even tweak the id allocation size in order to get more consecutive ids (the default size is 50):
@TableGenerator(name="myGenerator", allocationSize=1)
However, this will result in at least two queries to the id allocation table for each insert: one to step the value of the latest id, and one to retrieve it.
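To make the trade-off between consecutive ids and table accesses concrete, here is a minimal plain-Java simulation of block allocation. It is a sketch, not Hibernate's or any JPA provider's implementation: the class `BlockIdAllocator` and its counter are hypothetical names, and the `AtomicLong` merely stands in for the row in the id allocation table.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical simulation of table-based id allocation; not Hibernate code.
final class BlockIdAllocator {
    private final AtomicLong tableRow;  // stands in for the row in the id table
    private final int allocationSize;   // ids reserved per table access
    private long next;                  // next id to hand out from the current block
    private long blockEnd;              // first id that is NOT part of the current block
    private int tableHits;              // how often the "table" was accessed

    BlockIdAllocator(AtomicLong tableRow, int allocationSize) {
        this.tableRow = tableRow;
        this.allocationSize = allocationSize;
    }

    long nextId() {
        if (next == blockEnd) {
            // Reserve a whole block with a single access to the shared counter.
            next = tableRow.getAndAdd(allocationSize);
            blockEnd = next + allocationSize;
            tableHits++;
        }
        return next++;
    }

    int tableHits() {
        return tableHits;
    }
}
```

With the counter starting at 1 and an allocation size of 50, an application that takes three ids and is then restarted (modelled by a new allocator on the same counter) resumes at 51, so 4 to 50 are never used. With an allocation size of 1 the ids stay consecutive, at the cost of one table access per id.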

Related

Manual inserts while using hibernate @GeneratedValue

I'm using Hibernate to auto-generate IDs for a table, but I need to manually insert some rows (about 10k, just once) related to another table. I'm using Oracle DB. How can I do that? How does Hibernate generate values? Is it possible to use that mechanism?
@Id
@GeneratedValue
private Long id;
Of course that's possible, we're doing that all the time. Whether and how depends on the id generation strategy you use and how your database is set up.
We're using a (customized) table generator that generates positive ids so whenever we need to manually insert elements we use negative ids. That way those ids don't interfere with Hibernate's id generation and we are able to immediately identify manually inserted rows.
If you don't like negative ids you could use a different generation strategy, e.g.
a sequence on the id column that is used by Hibernate as well as by manual inserts
a high-low table generator (that's what we're using) with the initial low value set to some higher value, essentially reserving the lower positive values for manual inserts
an "assigned" id generator, i.e. your application defines the id (e.g. an employee's employee-id) and thus you'd know which ids can be added manually
Note that @GeneratedValue only takes effect when you go through the Hibernate API.
To use auto-increment values, you don't need Hibernate's @GeneratedValue feature.
You can have the database itself generate the value by marking the column as auto-generated.
Refer to :
https://chartio.com/resources/tutorials/how-to-define-an-auto-increment-primary-key-in-oracle/
When inserting, don't include the auto-increment column's name or a value for it in your bulk insert query.

Hibernate IDENTITY vs SEQUENCE entity identifier generators

This article says:
Unlike identity, the next number for the column value will be retrieved from memory rather than from the disk – this makes Sequence significantly faster than Identity
Does it mean that ID comes from disk in case of identity? If yes, then which disk and how?
Using sequence, I can see in the log, an extra select query to DB while inserting a new record. But I didn't find that extra select query in the log in case of identity.
Then how sequence becomes faster than identity?
Strategy used by sequence:
Before inserting a new row, ask the database for the next sequence value, then insert this row with the returned sequence value as ID.
Strategy used by identity:
Insert a row without specifying a value for the ID. After inserting the row, ask the database for the last generated ID.
The number of queries is thus the same in both cases. But by default Hibernate uses a strategy that is more efficient for the sequence generator. When it asks for the next sequence value, it keeps the next 50 values (that's the default, IIRC, and it's configurable) in memory and uses them for the next 50 inserts. Only after 50 inserts does it go back to the database to get the next 50 values. This tremendously reduces the number of SQL queries needed for automatic ID generation.
The identity strategy doesn't allow for such an optimization.
The IDENTITY generator will always require a database hit for fetching the primary key value without waiting for the flush to synchronize the current entity state transitions with the database.
So the IDENTITY generator doesn't play well with Hibernate write-behind first level cache strategy, therefore JDBC batching is disabled for the IDENTITY generator.
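The effect on batching can be illustrated with a small plain-Java model. This is a sketch under stated assumptions, not Hibernate code: `WriteBehindDemo`, its methods and the `roundTrips` counter are hypothetical, and the cost of fetching sequence values is deliberately ignored (it is assumed to be amortized by pre-allocation).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of write-behind; it only counts INSERT round trips.
final class WriteBehindDemo {
    private final List<Long> pendingInserts = new ArrayList<>();
    private final int batchSize;
    private long sequence = 1;   // stands in for a database sequence (fetch cost ignored)
    private long identity = 1;   // stands in for an identity column
    int roundTrips;

    WriteBehindDemo(int batchSize) {
        this.batchSize = batchSize;
    }

    // SEQUENCE: the id is known up front, so the INSERT can wait for flush().
    long persistWithSequence() {
        long id = sequence++;
        pendingInserts.add(id);
        return id;
    }

    // IDENTITY: the id only exists once the INSERT has run, so it runs right away.
    long persistWithIdentity() {
        roundTrips++;            // one INSERT statement per entity
        return identity++;
    }

    // Sends the queued INSERTs in groups of batchSize, one round trip per group.
    void flush() {
        roundTrips += (pendingInserts.size() + batchSize - 1) / batchSize;
        pendingInserts.clear();
    }
}
```

In this model, persisting 100 entities with a batch size of 20 costs 5 INSERT round trips on the sequence path after `flush()`, and 100 on the identity path.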
The sequence generator can benefit from database value preallocation and you can even employ a hi/lo optimization strategy.
In my opinion, the best generators are the pooled and pooled-lo sequence generators. These generators combine the batch-friendly sequence generator with a client-side value generation optimization that's compatible with other DB clients that may insert rows without knowing anything about our generation strategy.
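The pooled-lo idea can be sketched in plain Java as follows. This is only an illustration, not Hibernate's optimizer: `PooledLoDemo`, `DbSequence` and `PooledLoGenerator` are hypothetical names, and the sequence is assumed to have been created with an increment of 50, so each value it returns marks the low end of a block that no other caller will receive.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the pooled-lo idea; names and sizes are assumptions.
final class PooledLoDemo {
    static final int INCREMENT = 50;  // sequence assumed to be created with INCREMENT BY 50

    // Stands in for the database sequence shared by all clients.
    static final class DbSequence {
        private final AtomicLong value = new AtomicLong(1);

        long nextVal() {
            return value.getAndAdd(INCREMENT);
        }
    }

    // Application-side generator: one sequence call yields the LOW end of a block.
    static final class PooledLoGenerator {
        private final DbSequence sequence;
        private long lo;
        private int used = INCREMENT;  // forces a sequence call on first use

        PooledLoGenerator(DbSequence sequence) {
            this.sequence = sequence;
        }

        long nextId() {
            if (used == INCREMENT) {
                lo = sequence.nextVal();
                used = 0;
            }
            return lo + used++;
        }
    }
}
```

A client that knows nothing about the generator and simply uses `nextVal()` as its id gets the low end of a block of its own, so its value cannot collide with the ids the application derives from earlier or later blocks.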
Anyway, you should never choose the TABLE generator because it performs really badly.
Though I'm personally new to Hibernate, from what I can recall, using Identity basically means that Hibernate will check what is the next possible id value from your DB and keep a value for it.
For sequence, you basically tell Hibernate to generate the next value based on a particular sequence you provide it. So it has to actually calculate the next id by looking at the next possible id value. Hence, the extra query is fired.
maybe this will answer your question :
Unlike identity column values, which are generated when rows are
inserted, an application can obtain the next sequence number before
inserting the row by calling the NEXT VALUE FOR function. The sequence
number is allocated when NEXT VALUE FOR is called even if the number
is never inserted into a table. The NEXT VALUE FOR function can be
used as the default value for a column in a table definition. Use
sp_sequence_get_range to get a range of multiple sequence numbers at
once.
you can find the detail here
Identity doesn't need that extra select query because an identity is tied to a table, while a sequence is independent of any table. Because of this, we can get a sequence value even before creating a row (when you call session.save(T entity), the sequence value is generated even before you commit the transaction).
sequence:
you create or update entities -> each time you save an entity, Hibernate gets the next sequence value -> your program finishes all processing without an exception or rollback -> you commit the transaction -> Hibernate inserts all the complete entities
identity: when the transaction commits, an incomplete entity is inserted (its id must be read back from the identity column). So the INSERT with a sequence is definitely slower, but the advantage is that if you cancel the insert the counter doesn't increase.

Hibernate, what is the most efficient id generation strategy?

I need to insert many entities into the database via Hibernate. So, I want to find the most effective algorithm for Id generation.
According to the Hibernate documentation, there are four widely used generation strategies:
IDENTITY
SEQUENCE
TABLE
AUTO
I have to use a MySQL database, so I cannot apply the SEQUENCE generation strategy. What about the other strategies? Which is the most efficient from a performance point of view?
The best id generators in Hibernate are enhanced-table and enhanced-sequence, coupled with an appropriate optimizer, such as hilo. I have experience with enhanced-table + hilo, inserting over 10,000 records per second.
BTW the statement that "hilo needs an additional query per generated entity" is patently false: the whole point of the optimizer is to prevent this.
As you can't use SEQUENCE, and AUTO just automatically selects a supported generator algorithm out of the existing ones, you are left with IDENTITY and TABLE.
TABLE: uses a hi/lo algorithm to efficiently generate identifiers of type long, short or int, given a table and column as a source of hi values. The hi/lo algorithm generates identifiers that are unique only for a particular database. -> Means an extra query per generated entity. (This is not true if you use optimizers. Unfortunately, using no optimizer generally is the default, if no optimizer was specified.)
IDENTITY: supports identity columns in DB2, MySQL, MS SQL Server, Sybase and HypersonicSQL. -> Performance-wise, this is the way to go, the same way you would do without Hibernate normally. Database generated, almost no overhead.
There exist more Hibernate specific generators, but they won't beat performance-wise the database generated ID. (See 5.1.2.2.1. Various additional generators in your linked document.)
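For readers unfamiliar with hi/lo, the arithmetic can be sketched in a few lines of plain Java. This is a minimal illustration, not Hibernate's optimizer: `HiLoGenerator` and its parameters are hypothetical, and the `LongSupplier` stands in for whatever database access (table row or sequence) supplies the next hi value.

```java
import java.util.function.LongSupplier;

// Hypothetical hi/lo sketch; Hibernate's real optimizers differ in detail.
final class HiLoGenerator {
    private final LongSupplier nextHi;  // one database access per call
    private final int maxLo;            // ids per hi value = maxLo + 1
    private long hi;
    private int lo;

    HiLoGenerator(LongSupplier nextHi, int maxLo) {
        this.nextHi = nextHi;
        this.maxLo = maxLo;
        this.lo = maxLo + 1;            // forces a hi fetch on first use
    }

    long nextId() {
        if (lo > maxLo) {
            hi = nextHi.getAsLong();    // the only step that touches the database
            lo = 0;
        }
        return hi * (maxLo + 1) + lo++;
    }
}
```

With `maxLo` set to 99 and a hi source that counts up from 0, generating 250 ids yields 0 to 249 while the hi source is consulted only three times, which is why an optimizer removes the extra query per entity.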

Hibernate ID Generator Confusion

I am using Hibernate 3.0 in my application with Postgres database. It is a monitoring application and gathers data every minute. So we have thousands of rows in some tables every month.
Currently i am using sequence for generating Id in hibernate. Is there any better option according to this scenario?
Any suggestion will be appreciated.
IMHO a sequence is the best approach because it gives you more flexibility, although you may also use an identity (auto-increment) column; I think in Postgres it is called serial. There is also a way to store ids in a separate table. For these three approaches (table, sequence, identity) you may use, respectively:
@GeneratedValue(strategy=GenerationType.TABLE)
@GeneratedValue(strategy=GenerationType.SEQUENCE)
@GeneratedValue(strategy=GenerationType.IDENTITY)
As for your previous question, whether it is good to use a single sequence for all tables: I wouldn't recommend this approach, because the db must ensure that all sequence numbers are unique, which is why each generated sequence value needs to be synchronized by the db server. If you have a single sequence per db, it may cause performance issues when multiple requests from multiple tables ask for the next id value. I would rather recommend having a single sequence per table.
While I am not sure if there is a better alternative than using a sequence, I am pretty sure that you would want to look at using StatelessSession if this is just for gathering data. You can get rid of all the overhead of, e.g., the 1st level cache, transactional write-behind, etc.

Alright to truncate database tables when also using Hibernate?

Is it OK to truncate tables while at the same time using Hibernate to insert data?
We parse a big XML file with many relationships into Hibernate POJO's and persist to the DB.
We are now planning on purging existing data at certain points in time by truncating the tables. Is this OK?
It seems to work fine. We don't use Hibernate's second level cache. One thing I did notice, which is fine, is that when inserting we generate primary keys using Hibernate's @GeneratedValue where Hibernate just uses a key value one greater than the highest value in the table - and even though we are truncating the tables, Hibernate remembers the prior value and uses prior value + 1 as opposed to starting over at 1. This is fine, just unexpected.
Note that the reason we do truncate as opposed to calling delete() on the Hibernate POJO's is for speed. We have gazillions of rows of data, and truncate is just so much faster.
We are now planning on purging existing data at certain points in time by truncating the tables. Is this OK?
If you're not using the second level cache and if you didn't load Entities from the table you're going to truncate in the Session, the following should work (assuming it doesn't break integrity constraints):
Session s = sf.openSession();
PreparedStatement ps = s.connection().prepareStatement("TRUNCATE TABLE XXX");
ps.executeUpdate();
And you should be able to persist entities after that, either in the same transaction or another one.
Of course, such a TRUNCATE won't generate any Hibernate event or trigger any callback, if this matters.
(...) when inserting we generate primary keys using Hibernate's @GeneratedValue (...)
If you are using the default strategy for @GeneratedValue (i.e. AUTO), then it should default to a sequence with Oracle, and a sequence won't be reset if you truncate a table or delete records.
We truncate tables like jdbcTemplate.execute("TRUNCATE TABLE abc")
This should be equivalent (you'll end up using the same underlying JDBC connection as Hibernate).
What sequence would Hibernate use for the inserts?
AFAIK, Hibernate generates a default "hibernate_sequence" sequence for you if you don't declare your own.
I thought it was just doing a max(field) + 1 on the table?
I don't think so and the fact that Hibernate doesn't start over from 1 after the TRUNCATE seems to confirm that it doesn't. I suggest to activate SQL logging to see the exact statements performed against your database on INSERT.
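The reasoning can be checked with a tiny plain-Java model. It is a sketch only: `TruncateDemo` and its methods are hypothetical, the list stands in for the table and the `long` field for a sequence that lives outside it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a table whose ids come from a separate sequence.
final class TruncateDemo {
    private final List<Long> rows = new ArrayList<>();  // the table, storing ids only
    private long sequence = 1;                          // lives outside the table

    long insertWithSequence() {
        long id = sequence++;
        rows.add(id);
        return id;
    }

    // What a max(id) + 1 scheme would compute for the next id.
    long maxPlusOne() {
        long max = 0;
        for (long id : rows) {
            max = Math.max(max, id);
        }
        return max + 1;
    }

    // Removes all rows but leaves the sequence untouched.
    void truncate() {
        rows.clear();
    }
}
```

After three inserts and a truncate, a max(id) + 1 scheme would start over at 1, whereas the sequence hands out 4. Ids that keep counting after a TRUNCATE therefore point to a sequence (or another generator kept outside the table) rather than to max(field) + 1.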
The generator we specify for @GeneratedValue is just a "dummy" generator (doesn't correspond to any sequence that we've created).
I'm not 100% sure, but if you didn't declare any @SequenceGenerator (or @TableGenerator), I don't think that specifying a generator changes anything.
It depends on your application. If deleting rows in the database is OK, then truncate is OK, too.
As long as you don't have any Pre- or PostRemove listeners on your entities, there should be no problems.
On the other hand: is it possible that there are still entities loaded in an EntityManager at truncate time, or is this a write-only table (like a logging table)? In the latter case you won't have any problem at all.
