I am trying to model a hierarchy of objects (actually, domain groups) in a database. I decided to use a closure table, so that I can gain high flexibility in querying the hierarchy. Basically, my schema looks something like this:
CREATE TABLE group (
    id INT, -- primary key
    ...     -- other fields here
);

CREATE TABLE groupHierarchy (
    idAncestor INT,
    idGroup INT,
    hierarchyLevel INT
);
So, when a group with an id of 1 contains a group with an id of 2, which in turn contains a group with an id of 3, I will need the following rows in the groupHierarchy table:
idAncestor  idGroup  hierarchyLevel
1           1        0
2           2        0
3           3        0
1           2        1
2           3        1
1           3        2
I would also be OK with not having the rows with a hierarchyLevel of 0 (self-references).
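To show the kind of query flexibility this buys, here is a sketch of fetching all descendants of a group with a single join via a native query; the EntityManager usage and the method wrapper are illustrative, not from the question:

import javax.persistence.EntityManager;
import java.util.List;

// All descendants of one group in a single indexed join (schema as above).
// Note: "group" is a reserved word in most databases, hence the quoting.
public List<?> findDescendants(EntityManager em, int ancestorId) {
    return em.createNativeQuery(
            "SELECT g.* FROM \"group\" g " +
            "JOIN groupHierarchy h ON h.idGroup = g.id " +
            "WHERE h.idAncestor = ?1 AND h.hierarchyLevel > 0")
        .setParameter(1, ancestorId)
        .getResultList();
}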
Now I would like to have a JPA entity that maps to the group table. My question is: what would be a good way to manage the groupHierarchy table?
What I already considered is:
1) Having the group hierarchy mapped as an element collection, like:

@ElementCollection
@CollectionTable(name = "groupHierarchy")
@MapKeyJoinColumn(name = "idAncestor")
@Column(name = "hierarchyLevel")
Map<Group, Integer> ancestors;
This would require handling the hierarchy entirely in the application, and I am afraid that this may become very complex.
2) Making the application unaware of the hierarchyLevel column and handling it in the database with a trigger: when a record is added, check whether the parent already has ancestors and, if so, add the other required rows (this is also where the hierarchyLevel of 0 would come in handy). It seems to me that the trigger would be simpler, but I'm not sure it would be good for overall readability.
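For a sense of what the application-side bookkeeping from option 1 involves, here is a sketch of attaching a child group under a parent with native queries (method and parameter names are illustrative): it copies the parent's ancestor rows with the level bumped by one, which is exactly where the level-0 self-reference rows pay off.

import javax.persistence.EntityManager;

// Sketch: register childId under parentId in the closure table.
// Relies on every group having its hierarchyLevel = 0 self-reference row;
// without those rows you would also need an explicit direct-parent INSERT.
public void attach(EntityManager em, int parentId, int childId) {
    // the child's self-reference
    em.createNativeQuery(
            "INSERT INTO groupHierarchy (idAncestor, idGroup, hierarchyLevel) " +
            "VALUES (?1, ?2, 0)")
        .setParameter(1, childId)
        .setParameter(2, childId)
        .executeUpdate();
    // one row per ancestor of the parent, including the parent itself
    em.createNativeQuery(
            "INSERT INTO groupHierarchy (idAncestor, idGroup, hierarchyLevel) " +
            "SELECT h.idAncestor, ?1, h.hierarchyLevel + 1 " +
            "FROM groupHierarchy h WHERE h.idGroup = ?2")
        .setParameter(1, childId)
        .setParameter(2, parentId)
        .executeUpdate();
}

A trigger (option 2) would run essentially the same INSERT ... SELECT on the database side.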
Can anyone suggest other options? Or maybe point to any pros or cons of the solutions I have mentioned?
May I suggest using JpaTreeDao? I think it's complete and very well documented. I'm going to try to port the closure table code to a Groovy implementation...
In my Spring Boot application, I noticed a strange issue when inserting new rows. My ids are generated by a sequence, but after I restart the application it starts from 21.
Example:
First launch: I insert 3 rows; the ids generated by the sequence are 1, 2, 3.
Second launch, after a restart: I insert 3 rows; the ids are generated from 21, so they are 21, 22, ...
Every restart it jumps by 20 - the increase is always 20.
Refer to my database table (1, 2, then 21 after restart).
My JPA entity:

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
I tried some Stack Overflow solutions, but they don't work. For example, this did not help:
spring.jpa.properties.hibernate.id.new_generator_mappings=false
I want ids inserted in sequence, like 1, 2, 3, 4 - not 1, 2, 21, 22. How can I resolve this problem?
Although I think the question comments already provide all the information necessary to understand the problem, please let me try to explain some things and fix some inaccuracies.
According to your source code, you are using the IDENTITY id generation strategy:

@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
@Column(unique = true, nullable = false)
private Long id;
You are using an Oracle database, and this is very relevant information for the question.
Support for IDENTITY columns was introduced in Oracle 12c (probably Release 1) and, on the Hibernate side, around version 5.1, although here on SO it is indicated that you need at least 5.3.
Either way, IDENTITY columns in Oracle are implemented with database SEQUENCEs: for every IDENTITY column, a corresponding sequence is created. As you can read in the Oracle documentation, this explains why, among other things, all the options for creating sequences can be applied to the IDENTITY column definition, like min and max ranges, cache size, etc.
By default a sequence in Oracle has a cache size of 20, as indicated in a tiny note in the aforementioned Oracle documentation:
Note: When you create an identity column, Oracle recommends that you
specify the CACHE clause with a value higher than the default of 20 to
enhance performance.
And this default cache size is what explains the non-consecutive numbers you are getting in your id values.
This behavior is not exclusive to Hibernate: just issue a simple JDBC insert statement, or SQL commands with any suitable tool, and you will experience the same thing.
To solve the issue, create your table specifying NOCACHE for your IDENTITY column:
CREATE TABLE your_table (
    id NUMBER GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE,
    --...
);
Note that you need NOCACHE, not CACHE 0 as indicated in the question comments (and previously in other answers), which is an error because the value for the CACHE option must be at least 2.
You could probably modify your column without recreating the whole table as well:
ALTER TABLE your_table MODIFY (ID GENERATED BY DEFAULT ON NULL AS IDENTITY NOCACHE);
Having said all that, please be aware that the cache mechanism is in fact an optimization, not a drawback: in the end (and this is just my opinion) these ids are only non-natural, assigned IDs, and in the general use case the cache benefits outweigh the drawbacks.
Please consider reading this great article about IDENTITY columns in Oracle.
The answer about the hilo optimizer could be right, but it requires explicitly configuring that optimizer on your id field declaration, which does not seem to be the case here.
It is related to the Hi/Lo algorithm that Hibernate uses for incrementing the sequence value. Read more in this example: https://www.baeldung.com/hi-lo-algorithm-hibernate.
This is an optimization used by Hibernate: it takes a batch of values from the DB sequence into a pool (in the Java runtime) and uses them while executing INSERT statements on the table. If this optimization is turned off by setting allocationSize=1, the desired behavior (no gaps in ids) is possible (with a certain precision, not always), but at the price of two round trips to the DB for each INSERT.
The examples below give an idea of what is going on at this level of abstraction (the internal implementation is more complex, but that doesn't matter here).
Scenario: a user makes 21 inserts over some period of time.
Example 1 (current behavior, allocationSize=20)
#1 insert: // first cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-1
- execute INSERT // call DB
#2-#20 insert:
- need next MY_SEQ value,
- take it from MY_SEQ_PREFETCH_POOL >> remaining=20-i
- execute INSERT // call DB
#21 insert: // new cycle
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 20 values from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=19
- execute INSERT // call DB
Example 2 (alternative behavior, allocationSize=1)
#1-21 insert:
- need next MY_SEQ value, but MY_SEQ_PREFETCH_POOL is empty
- select 1 value from MY_SEQ into MY_SEQ_PREFETCH_POOL // call DB
- take it from MY_SEQ_PREFETCH_POOL >> remaining=0
- execute INSERT // call DB
Example 1: 23 calls to the DB in total (2 sequence fetches + 21 INSERTs)
Example 2: 42 calls to the DB in total (21 sequence fetches + 21 INSERTs)
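For reference, the pool size in these traces corresponds to the allocationSize of the JPA generator. A minimal mapping sketch that turns the pooling off, assuming a database sequence named MY_SEQ as in the traces (entity and generator names are illustrative):

import javax.persistence.*;

@Entity
public class MyEntity {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "my_seq_gen")
    // allocationSize = 1: no prefetch pool, one extra round trip per INSERT
    @SequenceGenerator(name = "my_seq_gen", sequenceName = "MY_SEQ", allocationSize = 1)
    private Long id;
}

As the traces show, this trades (mostly) gap-free ids for roughly twice the number of DB calls; rollbacks can still consume values.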
Manually declaring the sequence in the database will not help in this case because, for instance, in this statement:
CREATE SEQUENCE ABC START WITH 1 INCREMENT BY 1 CYCLE NOCACHE;
we control only the "cache" used in the DB's internal runtime, which is not visible to Hibernate. It affects sequence gaps in situations when the DB is stopped and started again, and that is not the case here.
When Hibernate consumes values from the sequence, the state of the sequence changes on the DB side. Think of it as hotel room booking: a company (Hibernate) booked 20 rooms for a conference in a hotel (the DB), but only 2 participants arrived. The other 18 rooms stay empty and cannot be used by other guests. In this case the "booking period" is forever.
More details on how to configure Hibernate to work with sequences are here:
https://ntsim.uk/posts/how-to-use-hibernate-identifier-sequence-generators-properly
Here is a short answer for an older version of Hibernate; it still has relevant ideas:
https://stackoverflow.com/a/5346701/2774914
I am looking for a way to perform an SSIS-style lookup in Pentaho Data Integration.
I'll try to explain with an example:
I have two tables, A and B.
Here is the data in table A:
1
2
3
4
5
Here is the data in table B:
3
4
5
6
7
After my process:
All rows in A and not in B ==> will be inserted into B
All rows in B and not in A ==> will be deleted from B
So here is my final table B:
3
4
5
1
2
Can someone help me, please?
There is indeed a step that does this, but it doesn't do it alone. It's the Merge rows (diff) step, and it has some requirements. In your case, A is the "compare" table and B is the "reference" table.
First of all, both inputs (rows from A and B in your case, Dev and Prod in mine) need to be sorted by a key value. In the step you specify the key fields to match on, and then the value fields to compare. The step adds a field to the output (by default called 'flagfield'). After comparing each row, this field is given one of four values: "new", "changed", "deleted", or "identical". Note that in my example below I have explicit sort steps; that's because my database's sorting scheme is not compatible with PDI's, and for this step to work your data must be in PDI's sort order. You may not need these.
You can follow this with a Synchronize after merge step to apply the identified changes. In this step you specify the flagfield and the values that correspond to insert, update, and delete. FYI these are specified on the "Advanced" tab, and they must be filled out for the step to work.
For a very small table like your example, I would favor just a truncate and full load with a Table output step, but if your tables are large and the number of changes relatively small (<= ~25%) and replication is not available, this step is usually the way to go.
In Pentaho no single step does this directly, but there are several ways to achieve it:
=> Writing SQL to achieve your solution; plain SQL will also execute faster (see the sketch below).
=> Using a Filter Rows step.
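To make the SQL route concrete, here is a hedged sketch of the two statements run over JDBC; the dataSource and the single key column id are assumptions, not from the question:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;

// Sketch: make B contain exactly the rows of A, in two set-based statements.
public void syncBWithA(DataSource dataSource) throws SQLException {
    try (Connection con = dataSource.getConnection();
         Statement st = con.createStatement()) {
        // rows in A and not in B -> insert into B
        st.executeUpdate(
            "INSERT INTO B (id) " +
            "SELECT a.id FROM A a WHERE NOT EXISTS (SELECT 1 FROM B b WHERE b.id = a.id)");
        // rows in B and not in A -> delete from B
        st.executeUpdate(
            "DELETE FROM B WHERE NOT EXISTS (SELECT 1 FROM A a WHERE a.id = B.id)");
    }
}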
Thank you.
I have a table with four columns, say:
P_Key (int), Ref_Key (int), Key (String), Value (Integer).
I want to avoid multiple create statements by persisting a map, along with a Ref_Key, into this table. The map contains keys and their corresponding values; persisting it should create one row per map entry for that Ref_Key. I am using Hibernate.
Suppose I want to persist the following map:
"Me" -> 0
"You" -> 10
"They" -> 12
and the Ref_Key is 123.
Then it should create 3 rows in the table:
P_Key  Ref_Key  Key     Value
1      123      "Me"    0
2      123      "You"   10
3      123      "They"  12
(Assume that P_Key starts at 1 and is auto-incremented.)
What is the approach that I should follow?
If you mean "insert statements" rather than "create statements": I am not sure you can avoid them; a database has to receive 3 INSERTs to add 3 rows. But Hibernate should group them and send only one network request (statement batching).
The approach I suggest:
Create an @Entity with the table format.
Create one instance of your entity for each element of your map.
Simply persist all the instances.
Close your transaction (I mean flush).
See http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/batch.html for more information about batching. NOTE: unfortunately, Hibernate disables batching for auto-incremented (IDENTITY) primary keys!
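A hedged sketch of that approach; the entity class, table name, and id generation are illustrative assumptions, not from the question:

import javax.persistence.*;

@Entity
@Table(name = "my_table") // illustrative table name
public class MapEntryRow {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE) // IDENTITY would disable batching
    @Column(name = "P_Key")
    private Long pKey;

    @Column(name = "Ref_Key")
    private Integer refKey;

    @Column(name = "Key")   // "Key" and "Value" may need quoting in your DB
    private String key;

    @Column(name = "Value")
    private Integer value;

    protected MapEntryRow() { } // JPA requires a no-arg constructor

    public MapEntryRow(Integer refKey, String key, Integer value) {
        this.refKey = refKey;
        this.key = key;
        this.value = value;
    }
}

And the persist loop, one instance per map entry (the map and entityManager variables are assumed to be in scope; with hibernate.jdbc.batch_size set, the INSERTs go out as a batch):

import java.util.Map;

for (Map.Entry<String, Integer> e : map.entrySet()) {
    entityManager.persist(new MapEntryRow(123, e.getKey(), e.getValue()));
}
entityManager.flush();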
Please consider the following table (created using a corresponding entity)
request
-------
id  requestor  type  version  items
1   a          t1    1        5
2   a          t1    2        3
3   b          t1    1        2
4   a          t2    1        4
5   a          t1    3        9
The above is what I want to achieve. The version field is a calculated field; the others are user provided.
Basically, the request's version needs to be calculated based on the combination of requestor and type: the first occurrence of a given combination gets version 1, the next version 2, and so on.
I tried various things using @Version on a separate entity with just the three columns, joining the two entities with ManyToOne, etc., but I was not able to get the desired outcome. I don't want to confuse you with all the things I tried.
Since the objective is simple, I suppose there should be an easier way?
Can you please help? Any help is greatly appreciated!
Thanks in advance.
The @Version annotation is used to specify the version field or property of an entity class that serves as its optimistic lock value. I don't think it has anything to do with what you're looking for. Here are some ideas, though:
Use a trigger on insert.
Use a SQL View.
Use a provider-specific extension for derived attributes, like Hibernate's @Formula (an annotation that lets you map a property to an SQL expression rather than a table column).
To illustrate the last option, you could declare something like this:
@Formula("(select count(r.id)+1 from Request r where r.requestor = requestor and r.type = type and r.id < id)")
public long getVersion() { return version; }
First, a little context for what I want to do:
Consider the case where people from a firm get, once a year, an all-expenses-paid trip somewhere. There may be 1000 people who qualify for the trip, but only 16 places are available.
Each of these 16 spots has an associated index, which must run from 1 to 16. The spots on the reservation list have indexes starting from 17.
The first 16 people who apply get a definite spot on the trip; the rest end up on the reservation list. If one of the first 16 cancels, the first person on the reservation list gets their place, and all the indexes are renumbered to compensate for the person who canceled.
All of this is managed in a Java web app with an Oracle DB.
Now, my problem:
I have to manage the index correctly (all sequential, no duplicate indexes), with possibly hundreds of people simultaneously applying for the trip.
When inserting a record in the table for the trip, the index is currently obtained with
SELECT MAX(INDEX_NR) + 1 AS NEXT_INDEX_NR FROM TABLE
and this value is used as the new index (this happens on the Java side, followed by a second query to insert the record). It is obvious why we get multiple spots or reservations with the same index: two concurrent transactions can read the same MAX and compute the same value. So we end up with, say, 19 people on the trip because 4 of them have index 10, for example.
How can I manage this? I have been thinking of 3 ways so far:
Use a Serializable isolation level for the DB transactions (I don't like this one);
Insert the record with no INDEX_NR and then have a trigger manage things... in some way (I have never worked with triggers before);
Each record also has an UPDATED column. Could I use this in some way? (Note that I can't lose INDEX_NR, since other parts of the app make use of it.)
Is there a best way to do this?
Why make it complicated?
Just insert all reservations as they are entered, and store a timestamp of when each person reserved a spot.
Then in your query, simply use the timestamp to sort them.
There is of course a chance that two people reserved a spot at the very same millisecond; in that case, use a random method to break the tie.
Why do you need to explicitly store the index? Instead you could store each person's order (which never changes) along with an active flag. In your example if person #16 pulls out you simply mark them as inactive.
To compute whether a person qualifies for the trip, you simply count the number of active people with order less than that person's:
select count(*)
from CompetitionEntry
where PersonOrder < 16
and Active = 1
This approach removes the need for bulk updates to the database (you only ever update one row) and hence mostly mitigates your problem of transactional integrity.
Another way would be to explicitly lock a record in another table during the select.
-- Initial Setup
CREATE TABLE NUMBER_SOURCE (ID NUMBER(4));
INSERT INTO NUMBER_SOURCE (ID) VALUES (0);
-- Your regular code
SELECT ID AS NEXT_INDEX_NR FROM NUMBER_SOURCE FOR UPDATE; -- lock!
UPDATE NUMBER_SOURCE SET ID = ID + 1;
INSERT INTO TABLE ....
COMMIT; -- releases lock!
No other transaction will be able to perform the query on the table NUMBER_SOURCE until the commit (or rollback).
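Wired into Java, the pattern might look like this sketch; the dataSource, the APPLICANTS table, and the NAME column are illustrative assumptions (after the UPDATE, ID always holds the last index handed out):

import java.sql.*;

// Sketch: serialize index allocation through the single NUMBER_SOURCE row.
public void addApplicant(javax.sql.DataSource dataSource, String name) throws SQLException {
    try (Connection con = dataSource.getConnection()) {
        con.setAutoCommit(false);
        try (Statement st = con.createStatement()) {
            ResultSet rs = st.executeQuery(
                "SELECT ID FROM NUMBER_SOURCE FOR UPDATE"); // blocks other allocators
            rs.next();
            int nextIndex = rs.getInt(1) + 1;
            st.executeUpdate("UPDATE NUMBER_SOURCE SET ID = ID + 1");
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO APPLICANTS (NAME, INDEX_NR) VALUES (?, ?)")) {
                ps.setString(1, name);
                ps.setInt(2, nextIndex);
                ps.executeUpdate();
            }
            con.commit(); // releases the lock
        } catch (SQLException e) {
            con.rollback();
            throw e;
        }
    }
}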
When adding people to the table, give them an ID that ascends in the order in which they were added; this can be a timestamp.
Then select all the records from the table that qualify, order by ID, and update their INDEX_NR:
Select * from table where INDEX_NR <= 16 order by INDEX_NR
Step #2 seems complicated but it's actually quite simple:
update (
select *
from TABLE
where ...
order by ID
)
set INDEX_NR = INDEXSEQ.NEXTVAL
Don't forget to reset the sequence to 1.
Calculate your index at runtime:
CREATE OR REPLACE VIEW v_person
AS
SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS index_rn
FROM t_person;
CREATE OR REPLACE TRIGGER trg_person_ii
INSTEAD OF INSERT ON v_person
BEGIN
INSERT
INTO t_person (id, name)
VALUES (:new.id, :new.name);
END;
CREATE OR REPLACE TRIGGER trg_person_iu
INSTEAD OF UPDATE ON v_person
BEGIN
UPDATE t_person
SET id = :new.id,
name = :new.name
WHERE id = :old.id;
END;
CREATE OR REPLACE TRIGGER trg_person_id
INSTEAD OF DELETE ON v_person
BEGIN
DELETE
FROM t_person
WHERE id = :old.id;
END;
INSERT
INTO v_person
VALUES (1, 'test', 1)
SELECT *
FROM v_person
--
id name index_rn
1 test 1
INSERT
INTO v_person
VALUES (2, 'test 2', 1)
SELECT *
FROM v_person
--
id name index_rn
1 test 1
2 test 2 2
DELETE
FROM v_person
WHERE id = 1
SELECT *
FROM v_person
--
id name index_rn
2 test 2 1
"I have to manage the index in a correct way (all sequential, no duplicate indexes), with possible hundreds of people that simultaneously apply for the trip.
When inserting a record in the table for the trip, the way of getting the index is by
SELECT MAX(INDEX_NR) + 1 AS NEXT_INDEX_NR FROM TABLE
and using this as the new index (this is done Java side and then a new query to insert the record). It is obvious why we have multiple spots or reservations with the same index."
Yeah. Oracle's MVCC ("snapshot isolation") used incorrectly by someone who shouldn't have been in IT to begin with.
Really, Peter is right. Your index number is, or rather should be, a sort of "ranking number" over the ordered timestamps he mentions (this carries the requirement that the DBMS can guarantee that any timestamp value appears only once in the entire database).
You say you are concerned with "regression bugs". I say: why be concerned with regression bugs in an application that is demonstrably beyond curing? Because your bosses paid a lot of money for the crap they've been given, and you don't want to be the pianist who gets shot for bringing the message?
The solution depends on what you have under your control. I assume that you can change both the database and the Java code, but would refrain from modifying the database schema, since otherwise you would have to adapt too much Java code.
A cheap solution might be to add a uniqueness constraint on the pair (trip_id, index_nr), or just on index_nr if there is only one trip. Additionally, add a check constraint check(index_nr > 0), unless index_nr is already unsigned. Everything else is then done in Java: when inserting a new applicant as you described, you have to add code that catches the exception thrown when someone else was inserted concurrently (see the sketch below). If a record is updated or deleted, you either have to live with holes between sequence numbers (by selecting the 16 candidates with the lowest index_nr, as shown by Quassnoi in his view) or fill them up by hand (similarly to what Aaron suggested) after every update/delete.
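A sketch of that exception-handling code, assuming JPA and an APPLICANTS table carrying the unique constraint; the entity, method, and retry count are illustrative:

import javax.persistence.EntityManager;
import javax.persistence.PersistenceException;

// Sketch: compute MAX + 1 and retry when the unique constraint on
// (trip_id, index_nr) reports that a concurrent insert won the race.
public void addApplicant(EntityManager em, long tripId, Applicant applicant) {
    for (int attempt = 0; attempt < 5; attempt++) {
        try {
            Number max = (Number) em.createNativeQuery(
                    "SELECT COALESCE(MAX(INDEX_NR), 0) FROM APPLICANTS WHERE TRIP_ID = ?1")
                .setParameter(1, tripId)
                .getSingleResult();
            applicant.setIndexNr(max.intValue() + 1);
            em.persist(applicant);
            em.flush(); // force the INSERT so a violation surfaces here
            return;
        } catch (PersistenceException e) {
            // another transaction took the same index_nr; try again
            // (in a real application each attempt needs a fresh transaction)
        }
    }
    throw new IllegalStateException("could not allocate index_nr after 5 attempts");
}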
If index_nr is mostly read-only for the application, a better solution might be to combine the answers of Peter and Quassnoi: store either a timestamp (inserted automatically by the database, by defining the current time as the default) or an auto-incremented integer (likewise filled in by the database as a default) in the table, and use a view (like the one Quassnoi defined) to access the table and the automatically calculated index_nr from Java. But also define both constraints, as in the cheap solution.