How to save entities with manually assigned identifiers using Spring Data JPA? - java

I'm updating existing code that copies raw data from one table into multiple objects within the same database.
Previously, every kind of object had a generated PK using a sequence for each table.
Something like this:
@Id
@Column(name = "id")
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Integer id;
In order to reuse existing IDs from the import table, we removed @GeneratedValue for some entities, like this:
@Id
@Column(name = "id")
private Integer id;
For this entity, I did not change my JpaRepository, which looks like this:
public interface EntityRepository extends JpaRepository<Entity, Integer> {
    <S extends Entity> S save(S entity);
}
Now I'm struggling to understand the following behaviour within a Spring transaction (@Transactional) with the default propagation and isolation level:
With @GeneratedValue on the entity, when I call entityRepository.save(entity) I can see, with Hibernate show_sql enabled, that an insert statement is fired (although it seems to stay in the cache, since the database does not change).
Without @GeneratedValue on the entity, only a select statement is fired (no insert attempt).
This is a big issue when my Entity (without generated value) is mapped to MyOtherEntity (with generated value) in a one-to-many relationship.
I therefore get the following error:
ERROR: insert or update on table "t_other_entity" violates foreign key constraint "other_entity_entity"
Detail: Key (entity_id)=(110) is not present in table "t_entity"
That seems legitimate, since the insert has not been sent for Entity, but why? Again, if I change the ID of the Entity and use @GeneratedValue, I don't get any error.
I'm using Spring Boot 1.5.12, Java 8 and PostgreSQL 9

You're basically switching from automatically assigned identifiers to manually defined ones which has a couple of consequences both on the JPA and Spring Data level.
Database operation timing
On the plain JPA level, the persistence provider doesn't necessarily need to execute the insert immediately, because it doesn't have to obtain an identifier value from the database. That's why it usually delays the statement until it needs to flush, which happens on an explicit call to EntityManager.flush(), on a query execution (the data in the database must be up to date to deliver correct results), or on transaction commit.
Spring Data JPA repositories automatically use default transactions on the call to save(…). However, if you call repositories within a method that is itself annotated with @Transactional, the database interaction might not occur until that method is left.
EntityManager.persist(…) VS. ….merge(…)
JPA requires the EntityManager client code to differentiate between persisting a completely new entity and applying changes to an existing one. Spring Data repositories want to free the client code from having to deal with this distinction, as business code shouldn't be overloaded with that implementation detail. That means Spring Data has to differentiate new entities from existing ones itself. The various strategies are described in the reference documentation.
In the case of manually assigned identifiers, the default of inspecting the identifier property for null values will not work, as the property will never be null by definition. A standard pattern is to make the entities implement Persistable, keep a transient is-new flag around, and use entity callback annotations to flip the flag.
@MappedSuperclass
public abstract class AbstractEntity<ID extends SalespointIdentifier> implements Persistable<ID> {

    // Not persisted; true only for instances created via new
    private @Transient boolean isNew = true;

    @Override
    public boolean isNew() {
        return isNew;
    }

    // Flip the flag once the entity has been saved or loaded
    @PrePersist
    @PostLoad
    void markNotNew() {
        this.isNew = false;
    }

    // More code…
}
isNew is declared transient so that it doesn't get persisted. The type implements Persistable so that the Spring Data JPA implementation of the repository's save(…) method will use its isNew() method. With the code above, entities created from user code using new have the flag set to true, but any kind of database interaction (saving or loading) turns the entity into an existing one, so that save(…) triggers EntityManager.persist(…) initially but ….merge(…) for all subsequent operations.
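The flag-driven decision described above can be sketched in plain Java without any JPA or Spring dependency. Everything in this sketch is a hypothetical stand-in chosen for illustration: NewAware plays the role of Persistable, RecordingRepository only records which operation a real save(…) would pick, and the call to markNotNew() inside save simulates the @PrePersist callback that the persistence provider would normally invoke.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.springframework.data.domain.Persistable.
interface NewAware<ID> {
    ID getId();
    boolean isNew();
}

// Entity with a manually assigned identifier and an is-new flag.
class ImportedEntity implements NewAware<Integer> {
    private final Integer id;
    private boolean isNew = true; // would be @Transient in a real mapping

    ImportedEntity(Integer id) {
        this.id = id;
    }

    @Override
    public Integer getId() {
        return id;
    }

    @Override
    public boolean isNew() {
        return isNew;
    }

    // Would carry @PrePersist and @PostLoad in a real mapping.
    void markNotNew() {
        this.isNew = false;
    }
}

// Hypothetical repository that only records which operation save() chooses.
class RecordingRepository {
    final List<String> operations = new ArrayList<>();

    <T extends ImportedEntity> T save(T entity) {
        if (entity.isNew()) {
            entity.markNotNew(); // simulates the @PrePersist callback
            operations.add("persist " + entity.getId());
        } else {
            operations.add("merge " + entity.getId());
        }
        return entity;
    }
}
```

Saving the same ImportedEntity twice through this repository records a persist followed by a merge, even though the identifier is non-null from the start.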
I took the chance to create DATAJPA-1600 and added a summary of this description to the reference docs.

Related

JPA Query with several different @Id columns

Problem
To make my code cleaner, I want to introduce a generic repository that each repository can extend, reducing the code I need in each of them. The problem is that the IDs differ from class to class. In one (see example below) it is id, in another randomNumber, and in yet another it may even be an @EmbeddedId. I want to have a derived (or non-derived) query in the repository that gets one entity by id.
Preferred solution
I imagine having something like:
public interface IUniversalRepository<T, K> {
    @Query("select t from #{#entityName} t where @id = ?1")
    public T findById(K id);
}
Example code
(this does not work because the attribute id cannot be found on Settings)
public interface IUniversalRepository<T, K> {
    // should return the object with the id, regardless of the column name
    public T findById(K id);
}
// two example classes with different @Id fields
public class TaxRate {
    @Id
    @Column()
    private Integer id;
    ...
}
public class Settings {
    @Id
    @Column() // cannot rename this column because it has to be named exactly as it is for backup reasons
    private String randomNumber;
    ...
}
// the repositories would be used like this
public interface TaxRateRepository extends IUniversalRepository<TaxRate, Integer> {
}
public interface SettingsRepository extends IUniversalRepository<Settings, String> {
}
Happy for suggestions.
Retrieving JPA entities via an "id query" is not as good an idea as you might think. The main problem is that it is much slower, especially when you hit the same entity multiple times within a transaction: if the flush mode is set to AUTO (which is actually the reasonable default), Hibernate needs to perform dirty checking and flush changes to the database before executing a JPQL query. Moreover, Hibernate doesn't guarantee that entities retrieved via an "id query" are not stale: if the entity was already present in the persistence context, Hibernate basically ignores the DB data.
The best way to retrieve entities by id is to call the EntityManager#find(java.lang.Class<T>, java.lang.Object) method, which in turn backs the CrudRepository#findById method. So your findByIdAndType(K id, String type) should actually look like:
default Optional<T> findByIdAndType(K id, String type) {
    return findById(id)
        .filter(e -> Objects.equals(e.getType(), type));
}
However, the desire to place some kind of id placeholder in a JPQL query is not so bad: one of its applications could be preserving order stability in queries with pagination. I would suggest that you file a corresponding CR with the spring-data project.

How do I stop spring data JPA from doing a SELECT before a save()?

We are writing a new app against an existing database. I'm using Spring Data JPA, and simply doing a
MyRepository.save()
on my new entity, using
MyRepository extends CrudRepository<MyThing, String>
I've noticed in the logs that Hibernate is doing a SELECT before the INSERT, and that these selects are taking a long time, even when using the indexes.
I've searched for this here, and the answers I've found are usually related to Hibernate specifically. I'm pretty new to JPA, and it seems like JPA and Hibernate are pretty closely intertwined, at least when using them within the context of Spring Data. The linked answers suggest using Hibernate persist(), or somehow using a session, possibly from an entityManager? I haven't had to do anything with sessions or entityManagers, or any Hibernate API directly. So far I've gotten simple inserts done with save() and a couple of @Query methods in my repositories.
Here is the code of Spring's SimpleJpaRepository, which is what runs when you use a Spring Data repository:
@Transactional
public <S extends T> S save(S entity) {
    if (entityInformation.isNew(entity)) {
        em.persist(entity);
        return entity;
    } else {
        return em.merge(entity);
    }
}
It does the following:
By default Spring Data JPA inspects the identifier property of the given entity. If the identifier property is null, then the entity will be assumed as new, otherwise as not new.
Link to Spring Data documentation
So if one of your entities has a non-null ID field, Spring Data will treat it as not new and call merge, which makes Hibernate do an update (and therefore a SELECT first).
You can override this behavior in the two ways listed in the same documentation. An easy way is to make your entity implement Persistable (instead of Serializable), which requires you to implement the isNew() method.
If you provide your own id value then Spring Data will assume that you need to check the DB for a duplicate key (hence the select+insert).
A better practice is to use an id generator, like this:
@Entity
public class MyThing {
    @Id
    @GeneratedValue(generator = "uuid2")
    @GenericGenerator(name = "uuid2", strategy = "uuid2")
    private UUID id;
}
If you really must insert your own id and want to prevent the select+insert, then implement Persistable, e.g.
@Entity
public class MyThing implements Persistable<UUID> {
    @Id
    private UUID id;

    @Override
    public UUID getId() {
        return id;
    }

    // prevent Spring Data doing a select-before-insert - this particular entity is never updated
    @Override
    public boolean isNew() {
        return true;
    }
}
I created a custom method in the @Repository:
public void persistAll(Iterable<MyThing> toPersist) {
    toPersist.forEach(thing -> entityManager.persist(thing));
}
If you provide your own ID value then Spring Data will assume that you need to check the DB for a duplicate key (hence the select+insert).
One option is to use a separate autogenerated ID column as the primary key, but this seems redundant: if you already have a unique business/natural ID, it is easier to make it the @Id column instead of having a separate ID column.
So how to solve the problem?
The solution is to use @javax.persistence.Version on a new version-number column in all the tables. If you have a parent and a child table, use a @Version column in all the entity classes.
Add a field to the entity class like this:
@javax.persistence.Version
@Column(name = "data_version")
private Long dataVersion;
Add the column in the SQL file:
"data_version" INTEGER DEFAULT 0
With this in place, I see that Spring Data does not do a SELECT before the INSERT.

JPA/Hibernate: how to automatically update fields on entity update

I'm using Camel and JPA to persist entities to a Postgres DB. In each entity I have a field called "history" which contains all the old values of the given entity. I'm looking for a way to populate this field automatically before each update operations.
Surfing the web, I've found the JPA interceptors, but I've seen that they are used for auditing/logging purposes. Am I wrong?
What's the best way to do this?
JPA/Hibernate interceptors (which one depends on the version you're using) are one way to do this. Auditing/logging are similar to what you want to do, i.e. automatically update some column/property when the entity itself is updated (any property). Just note that manual update queries circumvent those interceptors so those should be avoided.
How you use those interceptors depends on how you want to implement that history functionality though. If you're doing it by generating some string/byte representation and storing it in a column it should work. If you're planning to create another entity etc. you might have to collect the changes/old values in the interceptor and upon successful commit you store the collected values. AFAIK it's not possible (at least not easy) to create a new entity when the interceptors have been invoked.
@Entity
@Table(name = "entities")
public class Entity {
    ...
    private Date created;
    private Date updated;

    @PrePersist
    protected void onCreate() {
        created = new Date();
    }

    @PreUpdate
    protected void onUpdate() {
        updated = new Date();
    }
}
You can use @EntityListeners and provide your entity listener class to it; the same listener can be reused wherever you need it.
In your entity listener class, you can provide callback methods annotated with @PrePersist, @PostPersist, @PreUpdate, @PostUpdate, @PreRemove and @PostRemove. These methods are called automatically for their respective actions.
You can read "Spring Data JPA Auditing: Saving CreatedBy, CreatedDate, LastModifiedBy, LastModifiedDate automatically" for more details.
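For the "history" field from the question, one possible approach is to remember the loaded state and append it to the history just before an update. The sketch below is plain Java with no JPA dependency, so it only illustrates the bookkeeping: TrackedEntity, its fields and the ";" separator are all hypothetical, and the methods onLoad and onUpdate are assumed to carry @PostLoad and @PreUpdate in a real mapping, where the persistence provider would call them.

```java
import java.util.Objects;

// Plain-Java sketch; in a real mapping this class would be an @Entity.
class TrackedEntity {
    private String name;
    private String history = ""; // accumulated old values, persisted column
    private String loadedName;   // snapshot of the stored value, would be @Transient

    TrackedEntity(String name) {
        this.name = name;
        onLoad(); // simulates the provider calling the @PostLoad callback
    }

    void setName(String name) {
        this.name = name;
    }

    String getHistory() {
        return history;
    }

    // Would carry @PostLoad: remember the state as it is in the database.
    void onLoad() {
        this.loadedName = name;
    }

    // Would carry @PreUpdate: append the old value before the row is updated.
    void onUpdate() {
        if (!Objects.equals(loadedName, name)) {
            String old = String.valueOf(loadedName);
            history = history.isEmpty() ? old : history + ";" + old;
            loadedName = name;
        }
    }
}
```

Starting from the name "a", two changes to "b" and then "c", each followed by onUpdate(), leave the history as "a;b". As noted above, update queries issued manually would bypass such callbacks.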

JPA equivalent of Hibernate's @Generated(GenerationTime.ALWAYS)

When certain non-key fields of an entity are generated in the database (for instance, by triggers), a call to persist will not bring back the values that the database has just generated. In practice this means that you may need to refresh an entity after persist or merge (and when the second-level cache is enabled you may even need to evict the entity).
Hibernate has a custom annotation, @Generated, which handles generated properties.
// Refresh property 1 on insert and update
@Generated(GenerationTime.ALWAYS)
@Column(insertable = false, updatable = false)
private String property1;
// Refresh property 2 on insert
@Generated(GenerationTime.INSERT)
@Column(insertable = false)
private String property2;
JPA's @GeneratedValue only works with primary key properties.
So my question is whether there is a replacement for @Generated in the JPA API (maybe in 2.1). And if there isn't one, what is the best practice for handling non-key database-generated fields?
I read the spec from beginning to end and there is no such thing, nothing comparable to @Generated, sorry. And as you said:
The GeneratedValue annotation may be applied to a primary key property
or field of an entity or mapped superclass in conjunction with the Id
annotation.
What you could do is use the event listener callbacks @PrePersist and @PreUpdate to set some properties to defaults, or to values generated by utility classes, before the EntityManager persists the object. Try that approach; it is what comes to mind for something similar.

When migrating entities to HRD @Parent keys become null

I am in the process of moving an existing Google AppEngine application from the master-slave datastore (MSD) to the new high-replication datastore (HRD).
The application is written in Java, using Objectify 3.1 for persistence.
In my old (MSD) application, I have an entity like:
public class Session {
    @Id public Long id;
    public Key<Member> member;
    /* other properties and methods */
}
In the new (HRD) application, I have changed this into:
public class Session {
    @Id public Long id;
    // HRD: @Parent is needed to ensure strongly consistent queries.
    @Parent public Key<Member> member;
    /* other properties and methods */
}
I need the Session objects to be strongly consistent with their parent Member object.
When I migrate (a working copy of) my application using Google's HRD migration tool, all Members and Sessions are there. However, all member properties of Session objects become null. Apparently, these properties are not migrated.
I was prepared to re-parent my Session objects, but if the member property is null, that is impossible. Can anyone explain what I am doing wrong, and if this problem can be solved?
@Id and @Parent are not "real" properties in the underlying entity. They are part of the key which defines the entity; Objectify maps them to properties on your POJO.
The transformation you are trying to make is one of the more complicated problems in GAE. Remember that an entity with a different parent (say, some value vs null) is a different entity; it has a different key. For example, loading an entity with a null parent, setting the parent to a value, and saving the entity, does not change the entity -- it creates a new one. You would still need to delete the old entity and update any foreign key references.
Your best bet is to import the data as-is with the regular 'member' field. You can also have the @Parent field (call it anything, for example parentMember; you can rename it at any time since it's not a "real" property). After you migrate, make a pass through your data:
Load each Session
Check for null parentMember. If null:
Assign parentMember and save entity
Delete entity with null parentMember
Be very careful of foreign key references if you do this.
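The migration pass above can be illustrated with a small in-memory model. This is not the Objectify or datastore API: SessionStore, its string keys of the form "parent/id", and the "-" marker for a null parent are all assumptions made for this sketch. It only shows why the pass needs both a save and a delete: the key includes the parent, so saving with a parent creates a second entity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the datastore. The key is (parent, id), so a
// different parent means a different entity.
class SessionStore {
    // key "parent/id" ("-" marks a null parent) -> value of the regular member field
    final Map<String, String> entities = new HashMap<>();

    static String key(String parent, long id) {
        return (parent == null ? "-" : parent) + "/" + id;
    }

    void put(String parent, long id, String member) {
        entities.put(key(parent, id), member);
    }

    // One migration pass: copy each parentless session under its member,
    // then delete the old parentless entity.
    void reparentAll() {
        List<String> rootKeys = new ArrayList<>();
        for (String k : entities.keySet()) {
            if (k.startsWith("-/")) {
                rootKeys.add(k);
            }
        }
        for (String oldKey : rootKeys) {
            String member = entities.get(oldKey);
            long id = Long.parseLong(oldKey.substring(2));
            put(member, id, member);  // saving with a parent creates a new entity
            entities.remove(oldKey);  // the old one must be deleted explicitly
        }
    }
}
```

Sessions that already have a parent are left untouched by the pass. Any references that other entities hold to the old keys would still need to be updated separately.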
