Get PersistentSet from EntityManager or Hibernate Session - java

Hibernate returns org.hibernate.collection.internal.PersistentSet as the Set implementation on a @OneToMany relation:
@OneToMany(mappedBy = "group", cascade = CascadeType.PERSIST)
private Set<Student> studentSet;
Hibernate tracks all changes on a PersistentSet (if an entity is added to the Set, it will be inserted into the database, and so on). Is it possible to get the same functionality for collections obtained through the JPA EntityManager, org.hibernate.Session, or some other way?
For example:
entityManager.createQuery(query, Student.class)
.setParameter("name", name)
.getResultList();
This doesn't return that kind of collection.
So I am looking for a way to fetch elements with a custom query and collect them into a collection whose changes Hibernate tracks (inserting when new transient entities are added, updating when managed entities change, deleting when entities are removed from the collection).

What you're asking for is not possible in Hibernate.
Hibernate tracks all changes on a PersistentSet (if an entity is added to the Set, it will be inserted into the database, and so on)
That statement is not really accurate. Hibernate will not automatically insert an entity added to the set into a database. You need to opt in for that functionality specifically by declaring the appropriate cascading option (CascadeType.PERSIST in this case).
What Hibernate will do, however, is track associations between entities. If a collection represents the owning side of a to-many association, changes to the collection will establish/destroy associations between entities. In fact, Hibernate will track all other entity state, not just associations. That's the idea behind managed entities: to be able to work with a domain object just like with any other Java object, and let Hibernate take care of persistence behind the scenes.
A collection retrieved from a query does not represent part of a single entity state. Therefore, there would be little sense for Hibernate to track the structural state of the list. Suppose you made two queries for the same data within a single transaction. You then modify one of the result lists and leave the other intact. What do you think should happen in such a scenario?
Note that by 'not possible', I mean that Hibernate does not provide such functionality out of the box. However, if you want to track changes to an arbitrary list, there are list implementations that allow that (see e.g. Glazed Lists or Apache Commons Events). You could combine them with the Hibernate API to get the behavior you want.
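As a rough illustration of that last idea, here is a minimal sketch that uses only the JDK instead of one of the libraries mentioned. TrackingList and its two callbacks are hypothetical names, not part of Hibernate or JPA. In a real application the callbacks would presumably delegate to something like entityManager::persist and entityManager::remove; here plain lists stand in for them so the example runs on its own.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical list wrapper that reports structural changes to callbacks. */
class TrackingList<E> extends AbstractList<E> {
    private final List<E> delegate;
    private final Consumer<? super E> onAdd;
    private final Consumer<? super E> onRemove;

    TrackingList(List<E> initial, Consumer<? super E> onAdd, Consumer<? super E> onRemove) {
        this.delegate = new ArrayList<>(initial); // copy, so the query result stays untouched
        this.onAdd = onAdd;
        this.onRemove = onRemove;
    }

    @Override public E get(int index) { return delegate.get(index); }

    @Override public int size() { return delegate.size(); }

    // AbstractList routes add(E) through add(int, E), so one override covers both.
    @Override public void add(int index, E element) {
        delegate.add(index, element);
        onAdd.accept(element);    // assumption: would be entityManager::persist
    }

    // remove(Object) and iterator removal end up here as well.
    @Override public E remove(int index) {
        E removed = delegate.remove(index);
        onRemove.accept(removed); // assumption: would be entityManager::remove
        return removed;
    }

    public static void main(String[] args) {
        List<String> persisted = new ArrayList<>();
        List<String> deleted = new ArrayList<>();
        // Strings stand in for entities returned by getResultList().
        List<String> students =
                new TrackingList<String>(List.of("Ann", "Bob"), persisted::add, deleted::add);
        students.add("Cid");
        students.remove("Ann");
        System.out.println("persist: " + persisted + ", remove: " + deleted);
    }
}
```

Note that this only covers adds and removes. Updates to managed entities inside the list are already tracked by Hibernate, and the question of what two overlapping result lists should mean remains open.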

Related

Maintaining relationships in JPA 2.0

I've been using JPA 2.0 for a while but, sad to admit, I haven't had enough time to learn it properly. It seems like I lack the basics of how to work with Entity Manager.
Moving one step at a time, I'd like to first ask you about maintaining relationships between mapped entities. Of course I know how to create mappings between entities, different types of available associations (OneToOne, etc.) and how databases work in general. I'm purely focused on maintaining it via Entity Manager, so please do not send me to any kind of general knowledge tutorial :-).
The questions are:
1. Am I right that as a programmer I'm responsible for maintaining (creating/updating/removing) relationships between instances of entities?
2. Do I always have to update (set to null, remove from collection, etc.) instances by hand?
3. Plain SQL can set entities to NULL on deleting, but it seems like JPA can't do such a simple thing. It also seems like a burden to do it manually. Is there a way to achieve that with JPA?
4. Say I have a OneToMany relationship and set the entity on the Many side of the relationship to NULL, then persist the changes in a Set by saving the entity on the One side. Do I then have to update the entities on the Many side and set the association to NULL in each instance? Seems pure silliness for one-directional bindings!
Thanks in advance!
The main thing you need to investigate is the different options you have when mapping an entity. For example, in the following snippet the CascadeType.ALL option instructs JPA to delete the children when the parent is deleted.
@OneToMany(fetch = FetchType.LAZY, cascade = { CascadeType.ALL }, mappedBy = "parent")
private Set<Child> events = new HashSet<Child>();
1. Yes. You maintain the object tree and modify it to look like what you want.
2. Yes and no. If you want the entity to reference null, then yes. For instance, if you are removing one entity, you should clean up any references to it held by other entities that you are not removing. A practical example: it's good practice to let an Employee know his/her Manager has been let go. If the Employee is going to stay, it should either have its manager reference nulled out or set to a different manager before the current manager can be removed. If the employee is going to be removed as well, then cascade remove can cascade to all the Manager's subordinates, in which case you do not need to clean up their references to the manager, as they are going away too.
3. I don't quite understand what SQL is setting to null. Deleting removes the row in the database, so there isn't anything to set to null. Cleaning up a reference shouldn't be that difficult in the object model, as JPA has a number of lifecycle events to help, such as PreRemove and PreUpdate. In the end, though, the problem is with your Java objects. They are just Java objects, so if you want something done, your application will need to do it for the most part. JPA handles building them and pushing them to the database, not changing the state for you.
4. Yes and no. If you set up a bidirectional relationship, you must maintain both sides as mentioned above. If you set the child's parent reference to null, you should let the parent know it no longer has a child, shouldn't you? Your parent will continue to reference its child for as long as that Parent instance exists. So even though the database is updated/controlled through the side that owns a relationship, the object model will be out of sync with the database until it is refreshed or somehow reloaded. JPA allows for multiple levels of caching, so it all depends on your provider setup how long that Parent instance will exist referencing a Child that no longer exists in the database.
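Keeping both sides of a bidirectional relationship in step is usually done with small helper methods on the entities. The following is a minimal sketch in plain Java, with the JPA annotations left out so that it runs without a provider. The method names addChild and removeChild are only a common convention, assumed here for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Stand-in for the inverse (mappedBy) side; would carry @OneToMany in a real mapping. */
class Parent {
    private final Set<Child> children = new HashSet<>();

    /** Read-only view, so callers have to go through the helpers below. */
    Set<Child> getChildren() {
        return Collections.unmodifiableSet(children);
    }

    /** Updates both sides so the object model matches what will be written. */
    void addChild(Child child) {
        Parent previous = child.getParent();
        if (previous != null) {
            previous.children.remove(child); // detach from the old parent first
        }
        children.add(child);
        child.setParent(this);
    }

    void removeChild(Child child) {
        if (children.remove(child)) {
            child.setParent(null); // owning side: this is what changes the foreign key
        }
    }

    public static void main(String[] args) {
        Parent parent = new Parent();
        Child child = new Child();
        parent.addChild(child);
        parent.removeChild(child);
        System.out.println(child.getParent() == null && parent.getChildren().isEmpty());
    }
}

/** Stand-in for the owning side; would carry @ManyToOne in a real mapping. */
class Child {
    private Parent parent;

    Parent getParent() { return parent; }

    void setParent(Parent parent) { this.parent = parent; }
}
```

With helpers like these, application code never touches one side alone, so the Parent instance cannot keep pointing at a Child that has been reassigned or detached.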

How does a one-to-many relationship get persisted in JPA if I already have thousands of related entities in the database and I add new entities to the collection?

We have two entities, Entity1 and Entity2, where Entity1 contains a set of Entity2.
We already have thousands of Entity2 entities stored in the database, all referenced from one instance of Entity1, say myEntity.
Now I add more Entity2 entities to the collection and try to persist myEntity; the newly added Entity2 entities are already persisted.
My question is how persisting myEntity behaves: are the existing members of the relation loaded into memory before the new members are added, or are the new members written to the database without bringing the existing members into memory?
If you have thousands of referenced entities, it might be better not to map the relationship and instead only query for it when needed, allowing you to use paging or other mechanisms to reduce the number of entities read in at a time. It depends on what type of mapping it is, but only the owning side of the relationship needs to be mapped (the one that doesn't have mappedBy) to set the foreign key in the database. Set the Entity2 side to be the owning side if it isn't already.
If this is a M:M with a relation table and it doesn't make sense to map it from the Entity2 side instead, you could add an entity for the relation table that you would read in the same way. The new entity would have a reference to Entity1, but Entity1 wouldn't reference it, and the app would query for the new entity when it needs to get the Entity2s associated with a specific Entity1.
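The paging idea can be sketched as a simple loop. PagedReader and forEachPage are hypothetical names. The fetchPage function stands in for a JPA query executed with setFirstResult(offset) and setMaxResults(limit), and an in-memory list plays the role of the Entity2 table so that the example runs without a database. A real query would also need a stable ORDER BY for the pages to be consistent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class PagedReader {
    /**
     * Reads rows one page at a time and returns the number of rows seen.
     * fetchPage receives (offset, limit); with JPA it would wrap something like
     * query.setFirstResult(offset).setMaxResults(limit).getResultList().
     */
    static <T> int forEachPage(BiFunction<Integer, Integer, List<T>> fetchPage,
                               int pageSize,
                               Consumer<List<T>> handler) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int offset = 0;
        List<T> page;
        do {
            page = fetchPage.apply(offset, pageSize);
            if (!page.isEmpty()) {
                handler.accept(page);
            }
            offset += page.size();
        } while (page.size() == pageSize); // a short page means the end was reached
        return offset;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the Entity2 rows that belong to myEntity.
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            table.add(i);
        }
        int total = PagedReader.<Integer>forEachPage(
                (offset, limit) -> table.subList(offset, Math.min(offset + limit, table.size())),
                10,
                page -> System.out.println("page of " + page.size() + " rows"));
        System.out.println("rows read: " + total);
    }
}
```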
If you want to add new instances to a relation between two already existing entities (a one-to-many in this case), you must first fetch from the database the entity that contains the collection; in your case, myEntity.
So, when you load that entity you are bringing it into memory. If you had defined the relation between those two as EAGER, then all the related entities (the ones in the collection) will be fetched at the same time as the parent. If you had instead defined the relation as LAZY, the related entities will be loaded when you access the collection (in other words, when you invoke the getXXX getter for that collection).
It happens that way because JPA implementations (I'm thinking of Hibernate here) return proxies of the entities instead of actual instances, so they can intercept the getter/setter method calls and perform any tracking on the state of the entities.
Right, so now you want to add more instances to the relation. It doesn't matter whether the relation is EAGER or LAZY in this case, as you will eventually invoke the getter of the collection in order to call add(myNewEntity) on it. So the already existing entities are in the collection and you are just adding a (probably) untracked entity under the collection implementation semantics.
When persisting myEntity back to the database the JPA implementation will know which instances of the actual collection need either an update, a delete or an insert. If you just added new instances then just insert statements will be issued but you could also remove an entity from the collection or change the state (invoke the setter) of an already existent instance. JPA implementations are able to recognise those operations and issue the appropriate SQL statements to keep the database up to date.
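Conceptually, this kind of change detection can be pictured as comparing the collection's current contents with a snapshot taken when it was loaded. The sketch below is only an illustration of that idea in plain Java, not the code any particular provider uses, and CollectionDiff is a hypothetical name.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Conceptual snapshot comparison; real providers are considerably more involved. */
class CollectionDiff<E> {
    final Set<E> toInsert; // in the collection now, absent when it was loaded
    final Set<E> toDelete; // present when loaded, removed since

    CollectionDiff(Collection<E> snapshotAtLoad, Collection<E> currentState) {
        toInsert = new LinkedHashSet<>(currentState);
        toInsert.removeAll(snapshotAtLoad);
        toDelete = new LinkedHashSet<>(snapshotAtLoad);
        toDelete.removeAll(currentState);
    }

    public static void main(String[] args) {
        // Strings stand in for Entity2 instances.
        List<String> snapshot = List.of("e1", "e2", "e3");
        List<String> current = List.of("e2", "e3", "e4");
        CollectionDiff<String> diff = new CollectionDiff<>(snapshot, current);
        System.out.println("insert " + diff.toInsert + ", delete " + diff.toDelete);
    }
}
```

Elements that appear in both the snapshot and the current state produce no collection-level statement at all, which matches the point above that adding new instances results only in inserts.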

JPA mapping for inserting a relationship

I am using JPA to persist my entities. Suppose I have a ManyToMany relationship between entities A and B, so class A has a List<B> and class B has a List<A>.
My question is about the efficiency of adding a new relationship.
The easy way is to call add(new B()) on the list in class A. Will the whole List of B objects in class A be loaded from the database when I call add at runtime? Is this efficient?
If I have 200 relationships, will it load all of them just to add one new B object? Would it be better to create a native query that inserts a new row into the ManyToMany table?
The behavior of add() on a List is dependent on the JPA provider.
For EclipseLink, for a LAZY List relationship add() will not cause the list to be fetched by default.
Usually, JPA providers provide their own implementations of Java Collections which support lazy loading, so not all relationships will be loaded right from the start.
Also, when you modify any of the lists as you described, the JPA provider should transparently update the database. You don't have to worry about keeping things consistent, other than references you might have cached on your own.
When it starts to run into the hundreds, you might be better off mapping the join table as a JPA entity with two ManyToOne mappings, and performing operations (mutations and selections) directly on that entity instead of using a ManyToMany mapping.

How does Envers deal with schema changes?

I am thinking about switching from a self-implemented versioning-solution to Hibernate Envers, but I am not quite sure yet. I have read a lot about it, but I am concerned about schema changes and how Envers deals with them after having historized data according to an older schema.
What is your experience with Envers in this regard? How do you deal with schema changes and existing data with Envers?
Update 1:
It is not just about adding or removing simple columns from a table, but e.g. changing a simple foreign-key relationship into a separate entity with two 1:n relationships (M2M with attributed columns). This is a "logical" change in your data model. How do you deal with that when using Envers, when there is already historized data according to the old model? Is there an alternative to manually writing SQL scripts and transferring the data into the new representation?
In my experience, Envers simply copies every field from your entity table to its audit tables. The copied fields in the audit tables have no constraints on them, including nullability and foreign key constraints, so there's no problem with adding or removing such constraints on the real tables. Any kind of relationships you add to your entities will just be new audit columns and/or tables added under Envers, and it's up to you to correctly interpret them in their historical context.
For your example, if I understand correctly, of switching from a join-column-based relationship to a join-table-based one, you'd simply have the old join column coexisting with the join table, and at the point of the cutover, the former will cease being populated in favor of the latter. Your history will be completely preserved, including the fact that you made this switch. If you want all the old data to fit into the new model in the audit tables, it's up to you to do the migration.
There shouldn't be problems with modifying the existing schema, as Envers relies on your @Entity classes to create the audit tables. So if you add or remove a column from an existing table, as long as this change is reflected in your @Entity / @Audited JavaBean, it should be OK.
The foreign key refactoring should be fine with Envers. As Envers creates a join table even for a one-to-many relationship, it should be straightforward to change it into a many-to-many relationship. I extracted one paragraph from the official documentation:
9.3. @OneToMany+@JoinColumn
When a collection is mapped using these two annotations, Hibernate
doesn't generate a join table. Envers, however, has to do this, so
that when you read the revisions in which the related entity has
changed, you don't get false results.
To be able to name the additional join table, there is a special
annotation: @AuditJoinTable, which has similar semantics to JPA's
@JoinTable.
One special case are relations mapped with @OneToMany+@JoinColumn on
the one side, and @ManyToOne+@JoinColumn(insertable=false,
updatable=false) on the many side. Such relations are in fact
bidirectional, but the owning side is the collection (see also here).
To properly audit such relations with Envers, you can use the
@AuditMappedBy annotation. It enables you to specify the reverse
property (using the mappedBy element). In case of indexed collections,
the index column must also be mapped in the referenced entity (using
@Column(insertable=false, updatable=false)), and specified using
positionMappedBy. This annotation will affect only the way Envers
works. Please note that the annotation is experimental and may change
in the future.

JPA cascade options at runtime

I'm trying to make an application that keeps an object model in sync with a database by observing all changes and then immediately persisting the objects in question. Many of the objects in the model have children in large lists or trees.
When I load an object from the database, I rely on a one-way cascading relationship to also retrieve all of its children and include them in the application.
However, it is possible to alter a field in the parent object which requires persistence and I can determine that none of the children are affected. So I would like to persist the parent, without hitting the database with all the cascaded child persists.
e.g.
@Entity
public class Parent {
    @OneToMany(cascade = CascadeType.ALL)
    public List children;
}
How can I override the cascade option when I persist a Parent object? Or should I just set it to REFRESH and make sure I never need a cascading persist?
Reading the objects from the database and persisting them rely on two different annotation attributes.
When you load an object, it will also get the other end of any eager (FetchType.EAGER) relationships, as defined by the fetch property on the relationship.
Depending on your JPA provider, you may have options to override this behaviour. EclipseLink, via the incredibly useful QueryHints.BATCH, certainly does.
When you persist, delete or refresh, the cascade type is what's relevant.
So, lose the cascade, keep the fetch and problem solved.
Personally I think cascade all is asking for trouble but opinions will vary.
A decent JPA provider will have a pretty sophisticated (configurable) caching scheme already. Perhaps you should be asking why you're reinventing that particular wheel?
Is it an issue of asynchronous updates purely for performance? Or is something else the reason?
