Hibernate/JPA cache lookup values - java

We have JPA entities representing lookup values (states, country codes, etc). Methods that are called frequently to get Lists of these values are cached using the org.springframework.cache.annotation.Cacheable annotation where appropriate.
We also have entities that have relationships with these lookup entities defined like:
@Entity
@Table(name = "Address")
public class AddressEntity {
    // ...
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "STATE_CD", referencedColumnName = "CD")
    @NotNull
    private StateEntity state;
    // ...
}
When we load one of these entities and then call the getter on the associated lookup, Hibernate hits the database again to load that value. We'd like to make it so when we have an address and we do a getState on that address, we hit a local cache for that information. How can we do that with Hibernate/JPA?
// Get address:
Address address = addressRepo.findOne(addressId);
// Get the state - this causes an additional query to hit the database:
State state = address.getState();

The fetch type does not matter here. Hibernate's second-level cache behavior is to cache the ids of to-one association targets rather than the targets themselves.
Why not make StateEntity itself @Cacheable? It seems a very good candidate, as there should be far fewer instances of StateEntity than of AddressEntity.

Related

Hibernate Second level cache doesn't work for OneToOne associations

I am trying to enable Hibernate's 2nd level cache but cannot avoid multiple queries being issued for OneToOne relations.
My models are:
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Business {
    @OneToOne(mappedBy = "business", cascade = {CascadeType.REMOVE}, fetch = FetchType.EAGER)
    private Address address;
}
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Address {
    @ManyToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "business_id", unique = true, nullable = false, foreignKey = @ForeignKey(name = "fk_business_id"))
    private Business business;
}
When I run session.get(Business.class, id) while the Business with that id is in the cache, no query is issued to load the Business, but one is issued for the Address.
I understand that Address is the relation owner and that the Business cache entry holds no Address.id information, but wouldn't it be possible to solve this problem by applying the same mechanism that *ToMany relations use, creating a new cache region for each field? Assuming Business 1 is related to Address 2, there would be the following regions and entries in my cache after a first load:
Business
Business#1 -> [business model]
Business.address
Business.address#1 -> [2]
Address
Address#2 -> [address model]
I have tried to make it work by annotating Address.business with @NaturalId and the Address class with @NaturalIdCache. The cache region is created and populated, but session.get(Business.class, id) does not use it.
My Business model has many more OneToOne relations whose foreign key is on the other side (not on Business), and we must list several at a time, so the database server has to process dozens of queries per HTTP request.
I have read the Hibernate User Guide, Vlad Mihalcea's explanation of the 2LC and its in-memory dehydrated format, Baeldung's explanation, and several other StackOverflow answers, and cannot find a way to solve this.

Detach entity within entity from persistence context in Hibernate

Hibernate persists modified entities at the end of transactional methods; I can avoid this by using session#evict(entity).
If I detach an entity from the persistence context, will the entities within it also be detached?
For instance, I have these classes:
@Entity
public class User extends BaseEntity {
    @Column(name = "email")
    private String email;
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "user")
    private List<Address> addresses;
    // getters and setters
}
@Entity
public class Address extends BaseEntity {
    @Column(name = "email")
    private String email;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "USER_ID")
    private User user;
    // getters and setters
}
If I detach a user object, but change the address object in it, will the address be persisted at the end of transaction? Like this:
User user = userDAO.getById(id);
session.evict(user);
Address address = user.getAddresses().get(0);
address.setNumber(number);
addressDAO.saveOrUpdate(address); //will this work?
Entities that are updated or deleted using a query created with EntityManager.createQuery() are not loaded into the persistence context; this only happens for select queries, and when you use find() or merge().
After you run an update or delete query, your persistence context may actually be out of sync with the database, because the query doesn't update the entities that have already been loaded into the persistence context (you need to call refresh() to see the changes).
If you load a number of users (into the persistence context) and later do Update User set status='active' where id IN (:ids), then you have not modified any of the users in the persistence context; you have only modified the database. To modify a user, you must modify the actual managed entity by calling aUser.setStatus("active"). When the transaction commits, JPA will check all managed entities against a copy created when each was loaded, and if anything has changed it will issue an update.
If you are loading 5000 objects into the persistence context, it may take some time for JPA to run through the entity graph and detect the changes when the transaction commits. If you didn't modify anything and would like to speed up the change detection, there are two ways to do this. The first is to load your entities using a read-only query, which tells JPA that it does not need to keep a copy of the loaded entity. The other is to call EntityManager.clear() to throw away all managed entities. However, if you are interested in performance, the best solution is probably to avoid loading the entities into the persistence context at all. As I understand your problem, you need to do an Update User set ... where id IN (:ids), and for that you only need the users' ids, so you don't need to load the users. You can do List<Long> ids = em.createQuery("select u.id from User u where ...", Long.class).getResultList();
Hope this clarifies things for you :)
EDIT: this is written from a JPA perspective, but in Hibernate the EntityManager just forwards directly to SessionImpl, so the behavior is exactly as described, except that find() is called get() in native Hibernate.
Since JPA 2.0, given an EntityManager, you can call detach with the entity you want detached as the parameter:
void detach(Object entity)
If you use injection, you can inject an EntityManager into the service where you want to detach the required entity.

N + 1 when ID is string (JpaRepository)

I have an entity with string id:
@Table
@Entity
public class Stock {
    @Id
    @Column(nullable = false, length = 64)
    private String index;
    @Column(nullable = false)
    private Integer price;
}
And JpaRepository for it:
public interface StockRepository extends JpaRepository<Stock, String> {
}
When I call stockRepository::findAll, I get an N + 1 problem (logs are simplified):
select s.index, s.price from stock s
select s.index, s.price from stock s where s.index = ?
The last statement is executed about 5K times (the size of the table). Also, when I update prices, I do the following:
stockRepository.save(listOfStocksWithUpdatedPrices);
In the logs I see N inserts.
I haven't seen similar behavior when the id was numeric.
P.S. Changing the id's type to numeric is not the best solution in my case.
UPDATE1:
I forgot to mention that there is also a Trade class that has a many-to-many relation with Stock:
@Table
@Entity
public class Trade {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    @Column
    @Enumerated(EnumType.STRING)
    private TradeType type;
    @Column
    @Enumerated(EnumType.STRING)
    private TradeState state;
    @MapKey(name = "index")
    @ManyToMany(fetch = FetchType.EAGER)
    @JoinTable(name = "trade_stock",
        joinColumns = { @JoinColumn(name = "id", referencedColumnName = "id") },
        inverseJoinColumns = { @JoinColumn(name = "stock_index", referencedColumnName = "index") })
    private Map<String, Stock> stocks = new HashMap<>();
}
UPDATE2:
I added the many-to-many relation on the Stock side:
@ManyToMany(cascade = CascadeType.ALL, mappedBy = "stocks") // lazy by default
Set<Trade> trades = new HashSet<>();
But now it left-joins trades (although they're lazy) and all of each trade's collections (they are lazy too). However, the generated Stock::toString method throws LazyInitializationException.
Related answer: JPA eager fetch does not join
You basically need to set @Fetch(FetchMode.JOIN), because fetch = FetchType.EAGER just specifies that the relationship will be loaded, not how.
What might also help with your problem is the @BatchSize annotation, which specifies how many lazy collections will be loaded when the first one is requested. For example, if you have 100 trades in memory (with stocks not initialized), @BatchSize(size=50) will make sure that only 2 queries are used to load the collections, effectively changing n + 1 to roughly 1 + n/50.
https://docs.jboss.org/hibernate/orm/4.3/javadocs/org/hibernate/annotations/BatchSize.html
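As a rough illustration of the arithmetic above (a sketch only, not Hibernate code): if n uninitialized collections are loaded in batches of at most batchSize, the number of collection-loading queries is the ceiling of n / batchSize, plus the one query that loaded the parent rows. The class and method names are hypothetical, and the real number of statements can differ depending on the Hibernate version and batch-loading strategy.

```java
// Hypothetical helper, not part of Hibernate: it only estimates query counts.
final class BatchFetchMath {
    private BatchFetchMath() {}

    /** Queries needed to initialize n lazy collections in batches of batchSize. */
    static int collectionQueries(int n, int batchSize) {
        if (n < 0 || batchSize < 1) {
            throw new IllegalArgumentException("n must be >= 0 and batchSize >= 1");
        }
        return (n + batchSize - 1) / batchSize; // integer ceiling of n / batchSize
    }

    /** Total queries: one for the parent rows plus the batched collection loads. */
    static int totalQueries(int n, int batchSize) {
        return 1 + collectionQueries(n, batchSize);
    }
}
```

With the numbers from the answer, collectionQueries(100, 50) gives the 2 queries mentioned, and the total including the initial select is 3.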
Regarding inserts, you may want to set the hibernate.jdbc.batch_size property, and set hibernate.order_inserts and hibernate.order_updates to true as well.
https://vladmihalcea.com/how-to-batch-insert-and-update-statements-with-hibernate/
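For reference, the three settings mentioned above could be collected like this. This is a minimal sketch: the property keys are the Hibernate ones named in the answer, while the helper class, its method name, and the batch size you pass in are assumptions to adapt to your setup.

```java
// Hypothetical helper that only assembles the settings; it does not touch Hibernate itself.
final class BatchingSettings {
    private BatchingSettings() {}

    /** Builds the JDBC batching properties for the given batch size. */
    static java.util.Properties jdbcBatching(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        java.util.Properties props = new java.util.Properties();
        // How many statements Hibernate groups into one JDBC batch.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Sort statements so inserts/updates for the same entity type end up in the same batch.
        props.setProperty("hibernate.order_inserts", "true");
        props.setProperty("hibernate.order_updates", "true");
        return props;
    }
}
```

Since Properties is a Map, the result can be handed to Persistence.createEntityManagerFactory(unitName, props) in a plain JPA bootstrap; in other setups the same keys go into the usual configuration file.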
"However, generated Stock::toString method throws LazyInitializationException exception."
Okay, from this I am assuming you have generated toString() (and most likely the equals() and hashCode() methods) using either Lombok or an IDE generator based on all fields of your class.
Do not override equals(), hashCode() and toString() in this way in a JPA environment, as it has the potential to (a) trigger the exception you have seen if toString() accesses a lazily loaded collection outside of a transaction and (b) trigger the loading of extremely large volumes of data when used within a transaction. Write a sensible toString() that does not involve associations, and implement equals() and hashCode() using (a) some business key if one is available or (b) the ID (being aware of possible issues with this approach), or (c) do not override them at all.
So firstly, remove these generated methods and see if that improves things a bit.
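A minimal sketch of option (a), assuming the index string from the question serves as the business key. It is written as a plain Java class with the JPA annotations left out so that it compiles on its own; the class name StockSketch and the placeholder trades field are hypothetical.

```java
// Plain-Java sketch; in the real entity the fields would carry the JPA annotations.
class StockSketch {
    private final String index; // business key (the @Id in the question)
    private Integer price;
    // Stands in for the lazy association; deliberately ignored by the methods below.
    private final java.util.Set<Object> trades = new java.util.HashSet<>();

    StockSketch(String index, Integer price) {
        this.index = index;
        this.price = price;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StockSketch)) return false;
        // Compare the business key only, never associations or mutable state.
        return java.util.Objects.equals(index, ((StockSketch) o).index);
    }

    @Override
    public int hashCode() {
        return java.util.Objects.hashCode(index);
    }

    @Override
    public String toString() {
        // Scalar fields only, so no lazy collection is touched.
        return "Stock{index='" + index + "', price=" + price + "}";
    }
}
```

Because none of the three methods reads trades, calling them outside a transaction cannot trigger lazy loading.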
With regard to the inserts, I notice one thing that is often overlooked in JPA. I don't know what database you use, but you have to be careful with
@GeneratedValue(strategy = GenerationType.AUTO)
For MySQL, I think all JPA implementations map this to an auto_increment column, and once you know how JPA works, this has two implications.
Every insert will consist of two queries: first the insert, and then a select query (LAST_INSERT_ID for MySQL) to get the generated primary key.
It also prevents any batch query optimization, because each row needs to be inserted in its own statement.
If you insert a large number of objects and you want good performance, I would recommend using table-generated sequences, where you let JPA pre-allocate IDs in large chunks; this also allows the SQL driver to do batch Insert into (...) VALUES(...) optimizations.
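To make the pre-allocation idea concrete, here is a small plain-Java simulation. It is an illustration only and not how any JPA provider is implemented: the class name, the round-trip counter, and the in-memory field standing in for the generator table are all assumptions. In JPA itself, the chunk size corresponds to the allocationSize attribute of @TableGenerator or @SequenceGenerator.

```java
// Illustrative simulation: hands out ids from a reserved chunk and counts
// how often a new chunk has to be reserved from the (simulated) database.
final class ChunkedIdAllocator {
    private final int chunkSize;
    private long nextId;        // next id to hand out
    private long chunkEnd;      // exclusive end of the currently reserved chunk
    private long storedNext = 1; // stands in for the value kept in the generator table
    private int roundTrips;     // simulated database round trips

    ChunkedIdAllocator(int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.chunkSize = chunkSize;
    }

    /** Returns the next id, reserving a new chunk only when the current one is used up. */
    long next() {
        if (nextId >= chunkEnd) {
            reserveChunk();
        }
        return nextId++;
    }

    private void reserveChunk() {
        roundTrips++;                 // one simulated read-and-update of the generator row
        nextId = storedNext;
        chunkEnd = storedNext + chunkSize;
        storedNext = chunkEnd;
    }

    int roundTrips() {
        return roundTrips;
    }
}
```

With a chunk size of 50, handing out 100 ids costs two simulated round trips instead of one per row, which is what leaves room for the inserts themselves to be batched.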
Another recommendation (not everyone agrees with me on this one). Personally I never use ManyToMany, I always decompose it into OneToMany and ManyToOne with the join table as a real entity. I like the added control it gives over cascading and fetch, and you avoid some of the ManyToMany traps that exist with bi-directional relations.

JPA many-to-one relation - need to save only Id

I have 2 classes: Driver and Car. The Cars table is updated by a separate process. What I need is a property in Driver that lets me read the full car description but write only the Id pointing to an existing Car. Here is an example:
@Entity(name = "DRIVER")
public class Driver {
    // ... ID and other properties for Driver go here ...
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "CAR_ID")
    private Car car;
    @JsonView({Views.Full.class})
    public Car getCar() {
        return car;
    }
    @JsonView({Views.Short.class})
    public long getCarId() {
        return car.getId();
    }
    public void setCarId(long carId) {
        this.car = new Car(carId);
    }
}
The Car object is just a typical JPA object with no back reference to the Driver.
So what I was trying to achieve by this is:
I can read the full Car description using the detailed JSON view,
or I can read only the Id of the Car in the Short JsonView,
and most importantly, when creating a new Driver I just want to pass the ID of the car in the JSON.
This way I don't need to do unnecessary reads for the Car during persist, but just update the Id.
I'm getting the following error:
object references an unsaved transient instance - save the transient instance before flushing : com.Driver.car -> com.Car
I don't want to update the Car instance in the DB, but rather just reference it from the Driver. Any idea how to achieve what I want?
Thank you.
UPDATE:
Forgot to mention that the ID of the Car that I pass during creation of the Driver is a valid Id of an existing Car in the DB.
You can do this via the getReference call on the EntityManager:
EntityManager em = ...;
Car car = em.getReference(Car.class, carId);
Driver driver = ...;
driver.setCar(car);
em.persist(driver);
This will not execute a SELECT statement against the database.
As an answer to okutane, please see this snippet:
@JoinColumn(name = "car_id", insertable = false, updatable = false)
@ManyToOne(targetEntity = Car.class, fetch = FetchType.EAGER)
private Car car;
@Column(name = "car_id")
private Long carId;
What happens here is that when you want to do an insert or update, you only populate the carId field and perform the insert or update. Since the car field is non-insertable and non-updatable, Hibernate will not complain about this, and since in your database model you would only populate car_id as a foreign key anyway, this is enough at this point (and the foreign key relationship in the database will ensure your data integrity). When you fetch your entity, the car field will be populated by Hibernate, giving you the flexibility that the parent only gets fetched when it needs to be.
You can work only with the car ID like this:
@JoinColumn(name = "car")
@ManyToOne(targetEntity = Car.class, fetch = FetchType.LAZY)
@NotNull(message = "Car not set")
@JsonIgnore
private Car car;
@Column(name = "car", insertable = false, updatable = false)
private Long carId;
That error message means that you have a transient instance in your object graph that is not explicitly persisted. A short recap of the states an object can have in JPA:
Transient: A new object that has not yet been stored in the database (and is thus unknown to the EntityManager). Does not have an id set.
Managed: An object that the EntityManager keeps track of. Managed objects are what you work with within the scope of a transaction, and all changes made to a managed object will automatically be stored once the transaction is committed.
Detached: A previously managed object that is still reachable after the transaction commits. (A managed object outside a transaction.) Has an id set.
What the error message is telling you is that the (managed/detached) Driver object you are working with holds a reference to a Car object that is unknown to Hibernate (it is transient). To make Hibernate understand that any unsaved instances of Car referenced from a Driver that is about to be saved should also be saved, you can call the persist method of the EntityManager.
Alternatively, you can add a cascade on persist (I think; this is off the top of my head and I haven't tested it), which will execute a persist on the Car prior to persisting the Driver.
@ManyToOne(fetch = FetchType.LAZY, cascade = CascadeType.PERSIST)
@JoinColumn(name = "CAR_ID")
private Car car;
If you use the merge method of the EntityManager to store the Driver, you should add CascadeType.MERGE instead, or both:
@ManyToOne(fetch = FetchType.LAZY, cascade = { CascadeType.PERSIST, CascadeType.MERGE })
@JoinColumn(name = "CAR_ID")
private Car car;
public void setCarId(long carId) {
this.car = new Car (carId);
}
This is not actually a saved version of a car. It is a transient object because the persistence context doesn't know it, even though an id was set by hand. JPA demands that you take care of relations. If an entity is new (not managed by the context), it should be saved before it can be related to other managed/detached objects (although the MASTER entity can maintain its children by using cascades).
There are two ways: cascades, or save and retrieval from the db.
Also, you should avoid setting the entity ID by hand. If you do not want to update/persist the car through its MASTER entity, you should get the Car from the database and set that instance on your driver. If you do that, the Car will be detached from the persistence context, BUT it will still have an ID and can be related to any entity without side effects.
Add the optional attribute, set to false, as follows:
@ManyToOne(optional = false) // Telling Hibernate: trust me (as a trusted developer on this project) when building the query that the id provided for this entity exists in the database, so build the insert/update query right away without pre-checks
private Car car;
That way you can set just the car's id:
driver.setCar(new Car(1));
and then persist the driver as normal:
driverRepo.save(driver);
You will see that the car with id 1 is assigned to the driver in the database.
Description: for more on what this tiny optional = false does, this may help: https://stackoverflow.com/a/17987718
Here's the missing article that Adi Sutanto linked.
Item 11: Populating a Child-Side Parent Association Via Proxy
Executing more SQL statements than needed is always a performance penalty. It is important to strive to reduce their number as much as possible, and relying on references is one of the easy-to-use optimizations.
Description: A Hibernate proxy can be useful when a child entity can be persisted with a reference to its parent (a lazy @ManyToOne or @OneToOne association). In such cases, fetching the parent entity from the database (executing the SELECT statement) is a performance penalty and a pointless action. Hibernate can set the underlying foreign key value for an uninitialized proxy.
Key points:
Rely on EntityManager#getReference()
In Spring, use JpaRepository#getOne() (used in this example)
In Hibernate, use load()
Assume two entities, Author and Book, involved in a unidirectional @ManyToOne association (Author is the parent side).
We fetch the author via a proxy (this will not trigger a SELECT), we create a new book, we set the proxy as the author for this book, and we save the book (this will trigger an INSERT in the book table).
Output sample:
The console output will reveal that only an INSERT is triggered, and no SELECT
Source code can be found here.
If you want to see the whole article put https://dzone.com/articles/50-best-performance-practices-for-hibernate-5-amp into the wayback machine. I'm not finding a live version of the article.
PS. I'm currently on a way to handle this well when using Jackson object mapper to deserialize Entities from the frontend. If you're interested in how that plays into all this leave a comment.
Use cascade in the @ManyToOne annotation:
@ManyToOne(cascade = CascadeType.REMOVE)

Adding an entity into a large Many-To-Many relationship in JPA

I have a Group entity that has a list of User entities in a many to many relationship. It is mapped by a typical join table containing the two IDs. This list may be very large, a million or more users in a group.
I need to add a new user to the group, typically that will be something like
group.getUsers().add(user);
user.getGroups().add(group);
em.merge(group);
em.merge(user);
If I understand typical JPA operation, will this require pulling down the entire list of 1 million+ users into the collection in order to add the new user and then save? That doesn't sound very scalable to me.
Should I simply not be defining this relationship in JPA? Should I be manipulating the join table entries directly in a case like this?
Please forgive the loose syntax. I'm actually using Spring Data JPA, so I don't use the entity manager directly very often, but the question seems to be general to JPA, so I wanted to pose it that way.
Design your models like this and play with UserGroup for associations.
@Entity
public class User {
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "user", fetch = FetchType.LAZY)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private Set<UserGroup> userGroups = new HashSet<UserGroup>();
}
@Entity
@Table(name = "user_group",
    uniqueConstraints = {@UniqueConstraint(columnNames = {"user_id", "group_id"})})
public class UserGroup {
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "user_id", nullable = false)
    @ForeignKey(name = "usergroup_user_fkey")
    private User user;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "group_id", nullable = false)
    @ForeignKey(name = "usergroup_group_fkey")
    private Group group;
}
@Entity
public class Group {
    @OneToMany(cascade = CascadeType.ALL, mappedBy = "group", fetch = FetchType.LAZY)
    @OnDelete(action = OnDeleteAction.CASCADE)
    private Set<UserGroup> userGroups = new HashSet<UserGroup>();
}
Then do this:
User user = findUserId(id); // groups won't be loaded; they are marked lazy
Group group = findGroupId(id); // users won't be loaded; they are marked lazy
UserGroup userGroup = new UserGroup();
userGroup.setUser(user);
userGroup.setGroup(group);
em.persist(userGroup); // EntityManager has no save(); with a native Hibernate Session, use session.save(userGroup)
Using the ManyToMany mapping is effectively caching the collection in the entity, so you might not want to do this for large collections, as displaying it or passing the entity around once the collection has been triggered will kill performance.
Instead you might remove the mapping on both sides, and create an entity for the relation table that you can use in queries when you do need to access the relationship. Using an intermediate entity will allow you to use paging and cursors, so that you can limit the data that might be brought back into usable chunks, and you can insert a new entity to represent new relationships with ease.
EclipseLink's attribute change tracking though does allow adding to collections without the need to trigger the relationship, as well as other performance enhancements. This is enabled with weaving and available on collection types that do not maintain order.
The collection classes returned by getUsers() and getGroups() don't have to have their contents resident in memory, and if you have lazy fetching turned on, as I assume you do for such a large relationship, the persistence provider should be smart enough to recognize that you're not trying to read the contents but just adding a value. (Similarly, calling size() on the collection will typically cause a SQL COUNT query rather than actually loading and counting the elements.)
