How do I properly cache common objects / entities when using JPA?


I have a web app that needs to load some rows from tables in my database as objects that will be re-used in various places within the application.
For instance, I might retrieve a list of locations that change infrequently, so I want to store those in memory for quicker access later on.
Next, in a future request, I want to associate one of these entities to another entity in the database; however, JPA throws errors when attempting to save, because the cached entity is detached.
How do I reattach the cached entity so I can include it in the current transaction?
I'm using the Apache OpenJPA implementation.

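Calling EntityManager.merge() copies the state of a detached object onto a managed instance and returns that managed instance. Use the returned instance in the current transaction; the cached object you passed in stays detached.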
Have a look at this informative page

Related

Caching detached Hibernate entities in an external cache

I have a set of very heavy queries whose result I want to cache into an external Cache implementation (cache the whole object list not just ids like in Hibernate's 2nd level cache).
The issue is that, due to the lazy loading of several collections in the root object, once the session that queried the results is done, the objects become detached, and the next request that tries to use them might throw a LazyInitializationException.
Environment: Spring 4, Hibernate 4.3, Ehcache.
Is there any way to re-attach the object to a new session without having it modify the underlying DB (as merge and update do)?

There is no way to reattach a detached entity to a session just to load a lazy-initialized collection.
In order to get an updated copy of a persistent object without overwriting the session / calling merge, it's necessary to call either EntityManager.find() or do a query.
This is because the main goal of the session is to keep the database and the objects in memory in sync. Due to this there is no API for attaching new state without persisting it, as this is not in line with the main functionality of the session.
The 2nd level cache, if configured together with the query cache, can cache the entities, queries and their associations much better than any custom solution.
Everything can be cached to the point that no query hits the database. The two caches really go together; check this blog post for further info.
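
As a rough sketch of what "configured together" can look like, the settings below switch on both caches. The property keys are Hibernate's own; the region factory class assumes the hibernate-ehcache module for Hibernate 4.x is on the classpath, and the class name CacheSettingsSketch is purely illustrative. Entities still have to be marked cacheable (for example with Hibernate's @Cache annotation) and each query opted in with setCacheable(true).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Collects the Hibernate settings that switch on the second-level and query caches. */
final class CacheSettingsSketch {

    /** Builds the property map; hand it to whatever bootstraps the SessionFactory. */
    static Map<String, String> cacheProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        // Enable the second-level (shared, cross-session) entity cache.
        props.put("hibernate.cache.use_second_level_cache", "true");
        // Enable the query cache, which remembers the results of cacheable queries.
        props.put("hibernate.cache.use_query_cache", "true");
        // Assumption: Ehcache is the provider (hibernate-ehcache, Hibernate 4.x).
        props.put("hibernate.cache.region.factory_class",
                  "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        return props;
    }

    public static void main(String[] args) {
        cacheProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

With Spring, a map like this would typically be passed as the Hibernate properties of the session factory bean.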

After merging a detached object into the current Hibernate session, will changes to the object be tracked?

In a container-managed transaction I get a detached object and merge it so that it is brought into the managed state. My first question: to get the object into the session, is it cheaper (in cost of operation and time) to cache the POJOs and merge them, or to fetch the data from the DB? Second: if I merge at the start to get the object into the session context and then modify the merged object, will Hibernate take care of generating all the required SQL statements and executing them at the end?
Please comment on which approach takes less time: merging the cached detached object or fetching the data from the DB.

When you call detach and then merge, merge returns the attached entity in the context. It's a common mistake to keep using the entity that was passed to merge, hoping that it is now managed, but this is not the case. You have to use the entity returned from merge, which is managed by Hibernate; subsequent changes to it will be flushed automatically at transaction end.
It doesn't matter much when you load your entity, because Hibernate will fire a select anyway if the entity is not already loaded in the context. Also, even if you keep making changes to your managed entity, Hibernate will fire an update only when you exit your transaction or call flush() explicitly.
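
The point about using the returned instance can be illustrated with a toy model. This is a minimal sketch in plain Java, not the JPA API: ToyContext, Location and MergeSketch are hypothetical names, and the blank instance created inside merge stands in for the row that a real provider would load from the database.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy entity with an assigned identifier. */
final class Location {
    final long id;
    String name;

    Location(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Toy stand-in for a persistence context; NOT the JPA EntityManager. */
final class ToyContext {
    private final Map<Long, Location> managed = new HashMap<>();

    /** Copies the detached state onto the managed instance and returns that instance. */
    Location merge(Location detached) {
        // A real provider would load the row here; the toy creates a blank instance.
        Location target = managed.computeIfAbsent(detached.id, id -> new Location(id, null));
        target.name = detached.name;
        return target;
    }

    /** True only for the exact instance this context tracks. */
    boolean contains(Location candidate) {
        return managed.get(candidate.id) == candidate;
    }
}

final class MergeSketch {
    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();
        Location cached = new Location(1L, "Berlin"); // e.g. taken from an application cache
        Location managed = ctx.merge(cached);
        System.out.println(ctx.contains(managed)); // true
        System.out.println(ctx.contains(cached));  // false
    }
}
```

Changes made to `cached` after the merge would go unnoticed in this model, which mirrors the mistake described above.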

Copy the state of the given object onto the persistent object with the same identifier. If there is no persistent instance currently associated with the session, it will be loaded. Return the persistent instance. If the given instance is unsaved, save a copy of and return it as a newly persistent instance. The given instance does not become associated with the session. This operation cascades to associated instances if the association is mapped with cascade="merge".
According to the API, merge saves a copy and then returns a new instance. In my experience it's always better to merge at the end, after you have performed all the updates on the object in its detached state, because merge is then called only once, when the object state is ready to be persisted.
This will also perform better, because the object is moved into the persistence context only at the end, so Hibernate does not come into the picture until then.

How long are Entities persisted?

I am trying to understand how JPA works. From what I know, if you persist an Entity, that object will remain in the memory until the application is closed. This means, that when I look for a previously persisted entity, there will be no query made on the database. Assuming that no insert, update or delete is made, if the application runs long enough, all the information in it might become persistent. Does this mean that at some point, I will no longer need the database?
Edit
My problem is not with the database. I am sure that the database can not be modified from outside the application. I am managing transactions by myself, so the data gets stored in the database as soon as I commit. My question is: What happens with the entities after I commit? Are they kept in the memory and act like a cache? If so, how long are they kept there? After I commit a persist, I make a select query. This select should return the object I persisted before. Will that object be brought from memory, or will the application query the database?

Not really. Think about it.
Your application probably isn't the only thing that will use the database. If an entity was persisted once and stored in memory, how can you be sure that, let's say, one hour later, it won't be changed by some other means? If that happens, you will have stale data that can harm logic of your application.
Storing data in memory and hoping that everything will be alright won't bring any benefits. That's why data stored in database is your primary source of information, and you should query it every time, unless you are absolutely sure that a subset of data won't change.

When you persist an entity, it is added to the persistence context, which acts like a first-level cache (this is in memory). When the actual persisting happens depends on whether you use container-managed transactions or deal with transactions yourself. The entity instance lives in memory as long as the transaction is not committed; when it is committed, the entity is persisted to the database, XML, etc.

JPA can't work with only the persistence context (L1 cache) or the explicit cache (L2 cache). It always needs to be combined with a datasource, and this datasource typically points to a database that persists to stable storage.
So, the entity is in memory only as long as the transaction (which is required for JPA persist operations) isn't committed. After that it's sent to the datasource.
If the persistence context is transaction-scoped (the 'normal' case), then the L1 cache (the persistence context) is closed at commit and the entities no longer exist there. If the L1 cache somehow bothers you, you can manage it a bit explicitly. There are operations to clear it, and you could separate your read operations (which don't need transactions) from write operations. If there's no transaction active when reading, there's no persistence context, so an entity never becomes attached and is thus never put into this L1 cache.
The L2 cache, however, is not cleared when the transaction commits, and entities inside it remain available for the entire application. This L2 cache must be explicitly configured, and you as an application developer must indicate which entities should be cached in it. Via vendor-specific mechanisms (e.g. JBoss Cache, Infinispan) you can put a maximum on the number of entities being cached and define so-called eviction policies.
Of course, nothing prevents you from letting the datasource point to an in-memory embedded DB, but this is outside the knowledge of JPA.
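
The difference in lifetime between the two caches can be modelled roughly in plain Java. This is a sketch under stated assumptions, not how any JPA provider is implemented: ToyTransaction and ToySharedCache are hypothetical names, the "entities" are plain strings, and least-recently-used eviction is just one possible eviction policy.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Bounded, application-wide cache with least-recently-used eviction (toy L2 cache). */
final class ToySharedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    ToySharedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true keeps entries in LRU order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict once the configured maximum is exceeded
    }
}

/** Toy transaction-scoped context (L1): its contents vanish on commit. */
final class ToyTransaction {
    private final Map<Long, String> firstLevel = new HashMap<>();
    private final Map<Long, String> sharedCache;

    ToyTransaction(Map<Long, String> sharedCache) {
        this.sharedCache = sharedCache;
    }

    void persist(long id, String state) {
        firstLevel.put(id, state);
    }

    boolean isManaged(long id) {
        return firstLevel.containsKey(id);
    }

    /** Publishes the state to the shared cache, then closes the context. */
    void commit() {
        sharedCache.putAll(firstLevel);
        firstLevel.clear();
    }
}

final class CacheScopeSketch {
    public static void main(String[] args) {
        ToySharedCache<Long, String> shared = new ToySharedCache<>(2);
        ToyTransaction tx = new ToyTransaction(shared);
        tx.persist(1L, "Berlin");
        tx.commit();
        System.out.println(tx.isManaged(1L));       // false
        System.out.println(shared.containsKey(1L)); // true
    }
}
```

After the commit the entry is gone from the transaction-scoped map but still available application-wide, until the size limit pushes it out.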

Persistence means in short terms: you can shut down your app, and the data is not lost.
To achieve that you need a database or some sort of saving data in a way that it's not lost when you shut down the app.

To "persist" an entity means to actually save it in the data base. Sure, JPA maintains some entity information in memory in the persistence context (and this is highly dependent on configuration and programming practices), but at certain point information will be stored in the data base - for instance, when a transaction commits, or likely (but not necessarily) after flush() or merge() operations.

If you want to keep your entities after committing and for a select query, you need to use the query cache. Just Google around on that term and it should be clear to you.

JPA merge vs. persist [duplicate]

This question already has answers here:
JPA EntityManager: Why use persist() over merge()?
So far, my preference has been to always use EntityManager's merge() to take care of both insert and update. But I have also noticed that merge performs an additional select query before the update/insert to check whether the record already exists in the database.
I am now working on a project requiring extensive (bulk) inserts into the database. From a performance point of view, does it make sense to use persist instead of merge in a scenario where I absolutely know that I am always creating a new instance of the objects to be persisted?

It's not a good idea to use merge when persist suffices: merge does quite a lot more work. The topic has been discussed on StackOverflow before, and this article explains in detail the differences, with some nice flow diagrams to make things clear.

I would definitely go with persist() if, as you said:
(...) I absolutely know that I am always creating a new instance of objects to be persisted (...)
That's what this method is all about: it will protect you in cases where the entity already exists (and will roll back your transaction).
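
To make the cost difference concrete, here is a toy model in plain Java that only counts the statements each approach would need for rows with assigned identifiers. It is a sketch, not the JPA API: ToyTable and BulkInsertSketch are hypothetical names, and the IllegalStateException merely stands in for the failure a real provider reports when the row already exists.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy table that logs the SQL-like statements each operation would need. */
final class ToyTable {
    private final Map<Long, String> rows = new HashMap<>();
    final List<String> log = new ArrayList<>();

    /** persist: assumes the row is new, so a single INSERT; fails on a duplicate key. */
    void persist(long id, String value) {
        log.add("INSERT");
        if (rows.putIfAbsent(id, value) != null) {
            throw new IllegalStateException("row " + id + " already exists");
        }
    }

    /** merge: first checks whether the row exists, then issues an INSERT or an UPDATE. */
    void merge(long id, String value) {
        log.add("SELECT");
        log.add(rows.containsKey(id) ? "UPDATE" : "INSERT");
        rows.put(id, value);
    }
}

final class BulkInsertSketch {
    public static void main(String[] args) {
        ToyTable viaPersist = new ToyTable();
        ToyTable viaMerge = new ToyTable();
        for (long id = 1; id <= 3; id++) {
            viaPersist.persist(id, "row " + id);
            viaMerge.merge(id, "row " + id);
        }
        System.out.println("persist: " + viaPersist.log);
        System.out.println("merge:   " + viaMerge.log);
    }
}
```

In this model merge needs twice as many statements as persist for rows that are known to be new, which is the overhead the question describes.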

If you're using the assigned generator, using merge instead of persist can cause a redundant SQL statement, therefore affecting performance.
Calling merge for managed entities is also a mistake, since managed entities are automatically managed by Hibernate and their state is synchronized with the database record by the dirty checking mechanism upon flushing the Persistence Context.
To understand how all this works, you should first know that Hibernate shifts the developer mindset from SQL statements to entity state transitions.
Once an entity is actively managed by Hibernate, all changes are going to be automatically propagated to the database.
Hibernate monitors currently attached entities. But for an entity to become managed, it must be in the right entity state.
First, we must define all entity states:
New (Transient)
A newly created object that hasn’t ever been associated with a Hibernate Session (a.k.a Persistence Context) and is not mapped to any database table row is considered to be in the New (Transient) state.
To become persisted we need to either explicitly call the EntityManager#persist method or make use of the transitive persistence mechanism.
Persistent (Managed)
A persistent entity has been associated with a database table row and it’s being managed by the current running Persistence Context. Any change made to such entity is going to be detected and propagated to the database (during the Session flush-time).
With Hibernate, we no longer have to execute INSERT/UPDATE/DELETE statements. Hibernate employs a transactional write-behind working style and changes are synchronized at the very last responsible moment, during the current Session flush-time.
Detached
Once the current running Persistence Context is closed all the previously managed entities become detached. Successive changes will no longer be tracked and no automatic database synchronization is going to happen.
To associate a detached entity to an active Hibernate Session, you can choose one of the following options:
Reattaching
Hibernate (but not JPA 2.1) supports reattaching through the Session#update method.
A Hibernate Session can only associate one Entity object for a given database row. This is because the Persistence Context acts as an in-memory cache (first level cache) and only one value (entity) is associated with a given key (entity type and database identifier).
An entity can be reattached only if there is no other JVM object (matching the same database row) already associated to the current Hibernate Session.
Merging
The merge is going to copy the detached entity state (source) to a managed entity instance (destination). If the merging entity has no equivalent in the current Session, one will be fetched from the database.
The detached object instance will continue to remain detached even after the merge operation.
Removed
Although JPA allows only managed entities to be removed, Hibernate can also delete detached entities (but only through a Session#delete method call).
A removed entity is only scheduled for deletion and the actual database DELETE statement will be executed during Session flush-time.
[Figure: JPA entity state transitions]
[Figure: entity state transitions with the Hibernate-specific API]
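
The states and transitions listed above can also be written down as a small lookup table. This is a simplified sketch in plain Java that covers only the transitions named in the text; the enum and class names are hypothetical, REATTACH stands for Hibernate's Session#update, and removal follows the JPA rule (managed entities only), so Hibernate's ability to delete detached entities is deliberately left out.

```java
import java.util.EnumMap;
import java.util.Map;

/** Entity states named in the text above. */
enum EntityState { NEW, MANAGED, DETACHED, REMOVED }

/** Operations from the text; REATTACH stands for Hibernate's Session#update. */
enum EntityOp { PERSIST, CLOSE_CONTEXT, REATTACH, MERGE, REMOVE }

final class EntityLifecycleSketch {
    private static final Map<EntityState, Map<EntityOp, EntityState>> TRANSITIONS =
            new EnumMap<>(EntityState.class);

    private static void allow(EntityState from, EntityOp op, EntityState to) {
        TRANSITIONS.computeIfAbsent(from, s -> new EnumMap<>(EntityOp.class)).put(op, to);
    }

    static {
        allow(EntityState.NEW, EntityOp.PERSIST, EntityState.MANAGED);
        allow(EntityState.MANAGED, EntityOp.CLOSE_CONTEXT, EntityState.DETACHED);
        allow(EntityState.MANAGED, EntityOp.REMOVE, EntityState.REMOVED);
        allow(EntityState.DETACHED, EntityOp.REATTACH, EntityState.MANAGED);
        // merge copies state onto a managed instance; the instance passed in stays detached.
        allow(EntityState.DETACHED, EntityOp.MERGE, EntityState.DETACHED);
    }

    /** Returns the state of the instance you hold after the operation, or throws. */
    static EntityState apply(EntityState from, EntityOp op) {
        Map<EntityOp, EntityState> row = TRANSITIONS.get(from);
        EntityState to = (row == null) ? null : row.get(op);
        if (to == null) {
            throw new IllegalStateException(op + " is not modelled for " + from);
        }
        return to;
    }

    public static void main(String[] args) {
        EntityState state = apply(EntityState.NEW, EntityOp.PERSIST);
        state = apply(state, EntityOp.CLOSE_CONTEXT);
        System.out.println(state); // DETACHED
    }
}
```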

Hibernate overriding database modifications with detached object state

I'm gonna go with this design:
create an object and keep it alive during all web-app session.
And I need to synchronize its state with database state.
What I want to achieve is this: if, between my own DB operations (that is, the modifications that I persist to the DB), someone intentionally spoils table rows, then on the next save to the database all those changes WOULD BE OVERWRITTEN with the object state, which always contains valid data.
What Hibernate methods do you recommend me to use to persist the modifications in a database?
saveOrUpdate() is a possible solution, but maybe there's something better?
Again, I repeat how it looks. First I create an object without collections. Persist it (save()).
Then user provides us with additional data. In a serviceLayer, again, we modify our object in memory (say, populate it with collections) and then, persist it again.
So every serviceLayer operation of the next step must simply guarantee that the database contains an exact persistent copy of the object that we have in memory. If the data in the database differs, it MUST BE OVERRIDDEN with the state of the object kept in memory.
What Session operations do you recommend?

FWIW saveOrUpdate() looks like the best option overall:
The saveOrUpdate() method is in practice more useful than update(), save(), or lock(): In complex conversations, you don’t know if the item is in detached state or if it’s new and transient and must be saved. The automatic state-detection provided by saveOrUpdate() becomes even more useful when you not only work with single instances, but also want to reattach or persist a network of connected objects and apply cascading options.
However for your case, if you are sure the entity was modified in detached state, and/or don't mind occasionally hitting the DB with an unnecessary UPDATE, maybe update() is the safest choice:
The update() operation on the Session reattaches the detached object to the persistence context and schedules an SQL UPDATE. Hibernate must assume that the client modified the object while it was detached. [...] The persistence context is flushed automatically when the second transaction in the conversation commits, and any modifications to the once detached and now persistent object are synchronized with the database.
Quotes from Java Persistence with Hibernate, chapter 11.2.2.
