I have a slow memory leak in my Java application. I was wondering if this could be caused by not always closing the EntityManager after use. However, using MyEclipse to generate DB code, I'm getting methods like this:
public Meit update(Meit entity) {
    logger.info("updating Meit instance");
    try {
        Meit result = getEntityManager().merge(entity);
        logger.info("update successful");
        return result;
    } catch (RuntimeException re) {
        logger.error("update failed");
        throw re;
    }
}
This never closes the EntityManager. Considering this is generated code, I'm wondering who's right, me or the IDE.
As @Ruggs said, if you are managing the EntityManager lifecycle yourself (as opposed to having CMP, Container Managed Persistence, done by a Java EE container), then you need to close the EntityManager yourself, or at least call EntityManager.clear() to detach entities.
EntityManagers are lightweight objects, so there is no need to keep just one around: you can create one for each transaction and close it after the transaction is committed.
All the entities you load/persist through an EntityManager stay in memory until you explicitly detach them (via EntityManager.detach(), EntityManager.clear() or EntityManager.close()). So it's better to have short-lived EntityManagers. If you persist 1,000,000 entities via the same EntityManager without detaching them afterwards, you will get an OOME (it doesn't matter if you persist each entity in its own EntityTransaction).
It's all explained in this post http://javanotepad.blogspot.com/2007/06/how-to-close-jpa-entitymanger-in-web.html.
As an example (taken from that post), if you want to avoid "memory leaks" you should do something like this (if you are not using CMP):
EntityManager em = emf.createEntityManager();
try {
    EntityTransaction t = em.getTransaction();
    try {
        t.begin();
        // business logic to update the customer
        em.merge(cust);
        t.commit();
    } finally {
        if (t.isActive()) t.rollback();
    }
} finally {
    em.close();
}
Entity managers should generally have the same lifecycle as the application and not be created or destroyed on a per-request basis.
Your "memory leak" may be nothing more than the caching JPA is doing. You don't say which JPA provider you use but I know from experience that EclipseLink by default does extensive caching (which is part of the alleged benefits of JPA and ORM in general).
How do you know you have a memory leak? First check whether it's really a leak; if so, get the Eclipse Memory Analyzer and analyze a heap dump. The blog posts here might also be useful.
It sounds like you are using an application-managed EntityManager. You will need to close the EntityManager yourself; it's part of the spec. You will also need to close the EntityManagerFactory when you shut down your webapp.
I'd recommend using something like OpenEJB or the Spring Framework to manage the EntityManager/EntityManagerFactory for you.
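If you stay with plain application-managed JPA in a webapp, one common way to close the EntityManagerFactory at shutdown is a ServletContextListener. This is only a sketch: the listener class name and the persistence unit name "myPU" are made up for illustration, and it assumes a Servlet 3.0+ container.

```java
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

@WebListener
public class PersistenceBootstrapListener implements ServletContextListener {

    // One factory for the whole application; EntityManagers created
    // from it should still be closed per unit of work.
    private EntityManagerFactory emf;

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        emf = Persistence.createEntityManagerFactory("myPU");
        sce.getServletContext().setAttribute("emf", emf);
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        if (emf != null && emf.isOpen()) {
            emf.close(); // release pooled connections and caches
        }
    }
}
```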
I need to receive and save a huge amount of data using Spring Data over Hibernate. Our server doesn't have enough RAM allocated to hold all the entities in memory at the same time; we would definitely get an OutOfMemoryError.
So obviously we need to save the data in batches. We also need to use @Transactional to be sure that either all the data is persisted or none of it is, in case of even a single error.
So, the question: during a @Transactional method, does Spring Data keep storing entities in RAM, or do entities that were flushed become accessible to the garbage collector?
And what is the best approach to process a huge amount of data with Spring Data? Maybe Spring Data isn't the right tool for problems like that.
"Does Spring Data during a @Transactional method keep storing entities in RAM, or are entities which were flushed accessible to the garbage collector?"
The entities keep accumulating in RAM (i.e. in the EntityManager) until the transaction commits or rolls back, or the EntityManager is cleared. That means the entities only become eligible for GC once the transaction has committed/rolled back or entityManager.clear() has been called.
"So, what is the best approach to process a huge amount of data with Spring Data?"
The general strategy to prevent OOM is to load and process the data batch by batch. At the end of each batch, you should flush and clear the EntityManager so that it can release its managed entities for GC. The general code flow should be something like this:
@Component
public class BatchProcessor {

    // Spring will ensure this entityManager is the same as the one that
    // started the transaction, due to @Transactional
    @PersistenceContext
    private EntityManager em;

    @Autowired
    private FooRepository fooRepository;

    @Transactional
    public void startProcess() {
        processBatch(1, 100);
        processBatch(101, 200);
        processBatch(201, 300);
        // and so on
    }

    private void processBatch(int fromFooId, int toFooId) {
        List<Foo> foos = fooRepository.findFooIdBetween(fromFooId, toFooId);
        for (Foo foo : foos) {
            // process a foo
        }
        /*
         * The reason to flush is to send the pending update SQL to the DB.
         * Otherwise, the updates would be lost when we clear the entity
         * manager afterwards.
         */
        em.flush();
        em.clear();
    }
}
Note that this practice is only for preventing OOM, not for achieving high performance. So if performance is not your main concern, you can safely use this strategy.
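If throughput also matters and the JPA provider happens to be Hibernate, JDBC statement batching can be enabled alongside the flush/clear pattern. These are standard Hibernate settings, though the batch size of 50 is only an illustrative value:

```properties
# Spring Boot style; for plain JPA, put the hibernate.* keys in persistence.xml
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.order_updates=true
```

With batching enabled, the flush() at the end of each batch sends the inserts/updates to the database in grouped JDBC statements instead of one round trip per entity.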
I have the following code:
public Category findCategoryById(Long id) {
    EntityManager em = emf.createEntityManager();
    try {
        em.getTransaction().begin();
        Category category = categoryDAO.findCategoryById(em, id);
        em.getTransaction().commit();
        return category;
    } catch (Exception e) {
        throw e;
    } finally {
        em.close();
    }
}
I'm handling the exceptions in my controller, but I want to make sure the entity manager is closed. I don't like that I am catching and re-throwing the exception; I'm hoping to find better suggestions.
Thanks
The best way is to not have to care about it. If your EntityManager is container-managed (for example, if you are using EJB or Spring and you haven't forced a specific bean/application-managed behaviour), you should let the container handle the opening/closing of the transaction and, in general, worry about your persistence context for you. It's easier, safer and, with the exception of very specific cases, better. You should only close the EntityManager manually in the case of an application-managed context, to avoid connection pool exhaustion or other problems.
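If you do keep the application-managed approach from the question, note that a catch block that only rethrows is redundant: the finally block runs on both the normal and the exceptional path. A sketch of the trimmed method (reusing the emf, categoryDAO and Category from the question) could look like this:

```java
public Category findCategoryById(Long id) {
    EntityManager em = emf.createEntityManager();
    try {
        em.getTransaction().begin();
        Category category = categoryDAO.findCategoryById(em, id);
        em.getTransaction().commit();
        return category;
    } finally {
        em.close(); // runs whether or not an exception propagates
    }
}
```

As a side note, newer Jakarta Persistence versions (3.1+) make EntityManager AutoCloseable, so a try-with-resources block is another option if your provider targets that level of the spec.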
I'm using EJB3 and JPA2 in a project containing several modules.
Lately I have noticed that the DB records won't roll back on exception. After doing some research, I found that the entity manager commits the transaction immediately on flush, even before the method ends, so it can't roll back on exception.
I inject the entity manager using:
@PersistenceContext
private EntityManager entityManager;
To create a new record, persist and flush are called in the same class:
entityManager.persist(entity);
entityManager.flush();
Even if I call throw new RuntimeException("") right after the flush, it won't roll back. When debugging, right after flush is invoked I can already select the DB record with a database tool, before the method ends.
I already checked the persistence.xml and found nothing unusual. I don't use any other specific configuration.
I'm out of ideas what might cause this behavior. I appreciate any clue.
You need to specify transaction boundaries, otherwise a new transaction will be opened and committed after each data manipulation query (INSERT, UPDATE, DELETE). As em.flush() triggers such an SQL query to be executed, it will open an implicit transaction and commit it if the SQL succeeds, or roll it back in case of error.
In order to set transaction boundaries and make a RuntimeException trigger a rollback, the best option is to call the entityManager methods from an EJB. You must use a JTA datasource, not RESOURCE_LOCAL. If you don't use a JTA datasource, you need to manage transactions yourself, i.e. by using the entityManager.getTransaction() object.
Outside of an EJB, or with a non-JTA datasource, you do not have any transaction open unless you start one yourself by calling entityManager.getTransaction().begin(). However, in this case your transaction will not be rolled back automatically when an exception is thrown; instead, you must roll back in the catch block. This mostly applies outside of a Java EE container, in a Java SE application. In Java EE, I strongly suggest using a JTA datasource. Example:
public class NotAnEJB {

    public void persistEntity(EntityManager em, MyEntity entity) {
        EntityTransaction tx = em.getTransaction();
        tx.begin();
        try {
            em.persist(entity);
            em.flush();
            if (shouldFail()) {
                throw new RuntimeException();
            }
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
        }
    }
}
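For contrast, inside an EJB backed by a JTA datasource no manual transaction calls are needed. A minimal sketch (the bean name MyEntityService is made up for illustration; MyEntity as above) might be:

```java
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class MyEntityService {

    @PersistenceContext
    private EntityManager em;

    // The container starts a JTA transaction around this method; any
    // unchecked exception thrown here marks it for rollback, undoing
    // the INSERT issued by flush().
    public void persistEntity(MyEntity entity) {
        em.persist(entity);
        em.flush();
    }
}
```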
I'm just trying to get to know JSF and JPA, but whenever I try to persist an object it doesn't seem to get written to the database.
Here's the code I'm using:
@Named
@ManagedBean
@SessionScoped
public class BestemmingController implements Serializable {

    @PersistenceUnit(unitName = "RealDolmenTravelShopPU")
    @PersistenceContext(unitName = "RealDolmenTravelShopPU")
    EntityManagerFactory emf = null;

    public void submit() {
        try {
            emf = Persistence.createEntityManagerFactory("RealDolmenTravelShopPU");
            EntityManager em = emf.createEntityManager();
            //EntityTransaction et = em.getTransaction();
            //et.begin();
            Bestemming nieuweBestemming = new Bestemming();
            Land gezochtLand = em.find(Land.class, selectedLand);
            nieuweBestemming.setLand(gezochtLand);
            nieuweBestemming.setNaam(bestemmingNaam);
            em.persist(nieuweBestemming);
            //et.commit();
            //em.flush();
            em.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            emf.close();
        }
    }
}
I tried using an EntityTransaction, but it just stopped my application, without any errors or anything. So I left it out, but it still didn't write anything.
So then I tried calling flush separately, but that didn't do anything either.
I'm really stumped as to why this isn't working. It's probably some newbie mistake, but I would love it if someone here could help me out.
Thanks in advance!
First, are you able to check the logs? Starting a transaction when the persistence unit specifies JTA will throw an exception, so it is likely you have just been missing exceptions in your container's log files.
Second, this is a JTA PU, so it needs a JTA transaction started that the EM gets associated with, and you will want to inject the EM rather than create a factory yourself. Check out the JPA application server examples here first to see how they are set up:
http://wiki.eclipse.org/EclipseLink/Examples/JPA
Hey, I found out why the transaction wasn't running: the implementation I used didn't use JTA, it used a RESOURCE_LOCAL persistence unit. That was something I just overlooked when I set up my project.
Good thing my buddy told me to check the server logs.
In our J2EE application, we use an EJB 3 stateful bean to allow the front-end code to create, modify and save persistent entities (managed through JPA 2).
It looks something like this:
@LocalBean
@Stateful
@TransactionAttribute(TransactionAttributeType.NEVER)
public class MyEntityController implements Serializable
{
    @PersistenceContext(type = PersistenceContextType.EXTENDED)
    private EntityManager em;

    private MyEntity current;

    public void create()
    {
        this.current = new MyEntity();
        em.persist(this.current);
    }

    public void load(Long id)
    {
        this.current = em.find(MyEntity.class, id);
    }

    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public void save()
    {
        em.flush();
    }
}
Very importantly, to avoid committing too early, only the save() method runs within a transaction, so calling create() inserts nothing into the database.
Curiously, in the save() method we have to call em.flush() to really hit the database. In fact, I tried and found that we can also call em.isOpen() or em.getFlushMode(): anything that is "em-related" works.
I don't understand this point. Since save() runs in a transaction, I thought that at the end of the method the transaction would be committed, and so the persistence context automatically flushed. Why do I have to flush it manually?
Thanks,
Xavier
To be direct and to the metal, there will be no javax.transaction.Synchronization objects registered for the EntityManager in question until you actually use it in a transaction.
We in app-server-land will create one of these objects to do the flush() and register it with the javax.transaction.TransactionSynchronizationRegistry or javax.transaction.Transaction. This can't be done unless there is an active transaction.
That's the long and short of it.
Yes, an app server could very well keep a list of resources it gave the stateful bean and auto-enroll them in every transaction that stateful bean might start or participate in. The downside is that you would completely lose the ability to decide which things go in which transactions. Maybe you have 2 or 3 different transactions to run against different persistence units, and you are aggregating the work in your extended persistence context for one very specific transaction. It's really a design issue, and the app server should leave such decisions to the app itself.
You use it in a transaction and we'll enroll it in the transaction. That's the basic contract.
Side note: depending on how the underlying EntityManager is handled, any persistence call on the EntityManager may be enough to cause a complete flush at the end of the transaction. Certainly, flush() is the most direct and clear, but a persist() or even a find() might do it.
If you use an extended persistence context, all operations on managed entities done inside non-transactional methods are queued to be written to the database. Once you call flush() on the entity manager within a transaction context, all queued changes are written to the database. In other words, the fact that you have a transactional method doesn't itself commit the changes when the method exits (as in CMT); flushing the entity manager actually does. You can find a full explanation of this process here.
Because there is no way to know "when" the client is done with the session (extended scope).