Hibernate: Walk millions of rows and don't leak memory - java

The code below works, but Hibernate never releases its grip on any object. Calling session.clear() causes exceptions about fetching a joined class, and calling session.evict(currentObject) before retrieving the next object also fails to free the memory. Eventually I exhaust my heap space.
Checking my heap dumps, StatefulPersistenceContext is the garbage collection root for all of the references pointing to my objects.
public class CriteriaReportSource implements JRDataSource {
    private ScrollableResults sr;
    private Object currentObject;
    private Criteria c;
    private static final int scrollSize = 10;
    private int offset = 1;

    public CriteriaReportSource(Criteria c) {
        this.c = c;
        advanceScroll();
    }

    private void advanceScroll() {
        // ((Session) Main.em.getDelegate()).clear();
        this.sr = c.setFirstResult(offset)
                .setMaxResults(scrollSize)
                .scroll(ScrollMode.FORWARD_ONLY);
        offset += scrollSize;
    }

    public boolean next() {
        if (sr.next()) {
            currentObject = sr.get(0);
            if (sr.isLast()) {
                advanceScroll();
            }
            return true;
        }
        return false;
    }

    public Object getFieldValue(JRField jrf) throws JRException {
        Object retVal = null;
        if (currentObject == null) {
            return null;
        }
        try {
            retVal = PropertyUtils.getProperty(currentObject, jrf.getName());
        } catch (Exception ex) {
            Logger.getLogger(CriteriaReportSource.class.getName()).log(Level.SEVERE, null, ex);
        }
        return retVal;
    }
}

Don't use the stateful session here; it's just NOT the right tool for walking millions of rows and building a report. Use the StatelessSession interface instead.
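A minimal sketch of that approach, assuming the classic Hibernate Query API and an available SessionFactory; "MyEntity" and streamReport are placeholders, not names from the question:
// Sketch only: a StatelessSession keeps no persistence context, so nothing
// read here is tracked and each batch can be garbage collected freely.
void streamReport(SessionFactory sessionFactory) {
    StatelessSession session = sessionFactory.openStatelessSession();
    ScrollableResults results = session.createQuery("from MyEntity")
            .scroll(ScrollMode.FORWARD_ONLY);
    try {
        while (results.next()) {
            Object row = results.get(0);
            // feed the row to the report; no evict()/clear() needed
        }
    } finally {
        results.close();
        session.close();
    }
}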
If you are using MySQL Connector/J, even that is not enough; you also need to defeat the internal buffering done by the JDBC driver, with this:
Query query = session.createQuery(hql); // hql is your query string
query.setReadOnly(true);
// MIN_VALUE gives a hint to the JDBC driver to stream results
query.setFetchSize(Integer.MIN_VALUE);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
// iterate over results
while (results.next()) {
    Object row = results.get();
    // process the row, then release the reference
    // you may need to evict() as well
}
results.close();

A couple of things I would suggest:
Try calling setCacheMode(CacheMode.IGNORE) on the Criteria before opening it.
In the advanceScroll() method, add if (sr != null) sr.close(); so that the previous ScrollableResults gets closed before you re-assign the field to the new one.
One question: what is the reason for calling setMaxResults(), keeping track of the offset and re-opening the scrollable results? Why not just do this?
public CriteriaReportSource(Criteria c) {
    this.c = c;
    this.sr = c.setCacheMode(CacheMode.IGNORE)
            .scroll(ScrollMode.FORWARD_ONLY);
}

public boolean next() {
    if (sr.next()) {
        currentObject = sr.get(0);
        return true;
    }
    return false;
}

I think one of my problems was that
if (sr.isLast()) {
    advanceScroll();
    //...
combined with
((Session) Main.em.getDelegate()).clear();
// Also, "Main.em.clear()" should do...
resulted in clearing the session one batch too early. That was the cause of the exceptions regarding collections. Collections cannot be handled in a StatelessSession, so that's off the table. I don't know why session.evict(currentObject) fails to work when Session.clear() does work, but that's the way I'll have to handle it for now. I'll toss the answer points to whoever can figure that one out.
So, for now, there we have an answer. A manual scrolling window is required, closing the ScrollableResults doesn't help, and I need to properly run a Session.clear().
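For reference, a sketch of what advanceScroll() looks like with those pieces combined (closing the previous ScrollableResults, clearing the session via Main.em, and ignoring the second-level cache); treat it as an illustration rather than the exact final code:
private void advanceScroll() {
    // close the previous window before opening the next one
    if (sr != null) {
        sr.close();
    }
    // detach everything loaded by the previous batch so the
    // StatefulPersistenceContext cannot keep growing
    ((Session) Main.em.getDelegate()).clear();
    this.sr = c.setCacheMode(CacheMode.IGNORE)
            .setFirstResult(offset)
            .setMaxResults(scrollSize)
            .scroll(ScrollMode.FORWARD_ONLY);
    offset += scrollSize;
}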

Related

Check if an object is used somewhere before deleting it in Java

I have a situation where I need to give an error message when someone tries to delete an object b from a bList and it is used in some other class, say class A.
If b is not referenced in another class, then I should not show an error message.
Pseudocode for the above scenario:
public class A {
    B b;

    void setB(B b) {
        this.b = b;
    }
}

public class NotifyTest {
    List<B> bList = new ArrayList<>();

    String notifyTest() {
        A a = new A();
        B b = new B();
        a.setB(b);
        bList.add(b);
        if (b referencedSomewhere) { // pseudocode
            return "error";
        } else {
            bList.remove(b);
            return "success";
        }
    }
}
Traversing my entire model to check if object b is used somewhere is a performance hit so I don't want to go for that approach.
Please let me know if there is any solution for this scenario provided by Java or suggest a better way to handle this.
Edit 1: I need an error message when b is referenced in any place other than bList.
If your intention here is to automatically free up items from the list that are no longer referenced, you can use a WeakHashMap (https://docs.oracle.com/javase/7/docs/api/java/util/WeakHashMap.html).
You could also use this to keep track of all keys that have not yet been garbage collected. This can give you information about which items have already been garbage collected (after becoming unreachable). However, the information won't be realtime, as the garbage collector may run at arbitrary times.
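As a rough sketch of that idea (illustrative only; B is the question's value type and WeakTracker is a made-up name):
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

class WeakTracker {
    // Entries are held only weakly: once a B is unreachable everywhere else,
    // the garbage collector may drop it from this set.
    private final Set<B> tracked = Collections.newSetFromMap(new WeakHashMap<>());

    void track(B b) {
        tracked.add(b);
    }

    boolean stillReachable(B b) {
        // Only tells you the object has not been collected yet;
        // GC timing makes this approximate, as noted above.
        return tracked.contains(b);
    }
}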
I think something like the following should work for you. This is quickly put together to show you the idea. It has not been tested and will need more work if you want it to be thread safe.
class RefCounter<T> {
    private HashMap<T, Integer> counts = new HashMap<>();

    public T using(T object) {
        Integer num = counts.get(object);
        if (num == null)
            counts.put(object, 1);
        else
            counts.put(object, num + 1);
        return object;
    }

    public T release(T object) {
        Integer num = counts.get(object);
        if (num == null)
            throw new IllegalArgumentException("Object not in RefCounter");
        else if (num == 1)
            counts.remove(object);
        else
            counts.put(object, num - 1);
        return object;
    }

    public boolean usedElsewhere(T object) {
        Integer num = counts.get(object);
        return (num != null && num > 1);
    }
}
When you use an object, add it to RefCounter.
refCounter.using(x);
someList.add(x);
When you are done with that object
someList.remove(index);
refCounter.release(x);
To test if the object is used elsewhere
if (refCounter.usedElsewhere(x)) {
    return "error";
} else {
    someList.remove(index);
    refCounter.release(x);
}
Remember you'll need to ensure you call using() and release() every time you keep or release an object, otherwise this is all pointless.
If the object is absolutely not used (or there's not much memory left) then Java will mark it as deletable; then, as you start using up your memory, Java will automatically do the garbage collection for you.
Most of the high-level languages have garbage collection (GC), like Java, C#, Python (iirc), etc. You only need to pay attention to memory if you use more low-level languages, like C or C++ (which is somewhere between low and high level, actually).

How to see if a specific element is within a queue, and then get that element?

How do I search a queue for a specific object and then prioritise this object over the rest of the objects in the queue.
For example, I have a list of objects within the queue that each need to be processed, but I need to prioritise certain objects over others. At the moment it's a linked list, so they're executed in the order they are added to the queue, but if an object is added that has priority, it needs to jump to the front of the queue to be executed before the rest.
private Queue<Packet> packetQueue = new LinkedList<Packet>();

@Override
public boolean handlePacketQueue() {
    try {
        Packet p = null;
        synchronized (packetQueue) {
            p = packetQueue.poll();
        }
        if (p == null) {
            return false;
        }
        packetType = p.getPacketId();
        packetSize = p.getPacketLength();
        buffer = p.getPacketData();
        if (packetType > 0) {
            PacketManager.handlePacket(this, packetType, packetSize);
        }
        p = null;
        return true;
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return false;
}
Now I need to prioritise a packet with a specific id, before executing any other. Can someone help me as to how I'd do this?
When constructing a PriorityQueue, you can specify a Comparator to determine the order.
In JDK 8, with lambdas, you can do it like this:
private PriorityQueue<Packet> packetQueue = new PriorityQueue<>((p1, p2) ->
        p1.getPacketId() == specified_id ? -1 : 0);
This gives specified_id a boost while the others remain in their existing order. Implement your prioritization logic in the Comparator.
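As a sketch of a comparator that also keeps the Comparator contract symmetric, using java.util.Comparator (PRIORITY_ID is a hypothetical id, and Packet is assumed to expose getPacketId() as in the question):
// Hypothetical id that should always be handled first.
private static final int PRIORITY_ID = 255;

// false sorts before true, so packets with PRIORITY_ID come out of poll() first;
// all other packets compare as equal and keep no particular order.
private final Queue<Packet> packetQueue = new PriorityQueue<>(
        Comparator.comparing((Packet p) -> p.getPacketId() != PRIORITY_ID));
The rest of handlePacketQueue() can stay the same, since poll() on a PriorityQueue already returns the highest-priority packet.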

Performance issue with the code, exponential behaviour

I have two lists of objects with data: the first holds the principal entity instances and the second the dependent entity instances.
In addition I have a key table that relates the principal and dependent entity objects.
In the outer for statement I take one principal instance, and then I loop over every instance of the dependent entity trying to find a match between them (I think this scales as N×M); if a match is found, I update the principal entity with the reference object.
The following code works, but when I checked it from a performance perspective it is not efficient.
Do you have any ideas/tips on how to improve this code performance-wise?
In the JVM monitor I found that EntityDataCreator.getInstanceValue is the hot spot.
This is where the method starts:
// start with the principal entity
for (Object principalEntityInstance : principalEntityInstances) {
    List<Object> genObject = null;
    Object refObject = createRefObj(dependentMultiplicity);
    // check entries in dependent entity
    for (Object dependentEntityInstance : toEntityInstances) {
        boolean matches = true;
        for (String[] prop : propertiesMappings) {
            // Get properties related keys
            String fromProp = prop[0];
            String toProp = prop[1];
            Object fromValue = EntityDataCreator.getInstanceValue(fromProp, principalEntityInstance);
            Object toValue = EntityDataCreator.getInstanceValue(toProp, dependentEntityInstance);
            if (fromValue != null && toValue != null) {
                if (!fromValue.equals(toValue)) {
                    matches = false;
                    break;
                }
            }
        }
        if (matches) {
            // all properties match
            if (refObject instanceof List) {
                genObject = (List<Object>) refObject;
                genObject.add(dependentEntityInstance);
                refObject = genObject;
            } else {
                refObject = dependentEntityInstance;
                break;
            }
        }
    }
    if (refObject != null) {
        EntityDataCreator.createMemberValue(principalEntityInstance, navigationPropName, refObject);
    }
}

public static Object getInstanceValue(String property, Object entityInstance)
        throws NoSuchFieldException, IllegalAccessException {
    Class<? extends Object> entityClass = entityInstance.getClass();
    Field field = entityClass.getDeclaredField(property);
    field.setAccessible(true);
    Object value = field.get(entityInstance);
    field.setAccessible(false);
    return value;
}
My guess would be that your best bet is to go through both lists once, prepare all the data you need in hash tables, then do one iteration. That way, your problem becomes N+M instead of N*M.
Edit:
Map<String, List<Object>> principalMap = new HashMap<String, List<Object>>();
for (Object principalEntityInstance : principalEntityInstances) {
    List<String> keys = getKeysFor(principalEntityInstance);
    for (String key : keys) {
        List<Object> l = principalMap.get(key);
        if (l == null) {
            l = new ArrayList<Object>();
            principalMap.put(key, l);
        }
        l.add(principalEntityInstance);
    }
}
Then do the same for dependentEntityInstance; this way, your searches will be much faster.
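A rough sketch of the matching phase, assuming getKeysFor() builds the same composite key from the mapped properties on both sides (it is a placeholder from the snippet above, not an existing helper):
// Walk the dependents once and look up matching principals in O(1)
// instead of re-scanning the whole principal list for every dependent.
for (Object dependentEntityInstance : toEntityInstances) {
    for (String key : getKeysFor(dependentEntityInstance)) {
        List<Object> matchingPrincipals = principalMap.get(key);
        if (matchingPrincipals == null) {
            continue; // no principal shares this key
        }
        for (Object principalEntityInstance : matchingPrincipals) {
            // attach the dependent to the principal, as in the original loop
            EntityDataCreator.createMemberValue(principalEntityInstance,
                    navigationPropName, dependentEntityInstance);
        }
    }
}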
I might be misunderstanding your question, but I would suggest defining an equals method and a hashCode method for your entities, so that you can leverage all the goodness Java already has for searching and matching entities.
Whenever possible, rely on Java's infrastructure; Sun/Oracle spent a long time making it really fast.
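For illustration, a sketch of what that could look like for a hypothetical entity whose identity is defined by the key properties the matching loop compares:
import java.util.Objects;

public class PrincipalEntity {
    private String keyPartA; // hypothetical key properties
    private String keyPartB;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PrincipalEntity)) return false;
        PrincipalEntity other = (PrincipalEntity) o;
        return Objects.equals(keyPartA, other.keyPartA)
                && Objects.equals(keyPartB, other.keyPartB);
    }

    @Override
    public int hashCode() {
        return Objects.hash(keyPartA, keyPartB);
    }
}
With that in place, a HashSet or HashMap keyed on the entities themselves can replace the nested comparison loops.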

Why is an iterator empty on a populated-on-demand collection?

I have the following code:
for (String helpId : helpTipFragCache.getKeys()) {
    List<HelpTopicFrag> value = helpTipFragCache.getValue(helpId);
    helpTipFrags.put(helpId, value);
}
The helpTipFragCache has a mechanism to load the cache if values are needed and it is empty. The getKeys() method triggers this, and the cache is loaded when it is called. However, in the above case I see varying behavior.
I first debugged it quickly to see if the cache was indeed populating (within eclipse). I stepped through and the for loop was never entered (due to an empty iterator).
I then debugged it again (with the same code) and stepped into the getKeys() and analyzed the whole process there. It then did everything it was supposed to, the iterator had values to iterate over and there was peace in the console.
I have fixed the issue by changing the code to do this:
Set<String> helpIds = helpTipFragCache.getKeys();
helpIds = helpTipFragCache.getKeys();
for (String helpId : helpIds) {
    List<HelpTopicFrag> value = helpTipFragCache.getValue(helpId);
    helpTipFrags.put(helpId, value);
}
Obviously the debugging triggered something to initialize or act differently, does anyone know what causes this? Basically, what is happening to create the iterator from the returned collection?
Some other pertinent information:
This code is executed on server startup (tomcat)
This code doesn't behave as expected when executed from an included jar, but does when it is in the same code base
The collection is a Set
EDIT
Additional Code:
public Set<String> getKeys() throws Exception {
    if (CACHE_TYPE.LOAD_ALL == cacheType) {
        // Fake a getValue call to make sure the cache is loaded
        getValue("");
    }
    return Collections.unmodifiableSet(cache.keySet());
}

public final T getValue(String key, Object... singleValueArgs) throws Exception {
    T retVal = null;
    if (notCaching()) {
        if (cacheType == CACHE_TYPE.MODIFY_EXISTING_CACHE_AS_YOU_GO) {
            retVal = getSingleValue(key, null, singleValueArgs);
        } else {
            retVal = getSingleValue(key, singleValueArgs);
        }
    } else {
        synchronized (cache) {
            if (needToLoadCache()) {
                logger.debug("Need to load cache: " + getCacheName());
                if (cacheType != CACHE_TYPE.MODIFY_EXISTING_CACHE_AS_YOU_GO) {
                    Map<String, T> newCache = null;
                    if (cacheType != CACHE_TYPE.MODIFY_EXISTING_CACHE) {
                        newCache = getNewCache();
                    } else {
                        newCache = cache;
                    }
                    loadCache(newCache);
                    cache = newCache;
                }
                lastUpdatedInMillis = System.currentTimeMillis();
                forceLoadCache = false;
            }
        }
        ... // code in here does not execute for this example, simply gets a value that is already in the cache
    }
    return retVal;
}
And back to the original class (where the previous code was posted from):
@Override
protected void loadCache(Map<String, List<HelpTopicFrag>> newCache) throws Exception {
    Map<String, List<HelpTopicFrag>> _helpTipFrags = helpDAO.getHelpTopicFrags(getAppName(), _searchIds);
    addDisplayModeToFrags(_helpTipFrags);
    newCache.putAll(_helpTipFrags);
}
Above, a database call is made to get the values to be put in the cache.
The answer to
"Basically, what is happening to create the iterator from the returned collection?"
is that the for loop in your case treats the Set as an Iterable and uses an Iterator obtained by calling Iterable.iterator().
Set<A> as = ...;
for (A a : as) {
    doSth();
}
is basically equivalent to
Set<A> as = ...;
Iterator<A> hidden = as.iterator();
while (hidden.hasNext()) {
    A a = hidden.next();
    doSth();
}

JPA getSingleResult() or null

I have an insertOrUpdate method which inserts an Entity when it doesn't exist or updates it if it does. To enable this, I have to findByIdAndForeignKey: if it returns null, insert; if not, update. The problem is how do I check whether it exists? So I tried getSingleResult, but it throws an exception when there is no result:
public Profile findByUserNameAndPropertyName(String userName, String propertyName) {
    String namedQuery = Profile.class.getSimpleName() + ".findByUserNameAndPropertyName";
    Query query = entityManager.createNamedQuery(namedQuery);
    query.setParameter("name", userName);
    query.setParameter("propName", propertyName);
    Object result = query.getSingleResult();
    if (result == null) return null;
    return (Profile) result;
}
but getSingleResult throws an Exception.
Thanks
Throwing an exception is how getSingleResult() indicates it can't be found. Personally I can't stand this kind of API. It forces spurious exception handling for no real benefit. You just have to wrap the code in a try-catch block.
Alternatively, you can query for a list and see if it's empty. That doesn't throw an exception. Actually, since you're not doing a primary-key lookup, there could technically be multiple results (even if your foreign keys or constraints make that impossible in practice), so this is probably the more appropriate solution.
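Applied to the question's method, that might look roughly like this (a sketch, reusing the named query from the question):
public Profile findByUserNameAndPropertyName(String userName, String propertyName) {
    String namedQuery = Profile.class.getSimpleName() + ".findByUserNameAndPropertyName";
    List<?> results = entityManager.createNamedQuery(namedQuery)
            .setParameter("name", userName)
            .setParameter("propName", propertyName)
            .getResultList();
    // getResultList() returns an empty list instead of throwing NoResultException
    return results.isEmpty() ? null : (Profile) results.get(0);
}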
Try this in Java 8:
Optional first = query.getResultList().stream().findFirst();
I encapsulated the logic in the following helper method.
public class JpaResultHelper {
    public static Object getSingleResultOrNull(Query query) {
        List results = query.getResultList();
        if (results.isEmpty()) return null;
        else if (results.size() == 1) return results.get(0);
        throw new NonUniqueResultException();
    }
}
Here's a good option for doing this:
public static <T> T getSingleResult(TypedQuery<T> query) {
    query.setMaxResults(1);
    List<T> list = query.getResultList();
    if (list == null || list.isEmpty()) {
        return null;
    }
    return list.get(0);
}
I've done this (in Java 8):
query.getResultList().stream().findFirst().orElse(null);
From JPA 2.2, instead of calling .getResultList() and checking whether the list is empty or creating a stream, you can return a stream and take the first element.
.getResultStream()
.findFirst()
.orElse(null);
Spring has a utility method for this:
TypedQuery<Profile> query = em.createNamedQuery(namedQuery, Profile.class);
...
return org.springframework.dao.support.DataAccessUtils.singleResult(query.getResultList());
If you wish to use the try/catch mechanism to handle this problem, then it can be made to act like an if/else. I used try/catch to add a new record when I didn't find an existing one.
try { // if part
    record = query.getSingleResult();
    // use the record from the fetched result.
} catch (NoResultException e) { // else part
    // create a new record.
    record = new Record();
    // .........
    entityManager.persist(record);
}
Here's a typed/generics version, based on Rodrigo IronMan's implementation:
public static <T> T getSingleResultOrNull(TypedQuery<T> query) {
    query.setMaxResults(1);
    List<T> list = query.getResultList();
    if (list.isEmpty()) {
        return null;
    }
    return list.get(0);
}
There is an alternative which I would recommend:
Query query = em.createQuery("your query");
List<Element> elementList = query.getResultList();
return CollectionUtils.isEmpty(elementList) ? null : elementList.get(0);
This safeguards against a NullPointerException and ensures that at most one result is returned.
So don't do that!
You have two options:
Run a selection to obtain the COUNT of your result set, and only pull in the data if this count is non-zero; or
Use the other kind of query (that gets a result set) and check if it has 0 or more results. It should have 1, so pull that out of your result collection and you're done.
I'd go with the second suggestion, in agreement with Cletus. It gives better performance than (potentially) 2 queries. Also less work.
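For completeness, the first option could look roughly like this sketch (the JPQL strings and the Profile fields userName/propertyName are illustrative, not taken from the question):
// Probe with COUNT first, fetch the row only if something exists.
Long count = entityManager.createQuery(
        "select count(p) from Profile p where p.userName = :name and p.propertyName = :propName",
        Long.class)
        .setParameter("name", userName)
        .setParameter("propName", propertyName)
        .getSingleResult();
Profile profile = null;
if (count > 0) {
    profile = entityManager.createQuery(
            "select p from Profile p where p.userName = :name and p.propertyName = :propName",
            Profile.class)
            .setParameter("name", userName)
            .setParameter("propName", propertyName)
            .getSingleResult();
}
As the answer notes, though, this costs a potential second round trip, which is why the list-based check is usually preferable.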
Combining the useful bits of the existing answers (limiting the number of results, checking that the result is unique) and using the established method name (from Hibernate), we get:
/**
 * Return a single instance that matches the query, or null if the query returns no results.
 *
 * @param query query (required)
 * @param <T> result record type
 * @return record or null
 */
public static <T> T uniqueResult(@NotNull TypedQuery<T> query) {
    List<T> results = query.setMaxResults(2).getResultList();
    if (results.size() > 1) throw new NonUniqueResultException();
    return results.isEmpty() ? null : results.get(0);
}
The undocumented method uniqueResultOptional in org.hibernate.query.Query should do the trick. Instead of having to catch a NoResultException you can just call query.uniqueResultOptional().orElse(null).
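For example, a sketch assuming a Hibernate 5.2+ org.hibernate.query.Query obtained from a Session (the entity and parameter names are illustrative):
// uniqueResultOptional() wraps the single result in an Optional instead of
// throwing NoResultException when nothing matches.
Profile profile = session.createQuery(
        "from Profile p where p.userName = :name", Profile.class)
        .setParameter("name", userName)
        .uniqueResultOptional()
        .orElse(null);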
I solved this by using List<?> myList = query.getResultList(); and checking whether myList.size() equals zero.
Look at this code:
return query.getResultList().stream().findFirst().orElse(null);
When findFirst() is called, a NullPointerException may be thrown (it happens if the first element of the result list is null).
The better approach is:
return query.getResultList().stream().filter(Objects::nonNull).findFirst().orElse(null);
Here's the same logic as others suggested (get the resultList, return its only element or null), using Google Guava and a TypedQuery.
public static <T> T getSingleResultOrNull(final TypedQuery<T> query) {
    return Iterables.getOnlyElement(query.getResultList(), null);
}
Note that Guava will throw the somewhat unintuitive IllegalArgumentException if the result set has more than one result. (The exception makes sense to clients of getOnlyElement(), as it takes the result list as its argument, but is less understandable to clients of getSingleResultOrNull().)
Here's another extension, this time in Scala.
customerQuery.getSingleOrNone match {
  case Some(c) => // ...
  case None => // ...
}
With this pimp:
import javax.persistence.{NonUniqueResultException, TypedQuery}
import scala.collection.JavaConversions._

object Implicits {
  class RichTypedQuery[T](q: TypedQuery[T]) {
    def getSingleOrNone: Option[T] = {
      val results = q.setMaxResults(2).getResultList
      if (results.isEmpty)
        None
      else if (results.size == 1)
        Some(results.head)
      else
        throw new NonUniqueResultException()
    }
  }

  implicit def query2RichQuery[T](q: TypedQuery[T]) = new RichTypedQuery[T](q)
}
So all of the "try to rewrite without an exception" solutions on this page have a minor problem: either they never throw the NonUniqueResultException, or they throw it in some wrong cases too (see below).
I think the proper solution is (maybe) this:
public static <L> L getSingleResultOrNull(TypedQuery<L> query) {
    List<L> results = query.getResultList();
    L foundEntity = null;
    if (!results.isEmpty()) {
        foundEntity = results.get(0);
    }
    if (results.size() > 1) {
        for (L result : results) {
            if (result != foundEntity) {
                throw new NonUniqueResultException();
            }
        }
    }
    return foundEntity;
}
It returns null if there are 0 elements in the list, and throws the non-unique exception if there are genuinely different elements in the list, but it does not throw when one of your selects is not properly designed and merely returns the same object more than once.
Feel free to comment.
I achieved this by getting a result list and then checking whether it is empty:
public boolean exist(String value) {
    List<Object> options = getEntityManager()
            .createNamedQuery("AppUsers.findByEmail")
            .setParameter("email", value)
            .getResultList();
    return !options.isEmpty();
}
It is so annoying that getSingleResult() throws exceptions
Throws:
NoResultException - if there is no result
NonUniqueResultException - if more than one result
and some other exceptions that you can get more info on from the documentation.
I prefer @Serafin's answer if you can use the new JPA features, but this is one fairly straightforward way to do it which I'm surprised hasn't been mentioned here before:
try {
    return (Profile) query.getSingleResult();
} catch (NoResultException ignore) {
    return null;
}
public Example validate(String param1) {
    Example example = new Example();
    Query query = null;
    Object[] myResult = null;
    try {
        query = sessionFactory.getCurrentSession()
                .createQuery("select column from table where column = :p_param1");
        query.setParameter("p_param1", param1);
        // When the query has no records, getSingleResult() throws an exception here
        myResult = (Object[]) query.getSingleResult();
        String obj1 = (String) myResult[0];
        String obj2 = (String) myResult[1];
        example.setobj1(ISSUtil.convertNullToSpace(obj1));
        example.setobj2(ISSUtil.convertNullToSpace(obj2));
        return example;
    } catch (Exception e) {
        e.printStackTrace();
        // setting the objects to "" in the exception block
        example.setobj1(ISSUtil.convertNullToSpace(""));
        example.setobj2(ISSUtil.convertNullToSpace(""));
    }
    return example;
}
Answer: obviously, when there are no records, getSingleResult() will throw an exception. I have handled it by setting the objects to "" in the exception block, so even when it enters the exception handler your JSON object will be set to ""/empty.
This is not a perfect answer, but I hope it might help.
If someone needs to modify my code to be more precise, or to correct me, that is always welcome.
That works for me:
Optional<Object> opt = Optional.ofNullable(nativeQuery.getSingleResult());
return opt.isPresent() ? opt.get() : null;
