GAE, JDO: how to cache a parent entity without its children - java

I have a problem: whenever I load a parent entity (User in my case) and put it into the cache, all its children (in an owned relationship) are cached as well.
If I'm not wrong, the explanation is simple: the serialization process touches all properties of the object, which causes all child objects to be fetched as well. Eventually, the whole entity group is fetched.
How do I avoid that? The User entity group is planned to contain quite a lot of information and I don't want to cache it all at once. Not to mention that fetching all the child objects at once would be really demanding.
I came across the transient modifier and was happy for a while, until I realized that it not only stops certain fields from being cached, it also prevents those fields from being persisted.
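To illustrate the trap with a hypothetical class (Album stands in for any owned child entity; it is not from my real code):

import java.util.List;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.PrimaryKey;
import com.google.appengine.api.datastore.Key;

@PersistenceCapable
public class User {
    @PrimaryKey
    private Key key;

    // 'transient' keeps the field out of Java serialization (and so out of
    // the cache), but JDO also treats transient fields as non-persistent by
    // default, so these hypothetical albums would never reach the datastore.
    private transient List<Album> albums;
}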

So the answer is to use the detached version of the entity. I load all entities through one function, which currently looks something like this:
@SuppressWarnings("unchecked")
E cachedEntity = (E) cache.get(cacheKey);
if (cachedEntity != null) {
    entity = cachedEntity;
} else {
    entity = pm.getObjectById(Eclass, key);
    cache.put(cacheKey, pm.detachCopy(entity));
}
The disadvantage is that when I want to get the child objects, I have to explicitly attach the entity back using entity = pm.makePersistent(entity), which generates a Datastore.GET RPC. However, this doesn't happen very often, and most of the time I just want to access the entity itself, not its child objects, so it's quite efficient.
I came across an even better solution. The one RPC call when attaching the entity is there because JDO checks whether the entity really exists in the datastore, and according to the DataNucleus documentation this can be turned off just by setting datanucleus.attachSameDatastore to false in the PMF. However, it doesn't work for me: Datastore.GET is always called when attaching the object. If it worked, I could implicitly attach every object right after fetching it from the cache at zero cost, and I wouldn't have to do it manually when needed.
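For reference, the property can be passed as a standard JDO override when obtaining the PMF. A sketch ("transactions-optional" is the usual PMF name from GAE's jdoconfig.xml; adjust if yours differs):

import java.util.HashMap;
import java.util.Map;
import javax.jdo.JDOHelper;
import javax.jdo.PersistenceManagerFactory;

// Override a single DataNucleus property on top of the configured PMF.
Map<String, String> overrides = new HashMap<String, String>();
overrides.put("datanucleus.attachSameDatastore", "false");
PersistenceManagerFactory pmf =
        JDOHelper.getPersistenceManagerFactory(overrides, "transactions-optional");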

Related

Why does a JPA entity select result change on every other query?

My question relates to strange read/select behavior where the same query returns different results on each call. A description of my situation follows.
I have the following code, returning a list of documents from the DB:
@RequestMapping(value={"/docs"}, method = RequestMethod.GET)
@ResponseBody
public ArrayList<Document> getMetaData(ModelMap modelMap) {
    return (ArrayList<Document>) documentDAO.getDocuments();
}
DocumentDAO.getDocuments looks like this:
public List<Document> getDocuments() {
    Query query = entityManager.createQuery("from Document");
    List<Document> list = query.getResultList();
    for (Document doc : list)
        System.out.println(doc.getName() + " " + doc.isSigned());
    return list;
}
In another controller, I'm also extracting a Document and changing a boolean property with:
Document doc = documentDAO.getDocumentById(id);
doc.setSigned(true);
documentDAO.updateDocument(doc); // IS IT NECESSARY??
getDocumentById and updateDocument are the following:
public Document getDocumentById(Long id) {
    return entityManager.find(Document.class, id);
}

@Transactional
public void updateDocument(Document document) {
    entityManager.merge(document);
    entityManager.flush();
}
Questions:
As far as I know, setting a property of a managed object is enough to propagate the change to the DB. But I want to flush changes immediately. Is my approach with the extra update call an appropriate solution, or is calling the setter enough to make immediate changes in the DB? By extra update I mean documentDAO.updateDocument(doc); // IS IT NECESSARY??
How does JPA store managed objects - in some internal data structure, or simply through references like Document doc;? An internal structure most likely makes duplicate managed objects with the same id impossible, while plain references would make it possible to have multiple managed objects with the same id but different properties.
How does merge work internally - does it try to find a managed object with the same id in the internal storage and, if one is detected, refresh its fields, or does it simply update the DB?
If the internal storage really exists (most likely it is the persistence context, further PC), what is the criterion for distinguishing managed objects? The @Id-annotated field of the Hibernate model?
My main problem is the different results of entityManager.createQuery("from Document"):
System.out.println(doc.getName()+" "+doc.isSigned()); shows isSigned true on odd calls and false on even calls.
I suspect that the first select-all query returns entities with isSigned=false and puts them into the PC; after that the user performs some operation which grabs the entity by id, sets isSigned=true, and the freshly extracted entity conflicts with the one already present in the PC. The first object has isSigned=false, the second has isSigned=true, and the confused PC returns the two managed objects in rotation. But how is that possible? In my mind, the PC has mechanisms that prevent such confusing, ambiguous situations by keeping only one managed object for each unique id.
First of all, you want to enroll both the read and the write in a single transactional service method:
@Transactional
public void signDocument(Long id) {
    Document doc = documentDAO.getDocumentById(id);
    doc.setSigned(true);
}
So this code should reside on the Service side, not in your web Controller.
As far as I know, setting a property of a managed object is enough to propagate the change to the DB. But I want to flush changes immediately. Is my approach with the extra update call an appropriate solution, or is calling the setter enough to make immediate changes in the DB? By extra update I mean documentDAO.updateDocument(doc); // IS IT NECESSARY??
This applies only to managed entities, as long as the Persistence Context is still open. That's why you need a transactional service method instead.
How does JPA store managed objects - in some internal data structure, or simply through references like Document doc;? An internal structure most likely makes duplicate managed objects with the same id impossible, while plain references would make it possible to have multiple managed objects with the same id but different properties.
The JPA 1st-level cache simply stores entities as they are; it doesn't use any other data representation. In a Persistence Context you can have one and only one representation of an entity (Class and Identifier). Within a JPA Persistence Context, managed entity equality is the same as entity identity.
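A minimal sketch of what that guarantees in practice, reusing the Document entity from the question (the id value is assumed):

// Within a single persistence context, find() and a query resolve to the
// exact same managed instance, not two copies.
Document viaFind = entityManager.find(Document.class, 1L);
Document viaQuery = entityManager
        .createQuery("select d from Document d where d.id = :id", Document.class)
        .setParameter("id", 1L)
        .getSingleResult();
assert viaFind == viaQuery; // same reference: identity equals equality here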
How does merge work internally - does it try to find a managed object with the same id in the internal storage and, if one is detected, refresh its fields, or does it simply update the DB?
The merge operation makes sense for reattaching detached entities. A managed entity's state is automatically synchronized with the database at flush time; the automatic dirty checking mechanism takes care of this.
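A sketch of that contract, reusing the question's Document entity (loadedEarlier() is a hypothetical stand-in for a fetch performed in an earlier, now-closed persistence context):

// 'detached' was loaded in a previous persistence context and is no
// longer managed; loadedEarlier() is a hypothetical helper.
Document detached = loadedEarlier();
detached.setSigned(true);

// merge() copies the detached state onto a managed instance (loading it
// if necessary) and returns that managed instance; the argument itself
// stays detached.
Document managed = entityManager.merge(detached);
assert managed != detached; // only 'managed' is tracked from here on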
If the internal storage really exists (most likely it is the persistence context, further PC), what is the criterion for distinguishing managed objects? The @Id-annotated field of the Hibernate model?
The PersistenceContext is a session-level cache. The managed objects always have an identifier and an associated database row.
I suspect that the first select-all query returns entities with isSigned=false and puts them into the PC; after that the user performs some operation which grabs the entity by id, sets isSigned=true, and the freshly extracted entity conflicts with the one already present in the PC.
Within the same Persistence Context scope this can't ever happen. If you load an entity through a query, the entity gets cached in the 1st-level cache. If you try to load it again with another query or with EntityManager.find(), you will still get the same object reference that's already cached.
If the first query runs against one Persistence Context and the second query/find is issued on a second Persistence Context, then each Persistence Context will cache its own copy of the entities being queried.
The first object has isSigned=false, the second has isSigned=true, and the confused PC returns the two managed objects in rotation. But how is that possible?
This can't happen. The Persistence Context always maintains entity object integrity.

Maintaining relationships in JPA 2.0

I've been using JPA 2.0 for a while but, sad to admit, I haven't had enough time to learn it properly. It seems like I lack the basics of how to work with Entity Manager.
Moving one step at a time, I'd like to first ask you about maintaining relationships between mapped entities. Of course I know how to create mappings between entities, different types of available associations (OneToOne, etc.) and how databases work in general. I'm purely focused on maintaining it via Entity Manager, so please do not send me to any kind of general knowledge tutorial :-).
The questions are:
Am I right that, as the programmer, I'm responsible for maintaining (creating/updating/removing) relationships between instances of entities?
Do I have to always update (set to null, remove from collection, etc.) instances by hand?
Plain SQL can set references to NULL on delete (ON DELETE SET NULL), but it seems like JPA can't do such a simple thing. It also seems like a burden to do it manually. Is there a way to achieve that with JPA?
Suppose I have a OneToMany relationship and set the entity on the Many side to NULL, then persist the changes in a Set by saving the entity on the One side. Do I then have to update the entities on the Many side and set the association to NULL in each instance? That seems like pure silliness for one-directional bindings!
Thanks in advance!
The main thing you need to investigate is the different options you have when mapping an entity. For example, in the next piece of code the cascade-all option will instruct JPA to delete the child list when the parent is deleted.
@OneToMany(fetch = FetchType.LAZY, cascade = { CascadeType.ALL }, mappedBy = "parent")
private Set<Child> events = new HashSet<Child>();
1. Yes. You maintain the object tree and modify it to look like what you want.
2. Yes and no. If you want the entity to reference null, then yes. For instance, if you are removing one entity, then you should clean up any references to it held by other entities that you are not removing. A practical example: it's good practice to let an Employee know his/her Manager has been let go. If the Employee is going to stay, it should either have its manager reference nulled out or set to a different manager before the current manager can be removed. If the employee is going to be removed as well, then cascade remove can cascade to all the Manager's subordinates, in which case you do not need to clean up their references to the manager - as they are going away too.
3. I don't quite understand what SQL is setting to null. Deleting removes the row in the database, so there isn't anything to set to null. Cleaning up a reference shouldn't be that difficult in the object model, as JPA has a number of events to help, such as PreRemove and PreUpdate. In the end, though, the problem is with your Java objects. They are just Java objects, so if you want something done, your application will need to do it for the most part. JPA handles building them and pushing them to the database, not changing the state for you.
4. Yes and no. If you set up a bidirectional relationship, you must maintain both sides as mentioned above. If you set the child's parent reference to null, you should let the parent know it no longer has that child, wouldn't you? The parent will continue to reference its child for as long as that Parent instance exists. So even though the database is updated/controlled through the side that owns the relationship, the object model will be out of sync with the database until it is refreshed or somehow reloaded. JPA allows for multiple levels of caching, so it all depends on your provider setup how long that Parent instance will exist referencing a Child that no longer exists in the database. A sketch of keeping both sides in sync follows below.
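A minimal sketch of point 4, using the Parent/Child names from the mapping snippet above (helper methods keep both sides of the association consistent):

import java.util.HashSet;
import java.util.Set;

public class Parent {
    private Set<Child> events = new HashSet<Child>();

    public void addChild(Child child) {
        events.add(child);
        child.setParent(this);   // update the owning side too
    }

    public void removeChild(Child child) {
        events.remove(child);
        child.setParent(null);   // no stale back-reference left behind
    }
}

class Child {
    private Parent parent;

    public void setParent(Parent parent) {
        this.parent = parent;
    }
}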

Hibernate needed values to save a child

I currently have working code to save children to a parent entity, but I'm wondering if I'm doing things right, since I now have an overload of select statements going through Hibernate. I do use caching, so at the moment I don't have delay problems, but I'm wondering if I can't be more efficient. Take this little extract as an example:
MbaLog.debugLog(logger, "Saving CodeType");
Site site = codeType.getSite();
if (site != null && site.isProxy())
    codeType.setSite(siteRepository.loadSiteById(site.getId()));
Long recordId = codeRepository.saveCodeType(codeType);
I have an entity CodeType that I'm saving, which has a child Site. This child is passed to the method as a proxy object with just its id filled in. Then I fetch a fully loaded Site object from the database and set it on the CodeType. Next, I save the codeType to the database with Hibernate's SessionFactory (the code isn't visible here; it's behind the codeRepository).
This works, but I'm loading a full Site that has children of its own, so I see at least 5 queries passing before the insert. I could make a lot of things lazy on Site, but for the time being I'd rather not do that due to possible code complications in deeper layers. I had to learn Hibernate and JPA on the job and never had much training from experts in the past. So I'm wondering: is there a shortcut to save the Site on the CodeType? Do I need to have it fully loaded, or is the id enough? Or just the id and version (I'm using the @Version annotation on all my entities for optimistic locking)?
Thanks in advance
Instead of using Session.get() (or EntityManager.find()) to get a reference to the Site entity, use Session.load() (or EntityManager.getReference()) to get this reference.
These methods return a lazy-loaded proxy of the entity rather than executing a query to fetch the site's data.
If all you want to persist is the relationship between Site and CodeType, a lazy instance is probably enough, so you can use EntityManager.getReference() (lazy load) instead of EntityManager.find(), as in the sketch below.
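Applied to the snippet from the question, that would look roughly like this (a sketch assuming a Hibernate Session is in scope; session.load() is the Session-API counterpart of EntityManager.getReference()):

MbaLog.debugLog(logger, "Saving CodeType");
Site site = codeType.getSite();
if (site != null && site.isProxy())
    // load() returns an uninitialized proxy carrying only the id - no
    // SELECTs are fired just to persist the foreign key on CodeType.
    codeType.setSite((Site) session.load(Site.class, site.getId()));
Long recordId = codeRepository.saveCodeType(codeType);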

updating "nested" objects with JDO on Google App Engine

I'm having trouble figuring out the proper way to update "nested" data using Google App Engine and JDO. I have a RecipeJDO and an IngredientJDO.
I want to be able to completely replace the ingredients in a given recipe instance with a new list of ingredients. Then, when that recipe is (re)persisted, any previously attached ingredients will be deleted totally from the datastore, and the new ones will be persisted and associated with the recipe.
Something like:
// retrieve from GAE datastore
RecipeJDO recipe = getRecipeById();
// fetch new ingredients from the user
List<IngredientJDO> newIngredients = getNewIngredients();
recipe.setIngredients(newIngredients);
// update the recipe w/ new ingredients
saveUpdatedRecipe(recipe);
This works fine when I update (detached) recipe objects directly, as returned from the datastore. However, if I copy a RecipeJDO and then make the aforementioned updates, it ends up appending the new ingredients, which are then returned along with the old ingredients when the recipe is re-fetched from the datastore. (Why bother with the copy at all? I'm using GWT on the front end, so I'm copying the JDO objects to DTOs; the user edits them on the front end, and then they are sent to the backend for updating the datastore.)
Why do I get different results with objects that I create by hand (setting all the fields, including the id) vs. operating on instances returned by the PersistenceManager? Obviously JDO's bytecode enhancement is involved somehow.
Am I better off just explicitly deleting the old ingredients before persisting the updated recipe?
(Side question- does anyone else get frustrated with ORM and wish we could go back to plain old RDBMS? :-)
Short answer: change RecipeJDO.setIngredients() to this:
public void setIngredients(List<IngredientJDO> ingredients) {
    this.ingredients.clear();
    this.ingredients.addAll(ingredients);
}
When you fetch the RecipeJDO, the ingredients list is not a real ArrayList; it is a dynamic proxy that handles the persistence of the contained elements. You shouldn't replace it.
While the persistence manager is open, you can iterate through the ingredients list, add items or remove items, and the changes will be persisted when the persistence manager is closed (or the transaction is committed, if you are in a transaction). Here's how you would do the update without a transaction:
public void updateRecipe(String id, List<IngredientDTO> newIngredients) {
    List<IngredientJDO> ingredients = convertIngredientDtosToJdos(newIngredients);
    PersistenceManager pm = PMF.get().getPersistenceManager();
    try {
        RecipeJDO recipe = pm.getObjectById(RecipeJDO.class, id);
        recipe.setIngredients(ingredients);
    } finally {
        pm.close();
    }
}
If you never modify the IngredientJDO objects (only replace them and read them), you might want to make them Serializable objects instead of JDO objects. If you do that, you may be able to reuse the Ingredient class in your GWT RPC code.
Incidentally, even if Recipe were not a JDO object, you would want to make a copy in the setIngredients() method; otherwise someone could do this:
List<IngredientJDO> ingredients = new ArrayList<IngredientJDO>();
// add items to ingredients
recipe.setIngredients(ingredients);
ingredients.clear(); // Whoops! Modifies Recipe!
I am facing the same problem! I would like to update an existing entity by calling makePersistent() and assigning an existing id/key. The update works fine, except for nested objects: the nested objects are appended to the old ones instead of replacing them. I don't know if this is the intended behaviour or a bug; I would expect overwriting to have the same effect as inserting a new entity!
How about first deleting the old entity and persisting the new one in the same transaction? Does that work? I tried it, but it resulted in the entity being deleted completely. I don't know why (even though I tried flushing directly after deleting)!
@NamshubWriter, not sure if you'll catch this post... regarding your comment,
(if you used Stripes and JSP, you could avoid the GWT RPC and GWT model representations of Recipe and Ingredient)
I am using Stripes and JSP, but I face the same problem. When the user submits the form back, Stripes instantiates my entity objects from scratch, so JDO is completely ignorant of them. When I call PersistenceManager.makePersistent on the root object, the previous version is correctly overwritten - with one exception: its child objects are appended to the List<Child> of the previous version.
If you could suggest any solution (better than manually copying the object fields), I would greatly appreciate it.
(seeing as Stripes is so pluggable, I wonder if I can override how it instantiates the entity objects...)

Hiding deleted objects

I have the following use case: there's a class called Template, and with that class I can create instances of the ActualObject class (ActualObject copies its initial data from the Template). The Template class has a list of Products.
Now here comes the tricky part: the user should be able to delete Products from the database, but these deletions may not affect the content of a Template. In other words, even if a Product is deleted, the Template should still have access to it. This could be solved by adding a "deleted" flag to the Product. If a Product is deleted, it may not be searched for explicitly in the database, but it can still be fetched implicitly (for example via the reference in the Template class).
The idea behind this is that when an ActualObject is created from a template, the user is notified in the user interface that "The Template X had a Product Z with the parameters A, B and C, but this product has been deleted and cannot be added as such in ActualObject Z".
My problem is how I should mark these deleted objects as deleted. Before someone suggests just updating the delete flag instead of doing an actual delete query: my problem is not that simple. The delete flag and its behaviour should exist in all POJOs, not just in Product. This means I'll be getting cascade problems. For example, if I delete a Template, then its Products should also be deleted, and each Product has a reference to a Price object which should also be deleted, and each Price may have a reference to a VAT object, and so forth. All these cascaded objects should be marked as deleted.
My question is how I can accomplish this in a sensible manner. Going through every object being deleted, checking each field for references which should be deleted, going through their references, and so on is quite laborious, and bugs can easily slip in.
I'm using Hibernate, and I was wondering whether Hibernate has any built-in features for this. Another idea that came to mind was to use Hibernate interceptors to turn an actual SQL delete query into an update query (I'm not even 100% sure this is possible). My only concern is whether Hibernate relies on cascades in foreign keys - in other words, whether the cascaded deletes are done by the database and not by Hibernate.
My problem is how I should mark these deleted objects as deleted.
I think you have chosen a very complex way to solve the task. It would be easier to introduce a ProductTemplate: place into this object all the required properties you need, along with a reference to a Product instance. Then, instead of marking the Product, you can just delete it (and delete all other entities, such as prices). And, of course, you should clean up the reference in the ProductTemplate. When you create an instance of ActualObject you will be able to notify the user with an appropriate message.
I think you're trying to make things much more complicated than they should be... anyway, what you're trying to do is handle Hibernate events; take a look at Chapter 12 of the Hibernate Reference - you can choose between interceptors and the event system.
In any case... well good luck :)
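Beyond interceptors, Hibernate can also rewrite the delete statement declaratively. A sketch of that route (the table and column names are assumed, not taken from the question):

import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.SQLDelete;

// Hibernate replaces the DELETE statement for this entity with the UPDATE
// below, so cascaded deletes also just flip the flag. Explicit queries
// must still filter on deleted = false themselves, which keeps flagged
// rows implicitly reachable through references such as Template's list.
@Entity
@SQLDelete(sql = "UPDATE product SET deleted = true WHERE id = ?")
public class Product {

    @Id
    private Long id;

    private boolean deleted;
}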
public interface Deletable {
    public void delete();
}
Have all your deletable objects implement this interface. In their implementations, update the deleted flag and have them call their children's delete() method as well - which implies that the children must be Deletable too.
Of course, upon implementation you'll have to figure out manually which children are Deletable, but this should be straightforward, at least. A sketch of such an implementation follows.
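For the Product -> Price chain from the question, the implementation could look like this (field names are assumed):

public class Product implements Deletable {
    private boolean deleted;
    private Price price;       // Price implements Deletable as well

    @Override
    public void delete() {
        this.deleted = true;   // soft-delete this object...
        if (price != null) {
            price.delete();    // ...and cascade to Deletable children
        }
    }
}

class Price implements Deletable {
    private boolean deleted;

    @Override
    public void delete() {
        this.deleted = true;   // a VAT reference would be cascaded here too
    }
}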
If I understand what you are asking for: if you add a @OneToMany relationship between the template and the product and select your cascade rules, you will be able to delete all associated products for a given template. In your Product class you can add the "deleted" flag as you suggested. This deleted flag would be leveraged by your service/DAO layer, e.g. you could have a getProducts(boolean includeDeleted) type of method to determine whether "deleted" records should be included in the result. In this fashion you can control what end users see, but still expose full functionality to internal business users.
The flag to mark deletion should be a part of the Template class itself. That way all the objects that you create have a way to be flagged as alive or deleted. The marking of the object to be deleted should go higher up, into the base class.
