Converting Between Hibernate Collections and My Own Collections - java

I have set up Hibernate to give me a Set<Integer> which I convert internally to and from a Set<MyObjectType> (MyObjectType can be represented by a single integer). That is to say, when Hibernate calls my void setMyObjectTypeCollection(Set<Integer> theSet) method I iterate through all the elements in theSet and convert them to MyObjectType. When Hibernate calls my Set<MyObjectType> getMyObjectTypeCollection() I allocate a new HashSet and convert MyObjectTypes to Integers.
The problem is that every time I call commit, Hibernate deletes everything in the collection and then re-inserts it, regardless of whether any element has changed or whether the collection itself has changed.
While I don't technically consider this a bug, I am afraid that deleting and inserting many rows very often will cause the database to perform unnecessarily slowly.
Is there a way to get Hibernate to recognize that even though I have allocated and returned a different instance of the collection, that the collection actually contains all the items it used to and that there is no need to delete and reinsert them all?

I think the best way to achieve your goal would be to use a UserType. Basically it lets you handle the conversion from SQL to your own objects (back and forth).
You can see an example on how to use it here.
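Separately from the UserType route, it may be worth checking whether the getter is the culprit: as far as I understand Hibernate's collection handling, a getter that returns a freshly allocated set makes Hibernate treat the collection as replaced, hence the delete and re-insert. Below is a minimal sketch, assuming the Set<Integer> is mapped with property access, that stores and returns the very instance Hibernate passed in and converts only at the domain-facing API. Owner, MyObjectType.fromId and getId are hypothetical names, and the mapping itself is omitted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical value type that can be represented by a single integer.
final class MyObjectType {
    private final int id;
    private MyObjectType(int id) { this.id = id; }
    static MyObjectType fromId(int id) { return new MyObjectType(id); }
    int getId() { return id; }
    @Override public boolean equals(Object o) {
        return o instanceof MyObjectType && ((MyObjectType) o).id == id;
    }
    @Override public int hashCode() { return Integer.hashCode(id); }
}

// Hypothetical owning entity (mapping annotations/XML omitted).
class Owner {
    // The set handed over by the persistence layer is kept and returned as-is.
    private Set<Integer> ids = new HashSet<>();

    // Accessors intended for the persistence layer only.
    Set<Integer> getMyObjectTypeCollection() { return ids; }
    void setMyObjectTypeCollection(Set<Integer> theSet) { this.ids = theSet; }

    // Domain-facing API: converts on the fly and mutates the stored set in place.
    boolean add(MyObjectType t) { return ids.add(t.getId()); }
    boolean remove(MyObjectType t) { return ids.remove(t.getId()); }

    // Returns a converted copy; changes to the copy do not affect the stored set.
    Set<MyObjectType> asObjects() {
        Set<MyObjectType> out = new HashSet<>();
        for (Integer i : ids) {
            out.add(MyObjectType.fromId(i));
        }
        return out;
    }
}
```

The point of the design is that the getter never allocates, so the collection instance stays the same between load and commit; whether that fully removes the extra SQL in your mapping is something to verify with SQL logging.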


List vs Set on JPA 2 - Pros / Cons / Convenience

I have tried searching Stack Overflow and other websites for the pros, cons and conveniences of using Sets vs Lists, but I really couldn't find a DEFINITE answer on when to use one or the other.
Hibernate's documentation states that non-duplicate records should go into Sets and that, consequently, you should implement hashCode() and equals() for every entity that could be put into a Set. But that comes at the price of convenience and ease of use: some articles recommend using business keys as every entity's id so that hashCode() and equals() can be implemented correctly for every situation, regardless of the object's state (managed, detached, etc).
That's all fine... until I run into lots of situations where using Sets is just not doable, such as ordering (though Hibernate offers SortedSet), the convenience of collectionObj.get(index) and collectionObj.remove(int location || Object obj), Android's ListView/ExpandableListView architecture (GroupIds, ChildIds) and so on. My point is: Sets are just really bad (imho) to manipulate and to make work 100%.
I am tempted to change every single collection in my project to List, as Lists work very well. The IDs for all my entities are generated through MySQL's auto-generated sequence (@GeneratedValue(strategy = GenerationType.IDENTITY)).
Is there anyone out there who could definitively clear up all the little details mentioned above?
Also, is it doable to use Eclipse's auto-generated hashCode() and equals() for the ID field for every entity? Will it be effective in every situation?
Thank you very much,
Renato
List versus Set
Duplicates allowed
Lists allow duplicates and Sets do not allow duplicates. For some this will be the main reason for them choosing List or Set.
Multiple Bag's Exception - Multiple Eager fetching in same query
One notable difference in how Hibernate handles the two is that you can't fetch two different lists in a single query.
Hibernate will throw a MultipleBagFetchException ("cannot simultaneously fetch multiple bags"). Sets have no such issue.
A list, if there is no index column specified, will just be handled as a bag by Hibernate (no specific ordering).
@OneToMany
@OrderBy("lastname ASC")
public List<Rating> ratings;
One notable difference in the handling of Hibernate is that you can't fetch two different lists in a single query. For example, if you have a Person entity having a list of contacts and a list of addresses, you won't be able to use a single query to load persons with all their contacts and all their addresses. The solution in this case is to make two queries (which avoids the cartesian product), or to use a Set instead of a List for at least one of the collections.
It's often hard to use Sets with Hibernate when you have to define equals and hashCode on the entities and don't have an immutable functional key in the entity.
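To illustrate the equals/hashCode point, here is a minimal sketch, assuming the entity has an immutable, unique business key. Person and its email field are hypothetical and the JPA annotations are left out. With GenerationType.IDENTITY the id is only known after the insert, so a hashCode based on the id would change while the object may already sit in a HashSet; basing both methods on the business key avoids that.

```java
import java.util.Objects;

// Hypothetical entity; mapping annotations omitted.
class Person {
    private Long id;            // assigned by the database on insert, null before that
    private final String email; // assumed immutable, unique business key

    Person(String email) {
        this.email = Objects.requireNonNull(email);
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getEmail() { return email; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // instanceof rather than getClass(), so subclass proxies can still match.
        if (!(o instanceof Person)) return false;
        return email.equals(((Person) o).email);
    }

    @Override
    public int hashCode() {
        // Depends only on the business key, so it is stable across persist().
        return email.hashCode();
    }
}
```

If an entity has no such key, an IDE-generated equals/hashCode on the id alone is only safe as long as the object is never put into a hash-based collection before it has been saved.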
Furthermore, I suggest this link.

Problems that we use a BiMap to solve

I'm reviewing the capabilities of Google's Guava API and I ran into a data structure that I haven't seen used in my 'real world programming' experience, namely, the BiMap. Is the only benefit of this construct the ability to quickly retrieve a key for a given value? Are there any problems where the solution is best expressed using a BiMap?
Any time you want to be able to do a reverse lookup without having to populate two maps. For instance a phone directory where you would like to lookup the phone number by name, but would also like to do a reverse lookup to get the name from the number.
Louis mentioned the memory savings possible in a BiMap implementation. That's the only thing that you can't get by wrapping two Map instances. Still, if you let us wrap the Map instances for you, we can take care of a few edge cases. (You could handle all these yourself, but why bother? :))
If you call put(newKey, existingValue), we'll error out immediately to keep the two maps in sync, rather than adding the entry to one map before realizing that it conflicts with an existing mapping in the other. (We provide forcePut if you do want to override the existing value.) We provide similar safeguards for inserting null or other invalid values.
BiMap views keep the two maps in sync: If you remove an element from the entrySet of the original BiMap, its corresponding entry is also removed from the inverse. We do the same kind of thing in Entry.setValue.
We handle serialization: A BiMap and its inverse stay "connected," and the entries are serialized only once.
We provide a smart implementation of inverse() so that foo.inverse().inverse() returns foo, rather than a wrapper of a wrapper.
We override values() to return a Set. This set is identical to what you'd get from inverse().keySet() except that it maintains the same iteration order as the original BiMap.
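To make the "two maps kept in sync" idea concrete, here is a rough hand-rolled sketch using only java.util. SimpleBiMap is a hypothetical name and this is not Guava's implementation; it only mimics the put/forcePut behaviour described above, rejects nulls for simplicity, and has none of the view, serialization or inverse() handling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-in for a BiMap built from two HashMaps.
final class SimpleBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    // Fails before touching either map if the value belongs to another key.
    V put(K key, V value) {
        Objects.requireNonNull(key);
        Objects.requireNonNull(value);
        K owner = backward.get(value);
        if (owner != null && !owner.equals(key)) {
            throw new IllegalArgumentException("value already present: " + value);
        }
        V old = forward.put(key, value);
        if (old != null) {
            backward.remove(old);
        }
        backward.put(value, key);
        return old;
    }

    // Like put, but silently evicts whichever key currently owns the value.
    V forcePut(K key, V value) {
        Objects.requireNonNull(key);
        Objects.requireNonNull(value);
        V old = forward.get(key);
        K owner = backward.remove(value);
        if (owner != null) {
            forward.remove(owner);
        }
        if (old != null) {
            backward.remove(old);
        }
        forward.put(key, value);
        backward.put(value, key);
        return old;
    }

    V get(K key) { return forward.get(key); }       // lookup by key
    K getKey(V value) { return backward.get(value); } // reverse lookup by value
    int size() { return forward.size(); }
}
```

For the phone directory example, put("Alice", "555-0100") followed by getKey("555-0100") gives back "Alice", and a later put("Bob", "555-0100") fails without modifying either map.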

Java map content comparison

Here is a tricky data structure and data organization case.
I have an application that reads data from large files and produces objects of various types (e.g., Boolean, Integer, String) that are categorized in a few (less than a dozen) groups and then stored in a database.
Each object is currently stored in a single HashMap<String, Object> data structure. Each such HashMap corresponds to a single category (group). Each database record is built from the information in all the objects contained in all categories (HashMap data structures).
A requirement has appeared for checking whether subsequent records are "equivalent" in the number and type of columns, where equivalence must be verified across all maps by comparing the name (HashMap key) and the type (actual class) of each stored object.
I am looking for an efficient way of implementing this functionality, while maintaining the original object categorization, because listing objects by category in the fastest possible way is also a requirement.
An idea would be to just sort the keys (e.g., by replacing each HashMap with a TreeMap) and then walk over all maps. An alternative would be to just copy everything in a TreeMap for comparison purposes only.
What would be the most efficient way of implementing this functionality?
Also, how would you go about finding the difference (i.e., the fields added and those removed) between successive records?
Create a meta SortedSet in which you store all the created maps.
That means a SortedSet<Map<String,Object>>, e.g. a TreeSet which has a custom Comparator<Map<String,Object>> that checks exactly your requirements: same number and names of keys and same object type per value.
You can then use the contains() method of this meta set structure to find out if a similar record does already exist.
==== EDIT ====
Since I misunderstood the relation between database records and the maps in the first place, I have to change the semantics of my answer a little.
I would still use the mentioned SortedSet<Map<String,Object>>, but the Map<String,Object> would now point to the Map that you and havexy suggested.
On the other hand, it could be a step forward to use a Set<Set<KeyAndType>> or SortedSet<Set<KeyAndType>>, where KeyAndType contains only the key and the type, with an appropriate Comparable implementation or equals() and hashCode().
Why? You asked how to find the differences between two records? If each record relates to one of those inner Set<KeyAndType> you can easily use retainAll() to form the intersection of two successive Sets.
If you compare this to the idea of a SortedSet<Map<String,Object>>, both approaches put the logic that differentiates between fields inside the comparator, once comparing inner sets and once comparing inner maps. Since this information is lost once the surrounding set is constructed, it will be hard to get the differences between two records later on unless you keep another, reduced structure that makes such differences easy to find. A Set<KeyAndType> could act both as a key and as an easy basis for comparing two records, so it is a good candidate for both purposes.
If you also want to keep the relation between such a Set<KeyAndType> and your record or group of Map<String,Object>, your meta structure could be something like:
Map<Set<KeyAndType>,DatabaseRecord> or Map<Set<KeyAndType>,GroupOfMaps> implemented by a simple LinkedHashMap which allows simple iteration in original order.
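A minimal sketch of the Set<KeyAndType> idea follows. It assumes that stored values are never null and that key names are unique across the category maps of one record (otherwise the category would need to become part of the key). RecordShape and its method names are hypothetical helpers.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Holds only what "equivalence" looks at: the field name and the value's class.
final class KeyAndType {
    final String key;
    final Class<?> type;

    KeyAndType(String key, Class<?> type) {
        this.key = Objects.requireNonNull(key);
        this.type = Objects.requireNonNull(type);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof KeyAndType)) return false;
        KeyAndType other = (KeyAndType) o;
        return key.equals(other.key) && type.equals(other.type);
    }
    @Override public int hashCode() { return Objects.hash(key, type); }
    @Override public String toString() { return key + ":" + type.getSimpleName(); }
}

// Hypothetical helper that reduces a record (all its category maps) to a shape.
final class RecordShape {
    private RecordShape() {}

    static Set<KeyAndType> of(Iterable<Map<String, Object>> categories) {
        Set<KeyAndType> shape = new HashSet<>();
        for (Map<String, Object> category : categories) {
            for (Map.Entry<String, Object> e : category.entrySet()) {
                // Assumes non-null values; a null would need a sentinel type.
                shape.add(new KeyAndType(e.getKey(), e.getValue().getClass()));
            }
        }
        return shape;
    }

    // Fields present in next but not in previous.
    static Set<KeyAndType> added(Set<KeyAndType> previous, Set<KeyAndType> next) {
        Set<KeyAndType> result = new HashSet<>(next);
        result.removeAll(previous);
        return result;
    }

    // Fields present in previous but not in next.
    static Set<KeyAndType> removed(Set<KeyAndType> previous, Set<KeyAndType> next) {
        Set<KeyAndType> result = new HashSet<>(previous);
        result.removeAll(next);
        return result;
    }
}
```

Two records are then "equivalent" when their shapes are equal (Set.equals), and the original category maps stay untouched, so listing objects by category is as fast as before. Building the shape is one pass over all entries, with no sorting required.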
One solution is to keep both the category-based HashMaps and a combined TreeMap. This needs slightly more memory, though not much, as you'll just keep the same references in both.
So whenever you are adding/removing to HashMap you will do the same operation in the TreeMap too. This way both will always be in sync.
You can then use TreeMap for comparison, whether you want comparison of type of object or actual content comparison.

db4o to preserve identity of objects

Is there a way to preserve an object's identity in db4o?
Suppose I store a BigDecimal in embedded db4o.
When I read it twice I get two distinct objects with the same value (which is quite obvious).
Is there any setting to force db4o to cache query results so that two queries would return reference to the same instance, or do I have to do it myself ?
From my experience, running the same query twice on the same ObjectContainer should return the same (identical) objects each time.
You should not close and reopen the ObjectContainer between the queries, if you need the objects' identity.
Db4o does use IDs and UUIDs internally and you can access those if needed. Also worth reading is this.
You can give each of your objects an id by using its UUID: add an ID attribute to the object, assign it a UUID value and store it. To update an object, retrieve it by that ID and update it.

Is it valid for Hibernate list() to return duplicates?

Is anyone aware of the validity of Hibernate's Criteria.list() and Query.list() methods returning multiple occurrences of the same entity?
Occasionally I find when using the Criteria API, that changing the default fetch strategy in my class mapping definition (from "select" to "join") can sometimes affect how many references to the same entity can appear in the resulting output of list(), and I'm unsure whether to treat this as a bug or not. The javadoc does not define it, it simply says "The list of matched query results." (thanks guys).
If this is expected and normal behaviour, then I can de-dup the list myself, that's not a problem, but if it's a bug, then I would prefer to avoid it, rather than de-dup the results and try to ignore it.
Anyone got any experience of this?
Yes, getting duplicates is perfectly possible if you construct your queries so that this can happen. See for example Hibernate CollectionOfElements EAGER fetch duplicates elements
I also started noticing this behavior in my Java API as it started to grow. Glad there is an easy way to prevent it. As a matter of practice I've started appending:
.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)
To all of my criteria that return a list. For example:
List<PaymentTypeAccountEntity> paymentTypeAccounts = criteria()
.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY)
.list();
If you have an object which has a list of sub objects on it, and your criteria joins the two tables together, you could potentially get duplicates of the main object.
One way to ensure that you don't get duplicates is to use a DistinctRootEntityResultTransformer. The main drawback to this is if you are using result set buffering/row counting. The two don't work together.
I had the exact same issue with the Criteria API. The simple solution for me was to set distinct to true on the query, like this:
CriteriaQuery<Foo> query = criteriaBuilder.createQuery(Foo.class);
query.distinct(true);
Another option that came to mind would be to simply pass the resulting list to a Set, which by definition holds only a single instance of each object.
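The Set-based de-duplication can be sketched with plain java.util, no Hibernate types involved. Dedup and distinctInOrder are hypothetical names. A LinkedHashSet is used so the first occurrence of each element keeps its position, which matters if the query was ordered. It relies on the elements' equals/hashCode; my understanding is that within a single Session the duplicates are references to the same instance, in which case the default identity-based methods are enough.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper for de-duplicating a query result in memory.
final class Dedup {
    private Dedup() {}

    // Drops repeated elements while keeping first-occurrence order.
    static <T> List<T> distinctInOrder(List<T> results) {
        return new ArrayList<>(new LinkedHashSet<>(results));
    }
}
```

Usage would look like List<Foo> unique = Dedup.distinctInOrder(criteria.list()); note that, like the result transformer, this happens after the rows were fetched, so it does not combine well with paging.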
