Saving order of a List in JPA - java

I have the following question about JPA:
Can I save the order of the elements in a java.util.List? In my application the order in which I put elements in the Lists is important but after I get those collections from the database the order is not the same (as expected). Can you show me a way to deal with this problem?
P.S. There is no field in the entities I put in the collections by which I could order them.
Rosen

There are some hacky ways of doing this in JPA 1, but it's easiest to switch to a JPA 2 provider. The @OrderColumn annotation is what you're looking for. EclipseLink has an OK tutorial on how to use it.

JPA has two types of List. In JPA 1 there is only the "ordered list" (which is what you see: the ordering is defined by some SQL clause). In JPA 2 you can have "ordered lists" or, alternatively, "indexed lists" (where the order of creation is preserved), which is what the @OrderColumn annotation referred to provides. Any implementation of JPA 2 has to support this, e.g. DataNucleus.
JDO has had indexed lists since day 1.

You can save the order of the elements in a java.util.List. In JPA 2.0, a good way to do this is the @OrderColumn annotation.
For details, you can refer to this link:
Order Column (JPA 2.0)
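As a rough illustration of the @OrderColumn approach described in the answers above, here is a minimal mapping sketch. The entities Playlist and Track and the column name track_index are hypothetical, and the javax.persistence package assumes a JPA 2.x provider on the classpath (newer providers use jakarta.persistence instead).

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;
import javax.persistence.OrderColumn;

// Hypothetical entities, for illustration only.
@Entity
public class Playlist {

    @Id
    @GeneratedValue
    private Long id;

    // The provider keeps each element's list position in the "track_index"
    // column and rebuilds the list in that order when it is loaded.
    @OneToMany
    @OrderColumn(name = "track_index")
    private List<Track> tracks = new ArrayList<>();

    public List<Track> getTracks() {
        return tracks;
    }
}

@Entity
class Track {

    @Id
    @GeneratedValue
    private Long id;
}
```

With a mapping like this, Track itself needs no field to sort by: the position is maintained by the provider from the element's index in the list.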

Related

Hint HINT_PASS_DISTINCT_THROUGH reduces the amount of Entities returned per page for a PageRequest down to below the configured page size (PostgreSQL)

I'm setting up a JPA Specification based repository implementation that uses JPA specifications (constructed from RSQL filter strings) to filter the results, define result ordering and remove, via "distinct", any duplicates that would otherwise be returned due to joined tables. The JPA Specification builder method joins several tables and sets the "distinct" flag:
final Join<Object, Object> rootJoinedTags = root.join("tags", JoinType.LEFT);
final Join<Object, Object> rootJoinedLocations = root.join("location", JoinType.LEFT);
...
query.distinct(true);
To allow sorting by joined table columns, I've applied the "HINT_PASS_DISTINCT_THROUGH" hint to the relevant repository method (otherwise, sorting by joined table columns returns an error along the lines of "sort column must be included in the SELECT DISTINCT query").
@QueryHints(value = {
    @QueryHint(name = org.hibernate.jpa.QueryHints.HINT_PASS_DISTINCT_THROUGH, value = "false")
})
Page<SomeEntity> findAll(@Nullable Specification<SomeEntity> spec, Pageable pageable);
The arguments for said repository method are constructed as such:
final Sort sort = getSort(searchFilter);
final Specification spec = getSpecificationIfPresent(searchFilter);
final PageRequest pageRequest = PageRequest.of(searchFilter.getPageNumber(), searchFilter.getLimit(), sort);
return eventRepository.findAll(spec, pageRequest);
After those changes, filtering and sorting seem to work as expected. However, the hint seems to cause "distinct" filtering to be applied after the result page is already constructed, thus reducing the number of returned entities in the page from the configured "size" PageRequest argument to whatever is left after the duplicates are filtered out. For example, if we make a PageRequest with "page=0" and "pageSize=10", the resulting Page may return only 5 "SomeEntity" instances, although the database contains many more entries (177 entities, to be exact, in this case). If I remove the hint, the number of returned entities is correct again.
Question: is there a way to make the same Specification query setup work with correctly sized Pages (some other hints that might be added to have duplicate filtering performed before the Page object is constructed)? If not, is there another approach I could use to achieve the required Specification-based filtering, with joined-column sorting and duplicate removal as with "distinct"?
PS: PostgreSQL is the database behind the application in question.
The problem you are experiencing has to do with the way you are using the HINT_PASS_DISTINCT_THROUGH hint.
This hint lets you tell Hibernate that the DISTINCT keyword should not be used in the SELECT statement issued against the database.
You are taking advantage of this fact to allow your queries to be sorted by a field that is not included in the DISTINCT column list.
But that is not how this hint should be used.
This hint must only be used when you are sure that applying the DISTINCT keyword to the SQL SELECT statement would make no difference, because the statement already fetches only distinct rows by itself. The idea is to improve query performance by avoiding an unnecessary DISTINCT.
This is usually the case when you use the query.distinct method in your criteria queries and you are join fetching child relationships. This great article by @VladMihalcea explains how the hint works in detail.
On the other hand, when you use paging, OFFSET and LIMIT (or something similar, depending on the underlying database) are set in the SQL SELECT statement issued against the database, limiting your query to a maximum number of results.
As stated, if you use the HINT_PASS_DISTINCT_THROUGH hint, the SELECT statement will not contain the DISTINCT keyword and, because of your joins, it may return duplicate records of your main entity. Hibernate will then process these records to detect duplicates, because you are using query.distinct, and will remove them if needed. I think this is the reason why you may get fewer records than requested in your Pageable.
If you remove the hint, the DISTINCT keyword is passed in the SQL statement sent to the database and, as long as you only project information from the main entity, the database will return as many records as LIMIT indicates. This is why you then always get the requested number of records.
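The effect can be illustrated with a toy in-memory model. This is not Hibernate code: the list below simply stands in for the parent ids of a joined result set, and the class and method names are made up for the sketch. It compares limiting the raw joined rows and then removing duplicates (the hint set to "false") with removing duplicates before the limit (DISTINCT executed by the database).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class ShortPageDemo {

    // Each element is the parent id of one row in the joined result set:
    // parent 1 has three tags, parent 2 has two, parents 3 to 6 have one each.
    static final List<Integer> JOINED_ROWS = List.of(1, 1, 1, 2, 2, 3, 4, 5, 6);

    // Models LIMIT applied to the raw joined rows, followed by
    // duplicate removal in memory.
    static List<Integer> limitThenDeduplicate(int pageSize) {
        List<Integer> page = JOINED_ROWS.subList(0, Math.min(pageSize, JOINED_ROWS.size()));
        return new ArrayList<>(new LinkedHashSet<>(page));
    }

    // Models SELECT DISTINCT ... LIMIT: duplicates are removed before the limit.
    static List<Integer> deduplicateThenLimit(int pageSize) {
        List<Integer> distinct = new ArrayList<>(new LinkedHashSet<>(JOINED_ROWS));
        return distinct.subList(0, Math.min(pageSize, distinct.size()));
    }

    public static void main(String[] args) {
        System.out.println(limitThenDeduplicate(5)); // [1, 2]
        System.out.println(deduplicateThenLimit(5)); // [1, 2, 3, 4, 5]
    }
}
```

With a page size of 5, the first variant yields only two parents because the five rows it kept all belong to parents 1 and 2, which matches the short pages described in the question.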
You can try to fetch join your child entities (instead of only joining with them). This eliminates the problem of not being able to sort by a field that is missing from the DISTINCT column list and, in addition, you will now be able to apply the hint legitimately.
But if you do so, you will face another problem: if you use join fetch together with pagination to return the main entities and their collections, Hibernate will no longer apply pagination at the database level (it will not include the OFFSET or LIMIT keywords in the SQL statement) and will try to paginate the results in memory. This is the famous Hibernate HHH000104 warning:
HHH000104: firstResult/maxResults specified with collection fetch; applying in memory!
@VladMihalcea explains that in great detail in the last part of this article.
He also proposes one possible solution to your problem: window functions.
In your use case, instead of using Specifications, the idea is that you implement your own DAO. This DAO only needs access to the EntityManager, which is not a big deal as you can inject your @PersistenceContext:
@PersistenceContext
protected EntityManager em;
Once you have this EntityManager, you can create native queries and use window functions to build, based on the provided Pageable information, the right SQL statement to issue against the database. This will give you a lot more freedom over which fields to use for sorting, or whatever else you need.
As the last cited article indicates, window functions are supported by all major databases.
In the case of PostgreSQL, you can easily find them in the official documentation.
Finally, one more option, suggested by @nickshoe and explained in great detail in the article he cited, is to perform the sorting and paging in two phases. In the first phase, you create a query that references your child entities and applies paging and sorting. This query identifies the ids of the main entities, which are then used in the second phase to obtain the main entities themselves.
You can take advantage of the aforementioned custom DAO to accomplish this process.
It may be an off-topic answer, but it may help you.
You could try to tackle this problem (pagination of parent-child entities) by separating the query in two parts:
a query for retrieving the ids that match the given criteria
a query for retrieving the actual entities by the resulting ids of the previous query
I came across this solution in this blog post: https://vladmihalcea.com/fix-hibernate-hhh000104-entity-fetch-pagination-warning-message/
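A minimal sketch of the two queries listed above, written against the plain JPA EntityManager API. Everything specific here is an assumption for illustration: the DAO class, the shape of SomeEntity (a Long id, a name used for sorting, a tags element collection), and the JPQL strings. It needs a JPA provider such as Hibernate on the classpath and is not taken from the linked post verbatim.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.persistence.ElementCollection;
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.PersistenceContext;

// Hypothetical DAO illustrating the two-query approach.
public class SomeEntityDao {

    @PersistenceContext
    protected EntityManager em;

    public List<SomeEntity> findPage(int pageNumber, int pageSize) {
        // Query 1: page over parent ids only, so OFFSET/LIMIT count parents,
        // not joined rows.
        List<Long> ids = em.createQuery(
                "select e.id from SomeEntity e order by e.name, e.id", Long.class)
            .setFirstResult(pageNumber * pageSize)
            .setMaxResults(pageSize)
            .getResultList();

        if (ids.isEmpty()) {
            return Collections.emptyList();
        }

        // Query 2: load those parents together with their children. No paging
        // is applied here, so the collection fetch does not trigger HHH000104.
        return em.createQuery(
                "select distinct e from SomeEntity e left join fetch e.tags "
                    + "where e.id in (:ids) order by e.name, e.id", SomeEntity.class)
            .setParameter("ids", ids)
            .getResultList();
    }
}

// Assumed entity shape; the real SomeEntity from the question is not shown there.
@Entity
class SomeEntity {

    @Id
    @GeneratedValue
    Long id;

    String name;

    @ElementCollection
    Set<String> tags = new HashSet<>();
}
```

Any filter on child rows would have to go into the first query in a form that still returns one row per parent (for example an EXISTS subquery), otherwise the page size would again count joined rows.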

how to search similar entities in database using Example class from hibernate

I know that there is a Hibernate class called Example that we can use to find similar entities when searching, but does this class allow searching for entities in a fuzzy way?
To explain: if I build an example entity with a property called name whose value is "myname", is Hibernate capable of returning an entity whose property has the value "mname"?
Yes that's possible but to enable text-level similarity you need a Lucene index to speed-up the query, as it would otherwise be extremely inefficient to run on a relational database.
This is provided by Hibernate Search, the extension of Hibernate to integrate with Lucene and manage the indexes transparently.

List vs Set on JPA 2 - Pros / Cons / Convenience

I have tried searching on Stack Overflow and at other websites the pros, cons and conveniences about using Sets vs Lists but I really couldn't find a DEFINITE answer for when to use this or that.
From Hibernate's documentation, they state that non-duplicate records should go into Sets and, from there, you should implement your hashCode() and equals() for every single entity that could be wrapped into a Set. But then it comes to the price of convenience and ease of use as there are some articles that recommend the use of business-keys as every entity's id and, from there, hashCode() and equals() could then be perfectly implemented for every situation regardless of the object's state (managed, detached, etc).
It's all fine, all fine... until I come across lots of situations where using Sets is just not workable, such as ordering (though Hibernate gives you the idea of SortedSet), the convenience of collectionObj.get(index), collectionObj.remove(int location || Object obj), Android's architecture of ListView/ExpandableListView (GroupIds, ChildIds) and so on... My point is: Sets are just really bad (imho) to manipulate and make work 100%.
I am tempted to change every single collection of my project to List as they work very well. The IDs for all my entities are generated through MySQL's auto-generated sequence (@GeneratedValue(strategy = GenerationType.IDENTITY)).
Is there anyone out there who could definitively clear up all these little details mentioned above for me?
Also, is it doable to use Eclipse's auto-generated hashCode() and equals() for the ID field for every entity? Will it be effective in every situation?
Thank you very much,
Renato
List versus Set
Duplicates allowed
Lists allow duplicates and Sets do not allow duplicates. For some this will be the main reason for them choosing List or Set.
Multiple bags exception: multiple eager fetches in the same query
One notable difference in the handling of Hibernate is that you can't fetch two different lists in a single query.
It will throw a MultipleBagFetchException ("cannot simultaneously fetch multiple bags"). With sets, there is no such issue.
A list, if there is no index column specified, will just be handled as a bag by Hibernate (no specific ordering).
@OneToMany
@OrderBy("lastname ASC")
public List<Rating> ratings;
For example, if you have a Person entity having a list of contacts and a list of addresses, you won't be able to use a single query to load persons with all their contacts and all their addresses. The solution in this case is to make two queries (which avoids the cartesian product), or to use a Set instead of a List for at least one of the collections.
It's often hard to use Sets with Hibernate when you have to define equals and hashCode on the entities and don't have an immutable functional key in the entity.
Furthermore, I suggest this link.
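To make the equals/hashCode difficulty concrete, here is a small plain-Java sketch (no JPA involved). The Rating class is a hypothetical stand-in for an entity whose id is only assigned when it is persisted, with the kind of id-based equals and hashCode an IDE would generate.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class IdHashDemo {

    // Plain class standing in for an entity whose id is assigned on persist.
    static class Rating {
        Long id; // null until the database generates it

        // equals/hashCode built from the id field alone.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Rating)) return false;
            return Objects.equals(id, ((Rating) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }
    }

    // Adds a new instance to a HashSet, then simulates the id being
    // generated, and reports whether the set can still find the instance.
    static boolean foundAfterIdAssigned() {
        Set<Rating> ratings = new HashSet<>();
        Rating rating = new Rating();
        ratings.add(rating);  // stored under hash 0 because id is still null
        rating.id = 42L;      // what persisting the entity would do
        return ratings.contains(rating);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterIdAssigned()); // false
    }
}
```

The hash code changes once the id is set, so the HashSet looks for the element under a different hash and misses it. This is the situation the answer alludes to when there is no immutable functional key to base equals and hashCode on.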

Order of items in List relationship

This may be super easy, but I can't find any clear statement about it in the JPA specs. If I have a List-based relationship without an @OrderBy annotation, e.g.:
@OneToMany
List<Child> children;
then what will be the order of this list's elements in Java? It seems reasonable that it will be the order of the corresponding records in the Child table, or of the entries in the intermediate table if it's many-to-many, but is that guaranteed behavior of JPA providers?
Order is not guaranteed by the specification as far as I know. @OrderBy is the way to go if you depend on the order.
EDIT: Quote from JPA 1.0 spec:
Portable applications should not expect the order of lists to be maintained across persistence contexts unless the OrderBy construct is used and the modifications to the list observe the specified ordering. The order is not otherwise persistent.
(Page 19, Footnote [4])
Also, when your entity has no children, the JPA specs don't specify whether you get an empty list or null when you retrieve your List, so be sure to check for null to avoid a NullPointerException.
There is no guaranteed behavior; the order can be different each time. It happened to me with Hibernate: the order changed when I refreshed the page.
You are able to order the search result when you search, not in the definition of the class and its properties.

Is it possible to remove order from Hibernate Criteria?

If I have an @OrderBy("someProperty") annotation on an object and then use a Criteria to add an ORDER BY clause like so:
criteria.addOrder(Order.asc("id"));
The resulting SQL will do the ordering like this:
ORDER BY someProperty, id asc
Is it possible to change the order of the two or to remove the someProperty order? I can't remove the @OrderBy annotation and I'm using Hibernate for Java.
Criteria has no methods for removing an Order or a Criterion.
The Order class is very limited: you can only use property names, and it generates standard, portable SQL.
The OrderBy annotation is a SQL order, as its javadoc states: OrderBy
That means you can use any SQL there (even SQL exclusive to your database vendor). Take a look at this article: Sorting Collections in Hibernate Using SQL in @OrderBy
Adding a SQL fragment into your domain class isn't necessarily the most elegant thing in the world, but it might be the most pragmatic thing.
It may be possible to remove particular ordering via criteria.iterateOrderings() iterator, but I'm not sure how it works with annotations.
