How to use natural sort with Spring Data Jpa

I have a table column that I want to order on. The problem is that the column value contains both numbers and text. For example, the result is now ordered like this.
1. group one
10. group ten
11. group eleven
2. group two
But I'd like the result to be ordered naturally, like this
1. group one
2. group two
10. group ten
11. group eleven
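For reference, the ordering asked for here can be pinned down with a small in-memory comparator. This is only a sketch of what "natural" means in this question: the class and method names are made up, and sorting in memory does not help with paged queries, where the database has to do the ordering.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Spring Data, Hibernate or Oracle.
final class NaturalOrderSketch {

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Compares digit runs by numeric value and everything else character by character.
    static final Comparator<String> NATURAL = (a, b) -> {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i);
            char cb = b.charAt(j);
            if (isAsciiDigit(ca) && isAsciiDigit(cb)) {
                int startA = i;
                int startB = j;
                while (i < a.length() && isAsciiDigit(a.charAt(i))) i++;
                while (j < b.length() && isAsciiDigit(b.charAt(j))) j++;
                int cmp = new BigInteger(a.substring(startA, i))
                        .compareTo(new BigInteger(b.substring(startB, j)));
                if (cmp != 0) return cmp;
            } else {
                if (ca != cb) return Character.compare(ca, cb);
                i++;
                j++;
            }
        }
        // Whichever string still has characters left sorts last.
        return Integer.compare(a.length() - i, b.length() - j);
    };

    // Returns a naturally sorted copy and leaves the input untouched.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(NATURAL);
        return copy;
    }
}
```

Calling NaturalOrderSketch.sorted(...) on the four group names above puts "2. group two" before "10. group ten".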
When looking at the Spring configuration I can't seem to find an option that allows you to do this. I use the Spring Pageable class to set my order field and direction, with an additional use of JPA specifications. The method itself returns a Page with the first 20 results by default.
I don't believe that Oracle supports Natural ordering out of the box so I have a stored procedure for that. Running the query using this procedure I get the desired result.
select ... from group order by NATURALSORT(group.name) asc
As you might expect, I'd like to use that procedure by wrapping it around every ordered column that contains text, while still using pageables/pages and specifications. The research I have done so far points me to a solution that might include:
Either creating and implementing a custom Specification
Or extending the SimpleJpaRepository to change the way the Sort object is transformed.
But I didn't seem to find a method that allows me to set the order using native SQL. The only way I found to set the order was by calling orderBy and including an Order object.
So in general:
Is there a way to globally enable natural ordering when using Spring Data JPA, Hibernate and the Oracle database?
If not, how can I wrap a single order by column with my stored procedure while still being able to use the Page findAll(Pageable, Specifications) method?
Thanks in advance.

After some digging into the source code of both Spring JPA and Hibernate I managed to find a solution to my problem. I'm pretty sure this isn't a nice way to solve it, but it's the only one I could find.
I ended up implementing a wrapper for the 'order by' part of the query by extending the SingularAttributePath class. This class has a render method that generates the string which gets inserted into the actual query. My implementation looks like this
@Override
public String render(RenderingContext renderingContext) {
    String render = super.render(renderingContext);
    render = "MYPACKAGE.NSORT(" + render + ")";
    return render;
}
Next I extended the Order conversion functionality in the SimpleJpaRepository class. By default this is done by calling QueryUtils.toOrders(sort, root, builder). But since the method calling it was impossible to override, I ended up calling the toOrders method myself and altering the result to my liking.
This means replacing all orders in the result by my custom implementation of the SingularAttributePath class. As an extra I extended the Sort class which is used by the Pageable class to have control over what gets wrapped and what doesn't (called NaturalOrder). But I'll get to that in a second. My implementation looks close to this (some checks are left out)
// Call the original method to convert the orders
List<Order> orders = QueryUtils.toOrders(sort, root, builder);
List<Order> resultList = new ArrayList<>();
for (Order order : orders) {
    // Fetch the original order object from the sort for comparing
    SingularAttributePath orderExpression = (SingularAttributePath) order.getExpression();
    Sort.Order originalOrder = sort.getOrderFor(orderExpression.getAttribute().getName());
    // Check if the original order object is an instance of my custom order class
    // and whether the order should be natural
    if (originalOrder instanceof NaturalSort.NaturalOrder && ((NaturalSort.NaturalOrder) originalOrder).isNatural()) {
        // Replace the order with the custom class
        Order newOrder = new OrderImpl(new NaturalSingularAttributePathImpl(builder, orderExpression.getJavaType(), orderExpression.getPathSource(), orderExpression.getAttribute()));
        resultList.add(newOrder);
    } else {
        resultList.add(order);
    }
}
return resultList;
The returned list then gets added to the query by calling query.orderBy(resultList). That's it for the back-end.
In order to control the wrap condition I also extended the Sort class used by the Pageable (mentioned this a few lines back). The only functionality I wanted to add was to have 4 types in the Direction enum.
ASC (default ascending)
DESC (default descending)
NASC (natural ascending)
NDESC (natural descending)
The last two values only act as placeholders. They set the isNatural boolean (a field of the extended Order class) which gets used in the condition. At the time they are converted to a query they are mapped back to their default variants.
public Direction getNativeDirection() {
if (this == NaturalDirection.NASC)
return Direction.ASC;
if (this == NaturalDirection.NDESC)
return Direction.DESC;
return Direction.fromString(String.valueOf(this));
}
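To make the placeholder idea easier to follow in isolation, here is a minimal standalone sketch. It does not extend Spring Data's Sort.Direction (which only defines ASC and DESC); NaturalDirection, isNatural, toNativeDirection and fromString are hypothetical names chosen to mirror the description above.

```java
import java.util.Locale;

// Hypothetical stand-in for the extended direction type described above.
enum NaturalDirection {
    ASC(false, false),
    DESC(false, true),
    NASC(true, false),
    NDESC(true, true);

    private final boolean natural;
    private final boolean descending;

    NaturalDirection(boolean natural, boolean descending) {
        this.natural = natural;
        this.descending = descending;
    }

    // True for the placeholder values that request wrapping with the sort function.
    boolean isNatural() {
        return natural;
    }

    // Maps NASC/NDESC back to the plain direction that ends up in the query.
    NaturalDirection toNativeDirection() {
        return descending ? DESC : ASC;
    }

    // Parses the direction part of a request parameter such as "name,nasc".
    static NaturalDirection fromString(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

With this shape, NaturalDirection.fromString("nasc") reports isNatural() as true and maps back to ASC for the actual ORDER BY clause.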
Lastly I replaced the SortHandlerMethodArgumentResolver used by the PageableHandlerMethodArgumentResolver. The only thing this does is create instances of my NaturalSort class and pass them into the Pageable object, instead of the default Sort class.
In the end I'm able to call the same REST endpoint, while the result differs in the way it's sorted.
Default Sorting
/api/v1/items?page=0&size=20&sort=name,asc
Natural Sorting
/api/v1/items?page=0&size=20&sort=name,nasc
I hope this solution can help those who have the same or a derived problem regarding natural sort and Spring Data JPA. If you have any questions or improvements, please let me know.

I am not aware of any such feature. You could however store the numbers in a separate column, and then order by that column, which should give a better sorting performance as an additional benefit.
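A rough sketch of how such a separate column could be filled: extract the leading number when the entity is saved and store it in a numeric column, then sort on that column first and on the name second. The class name, method name and the idea of a dedicated numeric column are assumptions for illustration only.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for deriving a numeric sort key from a name.
final class SortKeySketch {

    // Up to 18 leading digits, so the value always fits into a long;
    // longer digit runs are truncated to their first 18 digits.
    private static final Pattern LEADING_NUMBER = Pattern.compile("^\\s*(\\d{1,18})");

    // Returns the leading number of a name such as "10. group ten", if there is one.
    static OptionalLong leadingNumber(String name) {
        Matcher matcher = LEADING_NUMBER.matcher(name);
        return matcher.find()
                ? OptionalLong.of(Long.parseLong(matcher.group(1)))
                : OptionalLong.empty();
    }
}
```

This only covers names that start with a number; text with digits in the middle would need a different key.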

Related

How to make a Hibernate SearchSession return results with unique attributes?

I am working on using the Hibernate SearchSession class in Java to perform a search against a database, the code I currently have to search a table looks something like this:
SearchSession searchSession = Search.session(entityManagerFactory.unwrap(SessionFactory.class).withOptions()
.tenantIdentifier("locations").openSession());
SearchResult<Location> result = searchSession.search(Location.class)
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching((phrase)).fuzzy())
).fetch(page * limit, limit);
This search works and properly returns results from the database, but there is no uniqueness constraint on the locationName column and the database holds multiple records with the same value in locationName. As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before? Applying a uniqueness constraint to the database table isn't an option in this scenario, and we were hoping there's a way to handle filtering out duplicate values in the session over taking the results from the search and removing duplicate values separately.

Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before?
Not really, at least not at the moment.
If you're using the Elasticsearch backend and are fine with going native, you can insert native JSON into the Elasticsearch request, in particular collapsing.
I think something like this might work:
SearchResult<Location> result = searchSession.search( Location.class )
.extension( ElasticsearchExtension.get() )
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching((phrase)).fuzzy())
)
.requestTransformer( context -> {
JsonObject collapse = new JsonObject();
collapse.addProperty("field", "locationName_keyword");
JsonObject body = context.body();
body.add( "collapse", collapse );
} )
// You probably need a sort, as well:
.sort(f -> f.field("id"))
.fetch( page * limit, limit );
You will need to add a locationName_keyword field to your Location entity:
@Indexed
@Entity
public class Location {
    // ...
    @Id
    @GenericField(sortable = Sortable.YES) // Add this
    private Long id;
    // ...
    @FullTextField
    @KeywordField(name = "locationName_keyword", sortable = Sortable.YES) // Add this
    private String locationName;
    // ...
}
(You may need to also assign a custom normalizer to the locationName_keyword field, if the duplicate locations have a slightly different locationName (different case, ...))
Note however that the "total hit count" in the Search result will indicate the number of hits before collapsing. So if there's only one matching locationName, but 5 Location instances with that name, the total hit count will be 5, but users will only see one hit. They'll be confused for sure.
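To illustrate the mismatch between the hit count and what users see, here is a small in-memory analogue of collapsing: keep the first hit per key and drop the rest. It is only a sketch of the effect, not how Elasticsearch implements collapsing, and CollapseSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper that mimics the visible effect of collapsing on one field.
final class CollapseSketch {

    // Keeps the first hit for every key and preserves the original hit order.
    static <T, K> List<T> collapse(List<T> hits, Function<? super T, ? extends K> key) {
        Map<K, T> firstPerKey = new LinkedHashMap<>();
        for (T hit : hits) {
            firstPerKey.putIfAbsent(key.apply(hit), hit);
        }
        return new ArrayList<>(firstPerKey.values());
    }
}
```

If the search matched four documents and three of them share a name, the collapsed list has two entries while the total hit count still says four.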
That being said, it might be worth having another look at your situation to determine whether collapsing is really necessary here:
As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
If you have multiple documents with the same locationName, then surely you have multiple rows in the database with the same locationName? Duplication doesn't appear spontaneously when indexing.
I would say the first thing to do would be to step back, and consider whether you really want to query the Location entity, or if another, related entity wouldn't make more sense. When two locations have the same name, do they have a relationship to another, common entity instance (e.g. of type Shop, ...)?
=> If so, you should probably query that entity type instead (.search(Shop.class)), and take advantage of @IndexedEmbedded to allow filtering based on Location properties (i.e. add @IndexedEmbedded to the location association in the Shop entity type, then use the field location.locationName when adding a predicate that should match the location name).
If there is no such related, common entity instance, then I would try to find out why locations are duplicated exactly, and more importantly why that duplication makes sense in the database, but not to users:
Are the users not interested in all the locations? Then maybe you should add another filter to your query (by "type", ...) that would help remove duplicates. If necessary, you could even run multiple search queries: first one with very strict filters, and if there are no hits, fall back to another one with less strict filters.
Are you using some kind of versioning or soft deletion? Then maybe you should avoid indexing soft-deleted entities or older versions; you can do that with conditional indexing or, if that doesn't work, with a filter in your search query.
If your data really is duplicated (legacy database, ...) without any way to pick a duplicate over another except by "just picking the first one", you could consider whether you need an aggregation instead of full-blown search. Are you just looking for the top location names, or maybe a count of locations by name? Then aggregations are the right tool.

How does Spring Data decide what is returned by FindBy

How does Spring Data's findBy decide which database record to return if there are multiple matches?
I realised that if I have more than one entry in my Elasticsearch database with the same attribute code (i.e. "123"), Spring only returns one entry when I call a 'findByAttributeCode'.
If I use a findById, it's self-explanatory as IDs are unique; however, with other findBys there can be many matches. Note: attributeCode is NOT unique.
How does Spring decide which one to return?
My call would be something like this:
Attribute attribute = findByAttributeCode(attributeCode);
The repo would look like this:
public interface AttributeRepository extends ElasticsearchRepository<Attribute, String> {
Attribute findByAttributeCode(String attributeCode);
}

This is taken from the return type that you define for your function. If you specify a collection, all matching documents are returned for your query.
If you define a single object as return type, the first entry returned from the underlying store - here Elasticsearch - is returned. What this first entry is, depends on your query criteria, sort parameters - whatever Elasticsearch returns first, is returned to the caller.

What you should be doing, if there is more than one possibility, is creating the method stub like this:
Iterable<Attribute> findByAttributeCode(String attributeCode);
This way you return them all. If you don't do that, you are beholden to the underlying store for which single entry comes back out of the multiple tuples matched by the generated query, which in SQL terms would be something like:
select * from table where attributeCode = ?;

(Spring boot) can Optional<> Class be like List<> Class?

I'm trying to use the RoomEntity class as the generic parameter of List, but the List class turns red (error) and the only thing the IDE suggests is to change List to Optional.
public interface RoomRepository extends CrudRepository<RoomEntity, Long> {
List<RoomEntity> findById(Long id);
}
RoomEntity Class
@Entity
@Table(name = "Room")
public class RoomEntity {
}
are they the same?
List<RoomEntity> findById(Long id);
Optional<RoomEntity> findById(Long id);

Optional and List are two very different concepts.
The CrudRepository findAllById method returns an Iterable.
The findById method returns an Optional.
An iterable can be used to access one by one the elements of a collection and so can be used in conjunction with the List class.
An Optional is a container object which may or may not contain a non-null value (zero or one elements). This will have a single element in it, not many elements like in a List, if there is one.
CrudRepository::findAllById can have more than one ID sent to it and so can return more than one element; it does this by returning an Iterable you can use to go through the returned results one by one. The findById method can only be sent a single ID and so returns that element if it is present (wrapped in an Optional), or Optional.empty() if it is not.
If you are looking for a list to be returned because you intend to send in multiple IDs then use the findAllById method. If you want a specific element by ID only, then use the findById method but then you will have to unwrap the Optional object it is returned in before you can use it outside of a stream pipeline using Optional.get, Optional.isPresent, or using a map or flatmap call over it in a streams pipeline.
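A short sketch of the unwrapping step, using a plain Map as a stand-in for the repository so it runs without Spring. The class and method names are hypothetical; only the java.util.Optional calls are the real API.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a repository lookup, to show how an Optional is unwrapped.
final class OptionalSketch {

    // Mimics findById: present when the id exists, empty otherwise.
    static Optional<String> findById(Map<Long, String> rooms, Long id) {
        return Optional.ofNullable(rooms.get(id));
    }

    // Transforms the value if present and falls back to a default instead of calling get() unguarded.
    static String roomNameOrDefault(Map<Long, String> rooms, Long id) {
        return findById(rooms, id)
                .map(String::toUpperCase)
                .orElse("UNKNOWN");
    }
}
```

Using map and orElse avoids the NoSuchElementException that Optional.get() throws on an empty Optional.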

Spring Data JPA will fit the query result to your desired container.
If you ask for a List<>, Spring will initialize a list, add every row to it and return it for you. Hence it will:
Return empty list if no items found
Return populated list if items found
When you ask for an Optional<>, Spring will understand that you want at most one row. It will be interpreted as getSingleResult() on javax.persistence.Query. Hence it will:
Return Optional.empty() if no items found
Return Optional.of(result) if exactly one match
Throw exceptions if there are more than one match (The one I remember is NonUniqueResultException)
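The three outcomes can be written down as a tiny contract. This is a sketch over a plain list of rows, with a hypothetical helper name and IllegalStateException standing in for whatever exception the framework actually raises.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper showing the "at most one row" contract of an Optional return type.
final class SingleResultSketch {

    static <T> Optional<T> atMostOne(List<T> rows) {
        if (rows.isEmpty()) {
            return Optional.empty();          // no items found
        }
        if (rows.size() > 1) {
            // Stand-in for the non-unique-result exception mentioned above.
            throw new IllegalStateException("expected at most one row, got " + rows.size());
        }
        return Optional.of(rows.get(0));      // exactly one match
    }
}
```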
In your case, you find by id. It's unique on your table so Optional<> should fit your purpose.
But note that your List<RoomEntity> findById(Long id); definition is correct and it won't give you compiler error (turn red). Have you imported the List interface?

The findById method is supposed to look for a single entity by its id. After all, ids are unique for every entity.
You can try to use findAllById, but I doubt it'll make much difference.
What Optional means is that there may or may not be a result. The isPresent method of Optional will indicate this.

Your findById should by definition always return 1 or 0 entities (according to the Spring Data method naming documentation), as your id is a unique key and there cannot be more than one entry in your repository with that key value. So Optional suits this situation perfectly, because it's either empty (no entry with such a key in the repository) or present with a specific value (there is an entry in the repository). If you want to query all entities by some non-unique key, let's say a name column, you can name your method findByName with a return type of Iterable<Entity>; when generating the implementation for your repository, Spring will then understand that there can be more than one entity in the result set.
The findById method is already predefined in the interface you are extending, so you couldn't change its return type anyway.
This also might be useful: https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#repositories.core-concepts

Spring Data JPA Distinct - Return results from a single column

I have some data that contains a STATE field (String/Text) that defines what state a given request is currently in (e.g. pending, approved, denied etc.), and to get all the unique values from that column I can run the following TSQL query
SELECT DISTINCT STATE FROM CALLOUT_REQUEST
where CALLOUT_REQUEST is my table name and STATE being the field which returns something like:
STATE
approved
denied
pending
...
However, I don't understand how I would turn that into a query in my repository, as it seems I need a "by" statement or some other filter mechanism on which I can get the STATE.
What I am looking to return - as shown in the raw TSQL query above - is some kind of List or Array object which contains all the unique/distinct values in all of the STATE fields.
So in pseudo code I think I am looking for something like this:
String[] states = repository.findDistinctState();
where findDistinctState() would then return an array of sorts.
Hope that makes sense - I am very new to Java and Spring in general so I think I am missing some conceptual knowledge to utilise the above.
UPDATE:
The 'state' concept is closed so I could implement that as an enum - the only problem is I don't know how to do that :) I'll look into how I can do that, as I think it fits perfectly with what I am trying to achieve.
The List I get from the query provided is intended to be used to get a count of all the occurrences. I had this code before to get a total count for each of the 'states':
Map stats = new HashMap();
String[] states = {"approved", "denied", "pending", "deactivated"};
for (int i = 0; i < states.length; i++) {
stats.put(states[i], repository.countByState(states[i]));
}
Am I correct in understanding that the states array in the above code snippet could be turned into an enum, and then I don't even need the custom @Query anymore?

If that state concept is closed - you know its possible set of values - it should be an enum.
After that you can create queries that you invoke like:
repository.findByState(State.APPROVED)
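A minimal sketch of the enum idea applied to the counting loop from the question. RequestState and StateStatsSketch are hypothetical names, and the countByState parameter stands in for a repository method reference such as repository::countByState. Mapping the enum onto the existing lowercase column values is left out here and would need attention in the entity mapping.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical enum for the closed set of states listed in the question.
enum RequestState { APPROVED, DENIED, PENDING, DEACTIVATED }

final class StateStatsSketch {

    // Collects one count per enum constant; countByState is a stand-in for the repository call.
    static Map<RequestState, Long> countPerState(ToLongFunction<RequestState> countByState) {
        Map<RequestState, Long> stats = new EnumMap<>(RequestState.class);
        for (RequestState state : RequestState.values()) {
            stats.put(state, countByState.applyAsLong(state));
        }
        return stats;
    }
}
```

Because RequestState.values() already lists every possible state, no query for the distinct values is needed for the statistics.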
If you can't create an enum, you need a separate method to get the distinct values, which can't be provided by JPA, because you need a list of strings and not a list of CalloutRequests.
Then you need to specify a query manually like:
@Query(value = "SELECT DISTINCT STATE FROM CALLOUT_REQUEST", nativeQuery = true)
List<String> findDistinctStates();

You can use a JPQL query for this, with the @org.springframework.data.jpa.repository.Query annotation:
@Query("select distinct c.state from CalloutRequest c")
List<String> findDistinctStates();

If you don't want to use @Query, one solution is to create an interface "StateOnlyInterface" with a method named "getState()".
Then create a method in your repo with a name such as getDistinctState(), whose return type is a List of StateOnlyInterface.

How to sort by multiple properties in Spring Data (JPA) derived queries?

I'm looking at the examples given on this page (https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#jpa.repositories) about method naming. Is it possible to create a complex chained method name such as
findByProgrammeAndDirectorAndProgDateBetweenOrderByProgDateStartTimeAsc
In the example they give, they are only doing an OrderBy on one value. In the example above ProgDate and StartTime would be two separate values.

The trick is to simply delimit the properties you want to sort by using the direction keywords Asc and Desc. So what you probably want in your query method is something like:
…OrderByProgDateAscStartTimeAsc
Note, how we conclude the first property definition by Asc and keep going with the next property.
Generally speaking, we recommend switching to @Query based queries once method names exceed a certain length or complexity. The main reason is that it's awkward for clients to call these very long methods. With @Query you get the full power of the query language plus a reasonably sized method name that can use higher-level language to express the intent of the query.
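For intuition, the ordering expressed by OrderByProgDateAscStartTimeAsc corresponds to a two-key comparator in plain Java: sort by the first property and use the second as a tie-breaker. The sketch below runs without Spring; the Programme class and its field types are assumptions, since the question does not show the entity.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class MultiSortSketch {

    // Hypothetical stand-in for the entity; only the two sort properties are modelled.
    static final class Programme {
        final LocalDate progDate;
        final LocalTime startTime;

        Programme(LocalDate progDate, LocalTime startTime) {
            this.progDate = progDate;
            this.startTime = startTime;
        }
    }

    // Date ascending first, start time ascending as the tie-breaker.
    static final Comparator<Programme> BY_DATE_THEN_START =
            Comparator.comparing((Programme p) -> p.progDate)
                    .thenComparing(p -> p.startTime);

    // Returns a sorted copy and leaves the input untouched.
    static List<Programme> sorted(List<Programme> input) {
        List<Programme> copy = new ArrayList<>(input);
        copy.sort(BY_DATE_THEN_START);
        return copy;
    }
}
```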

Here is another approach: a code snippet for a get operation that sorts by multiple columns.
List<Order> orders = new ArrayList<Order>();
Order StartTimeOrder = new Order(Sort.Direction.DESC, "StartTime");
orders.add(StartTimeOrder);
Order progDateOrder = new Order(Sort.Direction.ASC, "ProgDate");
orders.add(progDateOrder);
return repository.findAll(Sort.by(orders));

Yes, it should be possible:
Try this:
findByProgrammeAndDirectorAndProgDateBetweenOrderByProgDateStartTimeAsc(String programme, String director, Date progStart, Date progEnd);
I have not tested the code, but according to things I've already done, it should work.

A bit more compact :
return repository.findAll(
Sort.by(List.of(
new Order(Sort.Direction.DESC, "StartTime"),
new Order(Sort.Direction.ASC, "ProgDate")
))
);
or
return repository.findAll(
Sort
.by(Direction.DESC, "StartTime")
.and(Sort.by(Sort.Direction.ASC, "ProgDate"))
);
