Hibernate PostUpdateEvent getState and getOldState order of values - java

A little backstory. I am working on a Java project, using Spring Data, and I need to log all changes made to all entities, along with the type of change (event type, e.g. INSERT, UPDATE, DELETE), in MongoDB, in an automated way.
For this I am using Hibernate post-event listeners (PostInsertEventListener, PostUpdateEventListener and PostDeleteEventListener). This was all good, but now a change has been made to the original requirement, and I need to create a few more event types (for example LOGIN).
To create the LOGIN event without completely altering the existing code, I found that I can add a simple check to see whether the entity I'm processing is a User and whether the only property that changed is lastLogin.
if (entity instanceof User) {
    if (updateEvent.getDirtyProperties().length == 1
            && updateEvent.getDirtyProperties()[0] == 8) {
        history.setEventType(HistoryEvent.LOGIN);
    }
}
updateEvent is an instance of PostUpdateEvent, from the onPostUpdate method.
This is working fine, but my current implementation is not ideal. In Hibernate, getState() and getOldState() each return an Object[] that contains all the properties of the object being updated. getDirtyProperties() returns an array of indexes, pointing at only those properties whose values differ between the getState() and getOldState() arrays.
The problem I have is that the Object[] returned by getState() and getOldState() contains only the values of the properties, and I can't figure out what order they are in. For now I have just hardcoded the index, but this solution is not ideal, because if I add or remove a property from the User class the index also changes, and I have to find out what the new index is and update it.
My question is: what order are the properties in the Object[] in, or how can I change my code so that the value is not hardcoded? Is there a way to get property/value pairs instead of just an array of values?

So I actually found the answer on a Hibernate forum.
I'll leave the link to give credit to Vlad Mihalcea:
Answer here
To get the property names use this:
String[] properties = event.getPersister().getPropertyNames();
Then match the array indexes and you'll know what property has changed.
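Putting the two together, a minimal sketch of the index-free check (assuming the User property is mapped under the name "lastLogin", as in the question) could look like this:
if (entity instanceof User) {
    // Resolve the dirty property indexes to names instead of hardcoding index 8
    String[] propertyNames = updateEvent.getPersister().getPropertyNames();
    int[] dirtyProperties = updateEvent.getDirtyProperties();
    if (dirtyProperties.length == 1
            && "lastLogin".equals(propertyNames[dirtyProperties[0]])) {
        history.setEventType(HistoryEvent.LOGIN);
    }
}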

Related

How to make a Hibernate SearchSession return results with unique attributes?

I am working on using the Hibernate SearchSession class in Java to perform a search against a database. The code I currently have to search a table looks something like this:
SearchSession searchSession = Search.session(entityManagerFactory.unwrap(SessionFactory.class)
        .withOptions()
        .tenantIdentifier("locations")
        .openSession());
SearchResult<Location> result = searchSession.search(Location.class)
        .where(f -> f.bool()
                .must(f.match()
                        .field("locationName")
                        .matching(phrase).fuzzy()))
        .fetch(page * limit, limit);
This search works and properly returns results from the database, but there is no uniqueness constraint on the locationName column and the database holds multiple records with the same value in locationName. As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before? Applying a uniqueness constraint to the database table isn't an option in this scenario, and we were hoping there's a way to handle filtering out duplicate values in the session rather than taking the results from the search and removing duplicate values separately.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before?
Not really, at least not at the moment.
If you're using the Elasticsearch backend and are fine with going native, you can insert native JSON into the Elasticsearch request, in particular collapsing.
I think something like this might work:
SearchResult<Location> result = searchSession.search( Location.class )
        .extension( ElasticsearchExtension.get() )
        .where( f -> f.bool()
                .must( f.match()
                        .field( "locationName" )
                        .matching( phrase ).fuzzy() ) )
        .requestTransformer( context -> {
            JsonObject collapse = new JsonObject();
            collapse.addProperty( "field", "locationName_keyword" );
            JsonObject body = context.body();
            body.add( "collapse", collapse );
        } )
        // You probably need a sort, as well:
        .sort( f -> f.field( "id" ) )
        .fetch( page * limit, limit );
You will need to add a locationName_keyword field to your Location entity:
@Indexed
@Entity
public class Location {
    // ...
    @Id
    @GenericField(sortable = Sortable.YES) // Add this
    private Long id;
    // ...
    @FullTextField
    @KeywordField(name = "locationName_keyword", sortable = Sortable.YES) // Add this
    private String locationName;
    // ...
}
(You may need to also assign a custom normalizer to the locationName_keyword field, if the duplicate locations have a slightly different locationName (different case, ...))
Note however that the "total hit count" in the Search result will indicate the number of hits before collapsing. So if there's only one matching locationName, but 5 Location instances with that name, the total hit count will be 5, but users will only see one hit. They'll be confused for sure.
That being said, it might be worth having another look at your situation to determine whether collapsing is really necessary here:
As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
If you have multiple documents with the same locationName, then surely you have multiple rows in the database with the same locationName? Duplication doesn't appear spontaneously when indexing.
I would say the first thing to do would be to step back, and consider whether you really want to query the Location entity, or if another, related entity wouldn't make more sense. When two locations have the same name, do they have a relationship to another, common entity instance (e.g. of type Shop, ...)?
=> If so, you should probably query that entity type instead (.search(Shop.class)), and take advantage of #IndexedEmbedded to allow filtering based on Location properties (i.e. add #IndexedEmbedded to the location association in the Shop entity type, then use the field location.locationName when adding a predicate that should match the location name).
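As a rough sketch of that suggestion (the Shop entity and its location association are hypothetical, not from the original code):
@Indexed
@Entity
public class Shop {
    @Id
    private Long id;

    // Index Location's fields under the "location." prefix
    @IndexedEmbedded
    @ManyToOne
    private Location location;
    // ...
}

// Query Shop instead of Location, filtering on the embedded field:
SearchResult<Shop> result = searchSession.search( Shop.class )
        .where( f -> f.match()
                .field( "location.locationName" )
                .matching( phrase ).fuzzy() )
        .fetch( page * limit, limit );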
If there is no such related, common entity instance, then I would try to find out why locations are duplicated exactly, and more importantly why that duplication makes sense in the database, but not to users:
Are the users not interested in all the locations? Then maybe you should add another filter to your query (by "type", ...) that would help remove duplicates. If necessary, you could even run multiple search queries: first one with very strict filters, and if there are no hits, fall back to another one with less strict filters.
Are you using some kind of versioning or soft deletion? Then maybe you should avoid indexing soft-deleted entities or older versions; you can do that with conditional indexing or, if that doesn't work, with a filter in your search query.
If your data really is duplicated (legacy database, ...) without any way to pick a duplicate over another except by "just picking the first one", you could consider whether you need an aggregation instead of full-blown search. Are you just looking for the top location names, or maybe a count of locations by name? Then aggregations are the right tool.
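If you do go the aggregation route, a sketch with Hibernate Search's terms aggregation, counting Location documents by name, might look roughly like this (it assumes the locationName_keyword field above is additionally marked aggregable = Aggregable.YES in the mapping):
AggregationKey<Map<String, Long>> countsByName = AggregationKey.of( "countsByName" );
SearchResult<Location> result = searchSession.search( Location.class )
        .where( f -> f.matchAll() )
        .aggregation( countsByName, f -> f.terms()
                .field( "locationName_keyword", String.class ) )
        .fetch( 0 ); // no hits needed, only the aggregation
Map<String, Long> countByLocationName = result.aggregation( countsByName );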

How to use natural sort with Spring Data Jpa

I have a table column that I want to order on. The problem is that the column value contains both numbers and text. For example, the result is now ordered like this.
1. group one
10. group ten
11. group eleven
2. group two
But I'd like the result to be ordered naturally, like this
1. group one
2. group two
10. group ten
11. group eleven
When looking at the Spring configuration I can't seem to find an option that allows you to do this. I use the Spring Pageable class to set my order field and direction, with an additional use of JPA specifications. The method itself returns a Page with the first 20 results by default.
I don't believe that Oracle supports natural ordering out of the box, so I have a stored procedure for that. Running the query using this procedure, I get the desired result:
select ... from group order by NATURALSORT(group.name) asc
As you might expect, I'd like to use that procedure by wrapping it around every ordered column that contains text, while still using pageables/pages and specifications. The research I have done so far points me to a solution that might involve:
Either creating and implementing a custom Specification
Or extending the SimpleJpaRepository to change the way the Sort object is transformed.
But I didn't find a method that allows me to set the order using native SQL. The only way I found to set the order was by calling orderBy with an Order object.
So in general:
Is there a way to globally enable natural ordering when using Spring Data JPA, Hibernate and an Oracle database?
If not, how can I wrap a single order-by column with my stored procedure while still being able to use the Page findAll(Pageable, Specifications) method?
Thanks in advance.
After some digging into the source code of both Spring Data JPA and Hibernate, I managed to find a solution to my problem. I'm pretty sure this isn't a nice way to solve it, but it's the only one I could find.
I ended up implementing a wrapper for the 'order by' part of the query by extending the SingularAttributePath class. This class has a render method that generates the string which gets inserted into the actual query. My implementation looks like this:
@Override
public String render(RenderingContext renderingContext) {
    String render = super.render(renderingContext);
    render = "MYPACKAGE.NSORT(" + render + ")";
    return render;
}
Next I extended the Order conversion functionality in the SimpleJpaRepository class. By default this is done by calling QueryUtils.toOrders(sort, root, builder), but since the method calling it was impossible to override, I ended up calling the toOrders method myself and altering the result to my liking.
This means replacing all orders in the result with my custom implementation of the SingularAttributePath class. As an extra, I extended the Sort class used by the Pageable class to have control over what gets wrapped and what doesn't (called NaturalSort), but I'll get to that in a second. My implementation looks close to this (some checks are left out):
// Call the original method to convert the orders
List<Order> orders = QueryUtils.toOrders(sort, root, builder);
List<Order> resultList = new ArrayList<>();
for (Order order : orders) {
    // Fetch the original order object from the sort for comparing
    SingularAttributePath orderExpression = (SingularAttributePath) order.getExpression();
    Sort.Order originalOrder = sort.getOrderFor(orderExpression.getAttribute().getName());
    // Check if the original order object is instantiated from my custom order class
    // and if the order should be natural
    if (originalOrder instanceof NaturalSort.NaturalOrder && ((NaturalSort.NaturalOrder) originalOrder).isNatural()) {
        // Replace the order with the custom class
        Order newOrder = new OrderImpl(new NaturalSingularAttributePathImpl(builder,
                orderExpression.getJavaType(), orderExpression.getPathSource(), orderExpression.getAttribute()));
        resultList.add(newOrder);
    } else {
        resultList.add(order);
    }
}
return resultList;
The result list then gets added to the query by calling query.orderBy(resultList). That's it for the back-end.
In order to control the wrap condition, I also extended the Sort class used by the Pageable (mentioned a few lines back). The only functionality I wanted to add was to have four types in the Direction enum:
ASC (default ascending)
DESC (default descending)
NASC (natural ascending)
NDESC (natural descending)
The last two values only act as placeholders. They set the isNatural boolean (a variable of the extended Order class), which gets used in the condition above. At the time they are converted to a query, they are mapped back to their default variants:
public Direction getNativeDirection() {
    if (this == NaturalDirection.NASC)
        return Direction.ASC;
    if (this == NaturalDirection.NDESC)
        return Direction.DESC;
    return Direction.fromString(String.valueOf(this));
}
Lastly, I replaced the SortHandlerMethodArgumentResolver used by the PageableHandlerMethodArgumentResolver. The only thing this does is create instances of my NaturalSort class, instead of the default Sort class, and pass them into the Pageable object.
In the end I'm able to call the same REST endpoint, while the result differs in the way it's sorted.
Default Sorting
/api/v1/items?page=0&size=20&sort=name,asc
Natural Sorting
/api/v1/items?page=0&size=20&sort=name,nasc
I hope this solution can help those who have the same or a similar problem regarding natural sort and Spring Data JPA. If you have any questions or improvements, please let me know.
I am not aware of any such feature. You could, however, store the numbers in a separate column and then order by that column, which should also give better sorting performance as an additional benefit.
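A sketch of that idea, assuming a hypothetical sortNumber column that is kept in sync with the numeric prefix of the name whenever it is written:
@Entity
public class Group {
    @Id
    private Long id;

    private String name; // e.g. "10. group ten"

    private Integer sortNumber; // e.g. 10

    public void setName(String name) {
        this.name = name;
        // Extract the leading number once at write time instead of at query time
        this.sortNumber = Integer.parseInt(name.split("\\.", 2)[0].trim());
    }
}

// Sorting then becomes a plain column sort, fully compatible with Pageable:
Page<Group> page = repository.findAll(PageRequest.of(0, 20, Sort.by("sortNumber")));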

Spring Data JPA Distinct - Return results from a single column

I have some data that contains a STATE field (String/Text) that defines what state a given request is currently in (e.g. pending, approved, denied etc.). To get all the unique values from that column I can run the following T-SQL query:
SELECT DISTINCT STATE FROM CALLOUT_REQUEST
where CALLOUT_REQUEST is my table name and STATE is the field, which returns something like:
STATE
approved
denied
pending
...
However, I don't understand how I would turn that into a query in my repository, as it seems I need a "by" statement or some other filter mechanism by which I can get the STATE.
What I am looking to return - as shown in the raw T-SQL query above - is some kind of List or Array object which contains all the unique/distinct values from all of the STATE fields.
So in pseudocode I think I am looking for something like this:
String[] states = repository.findDistinctState();
where findDistinctState() would then return an array of sorts.
Hope that makes sense - I am very new to Java and Spring in general, so I think I am missing some conceptual knowledge to utilise the above.
UPDATE:
The 'state' concept is closed, so I could implement that as an enum - the only problem is I don't know how to do that :) I'll look into how I can do that, as I think it fits perfectly with what I am trying to achieve.
The List I get from the query provided is intended to be used to get a count of all the occurrences. I had this code before to get a total count for each of the 'states':
Map<String, Long> stats = new HashMap<>();
String[] states = {"approved", "denied", "pending", "deactivated"};
for (String state : states) {
    stats.put(state, repository.countByState(state));
}
Am I correct in understanding that the states array in the above code snippet could be turned into an enum, and that then I don't even need the custom @Query anymore?
If that state concept is closed - you know its possible set of values - it should be an enum.
After that you can create queries that you invoke like:
repository.findByState(State.APPROVED)
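Regarding the update in the question: yes, with an enum you no longer need the custom @Query. A minimal sketch (the enum and field names are assumptions based on the question):
public enum State { APPROVED, DENIED, PENDING, DEACTIVATED }

// In the CalloutRequest entity, store the enum as a string.
// NB: existing lowercase values ("approved", ...) would need a migration or an AttributeConverter.
@Enumerated(EnumType.STRING)
private State state;

// In the repository, a derived count query needs no @Query at all:
long countByState(State state);

// Building the stats map then becomes:
Map<State, Long> stats = new EnumMap<>(State.class);
for (State s : State.values()) {
    stats.put(s, repository.countByState(s));
}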
If you can't create an enum, you need a separate method to get the distinct values, which can't be provided by a derived query, because you need a list of Strings and not a list of CalloutRequests.
Then you need to specify a query manually (since this one uses the table and column names, it has to be a native query):
@Query(value = "SELECT DISTINCT STATE FROM CALLOUT_REQUEST", nativeQuery = true)
List<String> findDistinctStates();
You can use a JPQL query for this, with the @org.springframework.data.jpa.repository.Query annotation:
@Query("select distinct c.state from CalloutRequest c")
List<String> findDistinctStates();
If you don't want to use @Query, then one solution is to create an interface "StateOnlyInterface" with a method named "getState()".
Then create a method in your repo named getDistinctStateBy(), with its return type declared as a List of StateOnlyInterface.
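A sketch of that approach (the trailing "By" is needed so Spring Data can parse the method name; with a closed projection like this, the distinct clause should apply to just the projected state column):
// Closed interface projection exposing only the state column
public interface StateOnlyInterface {
    String getState();
}

// In the repository; "By" with nothing after it means "no filter criteria":
List<StateOnlyInterface> getDistinctStateBy();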

Return type of multi-valued Entity properties

I'm using App Engine's Datastore entities in my current project, and I have a multi-valued property for one of the entities. Now, my question is simple: if I store String objects as the values in the multi-valued property by passing a String ArrayList as the value in setProperty("myPropertyName", myArrayList) of my entity, what object will I receive when I run the following:
myEntity.getProperty("myPropertyName");
From my observation it doesn't seem to return an ArrayList, even though ArrayList is a Collection and, according to the documentation, getProperty() returns a Collection object.
The list of supported types can be found here: https://developers.google.com/appengine/docs/java/datastore/entities#Java_Properties_and_value_types.
Strongly consider using a JSON string as GAEfan suggested.
Edit:
According to the OP's comment below, you can store and get multiple values in the datastore as follows:
myEntity.setProperty("myPropertyName", myArrayListOfStrings);
// getProperty() returns Object, so a cast is required:
List<String> myValues = (List<String>) myEntity.getProperty("myPropertyName");

Efficiently finding duplicates in a constrained many-to-many dataset?

I have to write a bulk operation version of something our webapp lets you do on a more limited basis from the UI. The desired operation is to assign objects to a category. A category can have multiple objects, but a given object can only be in one category.
The workflow for the task is:
1) Using the browser, a file of the following form is uploaded:
# ObjectID, CategoryID
Oid1, Cid1
Oid2, Cid1
Oid3, Cid2
Oid4, Cid2
[etc.]
The file will most likely have tens to hundreds of lines, but definitely could have thousands of lines.
In an ideal world a given object id would only occur once in the file (reflecting the fact that an object can only be assigned to one category), but since the file is created outside of our control, there's no guarantee that's actually true, and the processing has to deal with that possibility.
2) The server will receive the file, parse it, pre-process it and show a page something like:
723 objects to be assigned to 126 categories
142 objects not found
42 categories not found
Do you want to continue?
[Yes] [No]
3) If the user clicks the Yes button, the server will actually do the work.
Since I don't want to parse the file in both steps (2) and (3), as part of (2) I need to build a container that will live across requests and hold a useful representation of the data: one that will let me easily provide the data to populate the "preview" page and will let me efficiently do the actual work. (While obviously we have sessions, we normally keep very little in-memory session state.)
There is an existing assignObjectsToCategory(Set<ObjectId> objectIds, CategoryId categoryId) function that is used when assignment is done through the UI. It is highly desirable for the bulk operation to also use this API, since it does a bunch of other business logic in addition to the simple assignment, and we need that same business logic to run when this bulk assign is done.
Initially it was going to be OK if the file "illegally" specified multiple categories for a given object -- it would be acceptable to assign the object arbitrarily to one of the categories the file associated it with.
So I was initially thinking that in step (2), as I went through the file, I would build up and put into the cross-request container a Map<CategoryId, Set<ObjectId>> (specifically a HashMap for quick lookup and insertion), and then when it was time to do the work I could just iterate on the map and for each CategoryId pull out the associated Set<ObjectId> and pass them into assignObjectsToCategory().
However, the requirement on how to handle duplicate ObjectIds changed. They are now to be handled as follows:
If an ObjectId appears multiple times in the file and all times is associated with the same CategoryId, assign the object to that category.
If an ObjectId appears multiple times in the file and is associated with different CategoryIds, consider that an error and make mention of it on the "preview" page.
That seems to mess up my Map<CategoryId, Set<ObjectId>> strategy, since it doesn't provide a good way to detect that the ObjectId I just read out of the file is already associated with a CategoryId. So my question is: how can I most efficiently detect and track these duplicate ObjectIds?
What came to mind is to use both "forward" and "reverse" maps:
public class CrossRequestContainer
{
    ...
    Map<CategoryId, Set<ObjectId>> objectsByCategory; // HashMap
    Map<ObjectId, List<CategoryId>> categoriesByObject; // HashMap
    Set<ObjectId> illegalDuplicates;
    ...
}
Then as each (ObjectId, CategoryId) pair was read in, it would get put into both maps. Once the file was completely read in, I could do:
for (Map.Entry<ObjectId, List<CategoryId>> entry : categoriesByObject.entrySet()) {
    List<CategoryId> categories = entry.getValue();
    if (categories.size() > 1) {
        ObjectId object = entry.getKey();
        if (!allCategoriesAreEqual(categories)) {
            illegalDuplicates.add(object);
            // Since this is an "illegal" duplicate I need to remove it
            // from every category that it appeared with in the file.
            for (CategoryId category : categories) {
                objectsByCategory.get(category).remove(object);
            }
        }
    }
}
When this loop finishes, objectsByCategory will no longer contain any "illegal" duplicates, and illegalDuplicates will contain all the "illegal" duplicates to be reported back as needed. I can then iterate over objectsByCategory, get the Set<ObjectId> for each category, and call assignObjectsToCategory() to do the assignments.
But while I think this will work, I'm worried about storing the data twice, especially when the input file is huge. And I'm also worried that I'm missing something regarding efficiency and that this will go very slowly.
Are there ways to do this that won't use double memory but can still run quickly? Am I missing something that, even with the double memory use, will still make this run a lot slower than I'm expecting?
Given the constraints you've described, I don't think there's a way to do this using much less memory.
One possible optimization, though, is to maintain lists of categories only for objects which are listed in multiple categories, and otherwise just map each object to its single category, i.e.:
Map<CategoryId, Set<ObjectId>> objectsByCategory; // HashMap
Map<ObjectId, CategoryId> categoryByObject; // HashMap
Map<ObjectId, Set<CategoryId>> illegalDuplicates; // HashMap
Yes, this adds yet another container, but it will (hopefully) contain only a few entries; also, the memory requirements of the categoryByObject map are reduced (cutting out one list overhead per entry).
The logic is a little more complicated, of course. When a duplicate is initially discovered, the object should be removed from the categoryByObject map and added to the illegalDuplicates map. And before adding any object into the categoryByObject map, you will need to first check the illegalDuplicates map.
Finally, it probably won't hurt performance to build the objectsByCategory map in a separate loop after building the other two maps, and it will simplify the code a bit.
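A sketch of that insertion logic (addPair is a hypothetical helper operating on the three maps declared above):
void addPair(ObjectId object, CategoryId category) {
    Set<CategoryId> illegal = illegalDuplicates.get(object);
    if (illegal != null) {
        // Already flagged as an illegal duplicate: just record the extra category
        illegal.add(category);
        return;
    }
    CategoryId existing = categoryByObject.get(object);
    if (existing == null || existing.equals(category)) {
        // First sighting, or a harmless repeat of the same (object, category) pair
        categoryByObject.put(object, category);
    } else {
        // Conflicting category: move the object to the illegalDuplicates map
        categoryByObject.remove(object);
        Set<CategoryId> conflicting = new HashSet<>();
        conflicting.add(existing);
        conflicting.add(category);
        illegalDuplicates.put(object, conflicting);
    }
}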
