I have a performance problem with a Hibernate implementation that is far too costly.
I will try to explain my current implementation, which must be improved upon, using pseudo classes.
Let's say I have the following POJO classes (the Entity classes are Hibernate-annotated "copies").
Country.java and CountryEntity.java
City.java and CityEntity.java
Inhabitant.java and InhabitantEntity.java
And I want to add a city to a country and save/persist it in the database; the new city arrives fully populated as a POJO.
Current code
CountryEntity countryEntity = CountryDao.fetch(someId);
Country country = CountryConverter(countryEntity);
country.getCities().add(newCity);
countryEntity = CountryEntityConverter(country);
CountryDao.save(countryEntity);
This results in a major performance problem. Let's say I have 200 cities with 10,000 inhabitants.
For me to add a new city, the converter will perform 200 x 10,000 = 2,000,000 InhabitantEntity --> Inhabitant --> InhabitantEntity conversions.
This puts a tremendous load on the server, as new cities are added often.
It also feels unnecessary to convert all cities in the country just to persist and connect another one.
I am thinking of creating a light converter that converts only the fields I need for some business logic during the addition of the city, leaving the rest unchanged, but I don't know whether Hibernate is good enough to handle that scenario.
For example, if I save an entity with a lot of null fields and a cities list containing only the one new city, can I tell Hibernate to merge this together with what is in the database?
Or is there a different approach I can take to solve the performance problem while keeping the POJOs and entities separate?
Some code below showing my current "slow" implementation code.
Country.java (pseudo code)
private fields
private List<City> cities;
City.java (pseudo code)
private fields
private List<Inhabitant> inhabitants;
Inhabitant.java (pseudo code)
private fields
Currently I fetch a CountryEntity through a DAO Java class.
Then I have converter classes (entity --> POJO) that set all fields and initialize all lists.
I also have similar converter classes converting the other way (POJO --> entity).
CountryConverter(countryEntity)
    Country country = new Country()
    country.setField(countryEntity.getField())
    loop through cityEntities
        country.getCities().add(CityConverter(cityEntity))
    return country

CityConverter(cityEntity)
    City city = new City()
    city.setField(cityEntity.getField())
    loop through inhabitantEntities
        city.getInhabitants().add(InhabitantConverter(inhabitantEntity))
    return city

InhabitantConverter(inhabitantEntity)
    Inhabitant inhabitant = new Inhabitant()
    inhabitant.setField(inhabitantEntity.getField())
    return inhabitant
Thanks in advance /Farmor
I suspect what might be happening is that you don't have an index column on the association, so Hibernate is deleting and then re-inserting the entire child collection, as opposed to adding discrete objects to, or deleting them from, the child association.
If that is what's going on, you could try adding an @IndexColumn annotation to the getter for the child association. That will then allow Hibernate to perform discrete inserts, updates, and deletes on association records, as opposed to having to delete and then re-insert. You would then be able to insert the new city and its new inhabitants without having to rebuild everything.
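As a rough sketch of what such a mapping might look like (the entity and column names here are assumptions based on the question, not real code; on JPA 2+ the standard equivalent of Hibernate's legacy @IndexColumn is @OrderColumn):

```java
// Illustrative mapping only -- names are assumptions, not taken from real code.
@Entity
public class CountryEntity {

    @Id
    private Long id;

    @OneToMany(cascade = CascadeType.ALL)
    @JoinColumn(name = "country_id")
    @OrderColumn(name = "city_index") // index column: lets Hibernate touch only the changed rows
    private List<CityEntity> cities;

    // other fields, getters and setters omitted
}
```

With the index column in place, adding one CityEntity to the list should translate into inserts for the new rows only, rather than a delete and re-insert of the whole collection.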
I am working on using the Hibernate SearchSession class in Java to perform a search against a database, the code I currently have to search a table looks something like this:
SearchSession searchSession = Search.session(entityManagerFactory.unwrap(SessionFactory.class).withOptions()
.tenantIdentifier("locations").openSession());
SearchResult<Location> result = searchSession.search(Location.class)
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching((phrase)).fuzzy())
).fetch(page * limit, limit);
This search works and properly returns results from the database, but there is no uniqueness constraint on the locationName column and the database holds multiple records with the same value in locationName. As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before? Applying a uniqueness constraint to the database table isn't an option in this scenario, and we were hoping there's a way to handle filtering out duplicate values in the session over taking the results from the search and removing duplicate values separately.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before?
Not really, at least not at the moment.
If you're using the Elasticsearch backend and are fine with going native, you can insert native JSON into the Elasticsearch request, in particular collapsing.
I think something like this might work:
SearchResult<Location> result = searchSession.search( Location.class )
.extension( ElasticsearchExtension.get() )
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching((phrase)).fuzzy())
)
.requestTransformer( context -> {
JsonObject collapse = new JsonObject();
collapse.addProperty("field", "locationName_keyword");
JsonObject body = context.body();
body.add( "collapse", collapse );
} )
// You probably need a sort, as well:
.sort(f -> f.field("id"))
.fetch( page * limit, limit );
You will need to add a locationName_keyword field to your Location entity:
@Indexed
@Entity
public class Location {
    // ...
    @Id
    @GenericField(sortable = Sortable.YES) // Add this
    private Long id;
    // ...
    @FullTextField
    @KeywordField(name = "locationName_keyword", sortable = Sortable.YES) // Add this
    private String locationName;
    // ...
}
(You may need to also assign a custom normalizer to the locationName_keyword field, if the duplicate locations have a slightly different locationName (different case, ...))
Note however that the "total hit count" in the Search result will indicate the number of hits before collapsing. So if there's only one matching locationName, but 5 Location instances with that name, the total hit count will be 5, but users will only see one hit. They'll be confused for sure.
That being said, it might be worth having another look at your situation to determine whether collapsing is really necessary here:
As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
If you have multiple documents with the same locationName, then surely you have multiple rows in the database with the same locationName? Duplication doesn't appear spontaneously when indexing.
I would say the first thing to do would be to step back, and consider whether you really want to query the Location entity, or if another, related entity wouldn't make more sense. When two locations have the same name, do they have a relationship to another, common entity instance (e.g. of type Shop, ...)?
=> If so, you should probably query that entity type instead (.search(Shop.class)), and take advantage of @IndexedEmbedded to allow filtering based on Location properties (i.e. add @IndexedEmbedded to the location association in the Shop entity type, then use the field location.locationName when adding a predicate that should match the location name).
If there is no such related, common entity instance, then I would try to find out why locations are duplicated exactly, and more importantly why that duplication makes sense in the database, but not to users:
Are the users not interested in all the locations? Then maybe you should add another filter to your query (by "type", ...) that would help remove duplicates. If necessary, you could even run multiple search queries: first one with very strict filters, and if there are no hits, fall back to another one with less strict filters.
Are you using some kind of versioning or soft deletion? Then maybe you should avoid indexing soft-deleted entities or older versions; you can do that with conditional indexing or, if that doesn't work, with a filter in your search query.
If your data really is duplicated (legacy database, ...) without any way to pick a duplicate over another except by "just picking the first one", you could consider whether you need an aggregation instead of full-blown search. Are you just looking for the top location names, or maybe a count of locations by name? Then aggregations are the right tool.
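If an aggregation does fit, a terms aggregation in Hibernate Search 6 might look roughly like this (a sketch only; it assumes the locationName_keyword keyword field from the mapping above):

```java
// Sketch: count Location documents per name instead of fetching the hits themselves.
AggregationKey<Map<String, Long>> countsByName = AggregationKey.of("countsByName");

SearchResult<Location> result = searchSession.search(Location.class)
        .where(f -> f.matchAll())
        .aggregate(countsByName, f -> f.terms()
                .field("locationName_keyword", String.class))
        .fetch(0); // hits are not needed, only the aggregation

Map<String, Long> countByName = result.aggregation(countsByName); // name -> document count
```

That gives you one entry per distinct name, sidestepping the duplicate-hit problem entirely for "top names"-style displays.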
I have the following entity:
//metadata...
public class Article {
    //properties...
    private Set<Field> fields;

    @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy = "field", orphanRemoval = true)
    public Set<Field> getFields() {
        return this.fields;
    }
}
My issue is that my service to get all Articles takes a lot of time, because each Article object has a list of 200 Field objects. This is my code:
//this service takes a lot of time, because it loads the Article object and its list of Field objects
listOfArticles = service.getArticles();
//loop through listOfArticles to construct a map of fields from the list
for (Article article: listOfArticles) {
//this service construct a map of fields for each Article
Map<String, String> mapFields = service.constructMap(article);
//...some code
}
My idea is to remove the association with the fields property in the Article entity, and to load all the Field objects (from the database) into a big Map (the map may contain 1M objects) when the application starts up.
Then, inside my loop, I will read the list of fields directly from the big Map instead of the database.
Will this do the trick and reduce the response time?
Is my idea a good solution to improve the performance?
Thanks in advance.
Loading the entire table is never a good idea. I will suggest some points to improve your performance:
Create pagination for your database results.
If possible, load just the Article and show its properties in the list or wherever you are presenting it.
Only when you go to an article, load the details (fields) for that specific article.
Put the fields you have loaded in a map, and when you try to access an article, first check whether it is already in memory; if not, go to the database.
Try to use lazy loading whenever you can to improve your system's performance. Remember that doing this will probably improve performance, but on the other hand you are using more memory; maybe you should consider using just the first 3 points.
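The "check memory first, then the database" idea from point 4 can be sketched with a plain map keyed by article id. FieldCache and the loader function here are hypothetical stand-ins for the real service/DAO from the question:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch: cache an article's fields on first access.
public class FieldCache {

    private final Map<Long, Map<String, String>> cache = new HashMap<>();
    private final Function<Long, Map<String, String>> loader;

    // loader stands in for the real DAO call that hits the database
    public FieldCache(Function<Long, Map<String, String>> loader) {
        this.loader = loader;
    }

    // Returns cached fields if present; otherwise loads them exactly once.
    public Map<String, String> fieldsFor(long articleId) {
        return cache.computeIfAbsent(articleId, loader);
    }
}
```

Each article's fields are then fetched at most once per application run, at the cost of keeping them in memory.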
A controller in my spring mvc app is giving an empty concepts collection for a DrugWord entity when there are DrugConcepts in the database for every DrugWord. How can I change my code so that it populates the concepts collection with the appropriate number of DrugConcept instances for each DrugWord instance?
Here is the JPA code that queries the database:
@SuppressWarnings("unchecked")
public DrugWord findDrugWord(String wrd) {
System.out.println("..... wrd is: "+wrd);
return (DrugWord) em.find(DrugWord.class, wrd);
}
Here is the code for the relevant controller method, which prints out 0 for the size of sel_word.getConcepts().size() when the size should be at least 1:
@RequestMapping(value = "/medications", method = RequestMethod.GET)
public String processFindForm(@RequestParam(value="wordId", required=false) String word, Patient patient, BindingResult result, Map<String, Object> model) {
Collection<DrugWord> results = this.clinicService.findDrugWordByName("");
System.out.println("........... word is: "+word);
if(word==null){word="abacavir";}
model.put("words", results);
DrugWord sel_word = this.clinicService.findDrugWord(word);
System.out.println(";;;; sel_word.concepts.size(), sel_word.getName() are: "+sel_word.getConcepts().size()+", "+sel_word.getName());
model.put("sel_word", sel_word);
return "medications/medsList";
}
Is the problem that I only have GET programmed? Would the problem be solved if I had a PUT method? If so, what would the PUT method need to look like?
NOTE: To keep this posting brief, I have uploaded some relevant code to a file sharing site. You can view the code by clicking on the following links:
The code for the DrugWord entity is at this link.
The code for the DrugConcept entity is at this link.
The code for the DrugAtom entity is at this link.
The code to create the underlying data tables in MySQL is at this link.
The code to populate the underlying data tables is at this link.
The data for one of the tables is at this link.
Some representative data from a second table is at this link.(This is just 10,000 records from the table, which has perhaps 100,000 rows.)
The data for the third table is at this link. (This is a big file, may take a few moments to load.)
The persistence xml file can be read at this link.
To help people visualize the underlying data, I am including a print screen of the top 2 results of queries showing data in the underlying tables as follows:
The problem seemed to be that the DB was corrupt, specifically that you had newline characters in every word, so the queries always returned an empty result. Besides that, there were some problems with very big graphs of entities being loaded from the DB, triggering a lot of SQL queries.
First of all, you can change the findDrugWord method to be like:
public DrugWord findDrugWord(String wrd) {
    return em.find(DrugWord.class, wrd);
}
Because word is the PK, and you've already set fetching when you put @ManyToMany there. I can imagine that the duplicate fetch definition confuses your JPA provider, but it won't help it, that's for sure. :)
Secondly, take a look at this line:
PropertyComparator.sort(sortedConcepts, new MutableSortDefinition("concept", true, true));
I can't see a concept attribute in your DrugConcept entity. Didn't you want to write rxcui?
But if you really want to have it sorted every time, add @OrderBy("rxcui ASC").
I wouldn't sort an entity's collection in place, especially without properly overridden hashCode and equals: you can't be sure how Spring sorts your collection with reflection in the background, which can lead to a lot of headaches.
Hope this helps ;)
I have a TestDTO class which holds the 2 input values from the user.
The next step is to fetch several values from the database; let's say I am fetching ten String values that are required to execute further business logic.
I want to know the best way to hold the data (in terms of memory usage and performance):
Add 10 more fields to the existing TestDTO class and set the database values at run time.
Use a java.util collection (List/Map/...).
Create another DTO/bean class for the 10 String values.
If you want modularity in your code, the 3rd option is better, but for simplicity you could use a HashMap, like:
Map<String, String> map = new HashMap<>();
map.put("string1", value);
.....
and so on.
This post can be useful for you : https://forums.oracle.com/thread/1153857
If TestDTO and the newly fetched values come from the same database table, then they should be in the same class. Otherwise, the new values should ideally go in another DTO. I do not know the exact scenario you have, so given these constraints, the 2nd option goes out the window, and the choice between options 1 and 3 will depend on your scenario. Always hold values from a single table in one object (preferably).
I need to query the database for different combinations of elements from an already received result object.
For instance, I get a list of Person entities, and for each person I need to get the list of addresses.
There are two ways to do it:
Iterate the Person entity and fire a query for each Person entity to get the list of Addresses for that person.
Build a query dynamically with elements from Person entity and fire ONE single query to pull all addresses lists for all Persons and then iterate the Person entity again and match the Address list for each Person.
I don't know how many Person entities I might get. So what is the better approach in terms of performance and practice?
So, if I have 100 Person entities, the first approach fires 100 queries, versus the 2nd approach with a huge query like the one below:
from address where (person.id = 1 and person.zip = 393)
or (person.id = 2 and person.zip = 123)
or (person.id = 3 and person.zip = 345)
.... // 100 times.
Which one is better? Any restrictions / limitation on Or conditions in Oracle?
Is there a better approach? Batch queries?
You can use Hibernate with eager loading to directly get the results you want, by loading each Person with the required restrictions so that the addresses come along. Or, if you want to stick with lazy loading, try using a join fetch between Person and Address, so that a single query returns the rows from which you can collect each person's address list.
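Whichever way the single query is fired, the "match the address list to each person" step of the second approach is just an in-memory grouping. A minimal sketch, with a hypothetical Address record that only carries the owning person's id:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class AddressGrouper {

    // Hypothetical minimal Address: just the owning person's id and a label.
    public record Address(long personId, String street) {}

    // Groups one flat result list into per-person lists, replacing the
    // per-person queries of the first approach with a single pass.
    public static Map<Long, List<Address>> byPerson(List<Address> addresses) {
        return addresses.stream()
                .collect(Collectors.groupingBy(Address::personId));
    }
}
```

One query for all addresses plus this grouping is usually far cheaper than 100 round trips, and it avoids building a giant OR chain by hand.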