A controller in my Spring MVC app returns an empty concepts collection for a DrugWord entity, even though there are DrugConcepts in the database for every DrugWord. How can I change my code so that the concepts collection is populated with the appropriate DrugConcept instances for each DrugWord instance?
Here is the JPA code that queries the database:
@SuppressWarnings("unchecked")
public DrugWord findDrugWord(String wrd) {
    System.out.println("..... wrd is: " + wrd);
    return (DrugWord) em.find(DrugWord.class, wrd);
}
Here is the code for the relevant controller method, which prints out 0 for the size of sel_word.getConcepts().size() when the size should be at least 1:
@RequestMapping(value = "/medications", method = RequestMethod.GET)
public String processFindForm(@RequestParam(value = "wordId", required = false) String word, Patient patient, BindingResult result, Map<String, Object> model) {
Collection<DrugWord> results = this.clinicService.findDrugWordByName("");
System.out.println("........... word is: "+word);
if(word==null){word="abacavir";}
model.put("words", results);
DrugWord sel_word = this.clinicService.findDrugWord(word);
System.out.println(";;;; sel_word.concepts.size(), sel_word.getName() are: "+sel_word.getConcepts().size()+", "+sel_word.getName());
model.put("sel_word", sel_word);
return "medications/medsList";
}
Is the problem that I only have GET programmed? Would the problem be solved if I had a PUT method? If so, what would the PUT method need to look like?
NOTE: To keep this posting brief, I have uploaded some relevant code to a file sharing site. You can view the code by clicking on the following links:
The code for the DrugWord entity is at this link.
The code for the DrugConcept entity is at this link.
The code for the DrugAtom entity is at this link.
The code to create the underlying data tables in MySQL is at this link.
The code to populate the underlying data tables is at this link.
The data for one of the tables is at this link.
Some representative data from a second table is at this link. (This is just 10,000 records from the table, which has perhaps 100,000 rows.)
The data for the third table is at this link. (This is a big file, may take a few moments to load.)
The persistence xml file can be read at this link.
To help people visualize the underlying data, I am including a print screen of the top 2 results of queries showing data in the underlying tables as follows:
The problem seemed to be that the DB was corrupt; specifically, you had newline characters in every word, so the queries always returned an empty result. In addition, very large graphs of entities were being loaded from the DB, triggering a lot of SQL queries.
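The effect of those stray newline characters can be reproduced with a plain string comparison; in this sketch, "abacavir\n" is a hypothetical example of a corrupted key, not a value taken from the actual database:

```java
public class NewlineKeyDemo {
    public static void main(String[] args) {
        // A trailing newline in the stored primary key makes an
        // exact-match lookup (like em.find on the PK) come back empty.
        String stored = "abacavir\n";  // hypothetical corrupted DB value
        String queried = "abacavir";   // what the controller passes in

        System.out.println(stored.equals(queried));        // false
        System.out.println(stored.trim().equals(queried)); // true after cleanup
    }
}
```

This is why cleaning the data (or trimming keys on the way in) fixes the "empty concepts" symptom even though the mapping itself is fine.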
First of all, you can simplify the findDrugWord method to:
public DrugWord findDrugWord(String wrd) {
    return em.find(DrugWord.class, wrd);
}
Because word is the PK, and you've already configured fetching when you put @ManyToMany there. I can imagine that the duplicate fetch definition confuses your JPA provider; it certainly doesn't help. :)
Secondly, take a look at this line:
PropertyComparator.sort(sortedConcepts, new MutableSortDefinition("concept", true, true));
I can't see a concept attribute in your DrugConcept entity. Didn't you mean to write rxcui?
But if you really want it sorted every time, add @OrderBy("rxcui ASC") to the association.
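As a mapping sketch only (field names and fetch settings are assumptions based on the question, not your actual entity), @OrderBy would sit on the association like this:

```java
@Entity
public class DrugWord {

    @Id
    private String word;

    // Hypothetical association; join details depend on your real mapping.
    // @OrderBy makes the provider return the collection sorted on load,
    // so no in-memory sorting of the entity's collection is needed.
    @ManyToMany(fetch = FetchType.EAGER)
    @OrderBy("rxcui ASC")
    private List<DrugConcept> concepts;

    // getters and setters omitted
}
```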
I wouldn't sort an entity's collection in place, especially without properly overridden hashCode and equals: you can't be sure how Spring sorts your collection via reflection in the background, which can lead to a lot of headaches.
Hope this helps ;)
Related
I am working on using the Hibernate SearchSession class in Java to perform a search against a database, the code I currently have to search a table looks something like this:
SearchSession searchSession = Search.session(entityManagerFactory.unwrap(SessionFactory.class).withOptions()
.tenantIdentifier("locations").openSession());
SearchResult<Location> result = searchSession.search(Location.class)
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching(phrase).fuzzy())
).fetch(page * limit, limit);
This search works and properly returns results from the database, but there is no uniqueness constraint on the locationName column and the database holds multiple records with the same value in locationName. As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before? Applying a uniqueness constraint to the database table isn't an option in this scenario, and we were hoping there's a way to handle filtering out duplicate values in the session over taking the results from the search and removing duplicate values separately.
Is there a way to make a SearchSession only return a result if another result with an identical value (such as locationName) has not been returned before?
Not really, at least not at the moment.
If you're using the Elasticsearch backend and are fine with going native, you can insert native JSON into the Elasticsearch request, in particular collapsing.
I think something like this might work:
SearchResult<Location> result = searchSession.search( Location.class )
.extension( ElasticsearchExtension.get() )
.where( f -> f.bool()
.must( f.match()
.field("locationName")
.matching(phrase).fuzzy())
)
.requestTransformer( context -> {
JsonObject collapse = new JsonObject();
collapse.addProperty("field", "locationName_keyword");
JsonObject body = context.body();
body.add( "collapse", collapse );
} )
// You probably need a sort, as well:
.sort(f -> f.field("id"))
.fetch( page * limit, limit );
You will need to add a locationName_keyword field to your Location entity:
@Indexed
@Entity
public class Location {
    // ...
    @Id
    @GenericField(sortable = Sortable.YES) // Add this
    private Long id;
    // ...
    @FullTextField
    @KeywordField(name = "locationName_keyword", sortable = Sortable.YES) // Add this
    private String locationName;
    // ...
}
(You may need to also assign a custom normalizer to the locationName_keyword field, if the duplicate locations have a slightly different locationName (different case, ...))
Note however that the "total hit count" in the Search result will indicate the number of hits before collapsing. So if there's only one matching locationName, but 5 Location instances with that name, the total hit count will be 5, but users will only see one hit. They'll be confused for sure.
That being said, it might be worth having another look at your situation to determine whether collapsing is really necessary here:
As a result, when we try to display them on the UI of the application it looks like there are duplicate values, even though they're unique in the database.
If you have multiple documents with the same locationName, then surely you have multiple rows in the database with the same locationName? Duplication doesn't appear spontaneously when indexing.
I would say the first thing to do would be to step back, and consider whether you really want to query the Location entity, or if another, related entity wouldn't make more sense. When two locations have the same name, do they have a relationship to another, common entity instance (e.g. of type Shop, ...)?
=> If so, you should probably query that entity type instead (.search(Shop.class)), and take advantage of @IndexedEmbedded to allow filtering based on Location properties (i.e. add @IndexedEmbedded to the location association in the Shop entity type, then use the field location.locationName when adding a predicate that should match the location name).
If there is no such related, common entity instance, then I would try to find out why locations are duplicated exactly, and more importantly why that duplication makes sense in the database, but not to users:
Are the users not interested in all the locations? Then maybe you should add another filter to your query (by "type", ...) that would help remove duplicates. If necessary, you could even run multiple search queries: first one with very strict filters, and if there are no hits, fall back to another one with less strict filters.
Are you using some kind of versioning or soft deletion? Then maybe you should avoid indexing soft-deleted entities or older versions; you can do that with conditional indexing or, if that doesn't work, with a filter in your search query.
If your data really is duplicated (legacy database, ...) without any way to pick a duplicate over another except by "just picking the first one", you could consider whether you need an aggregation instead of full-blown search. Are you just looking for the top location names, or maybe a count of locations by name? Then aggregations are the right tool.
As the documentation regarding these topics seems to be limited (and I searched a lot; either I searched in the wrong places or the documentation really is limited), I would like to ask the question here.
So far I have only found documentation that shows how to implement the basic CRUD operations (insert, delete, update, deleteAll, getAll) with the Android Architecture Components, but never queries that return only a single item. In general, the question is: is the idea to hold all information in the repository by keeping LiveData of all table contents and returning single objects from the repository?
Let me precise the question in two cases:
One table / entity
Two tables / entities with a 1:many relationship
One table
I fully get the concept of using LiveData for a single table, which I can use in a RecyclerView to list all table rows. But in many cases I only need one row/object to work with, for example when editing one item.
Question 1: Is it common to implement a method on the repository that picks the needed item out of the LiveData<List<Object>> allObjects, like below? Of course I could also pass all the information from the previous activity to my editActivity through the intent, but I find it easier to just pass the ID of an object and load it in the editActivity.
private LiveData<List<Object>> allObjects;

public Object getObjectById(int id) {
    for (Object o : allObjects.getValue()) {
        if (o.getId() == id) {
            return o;
        }
    }
    return null;
}
Two tables
I also get the idea of having two entities and defining their relation in a separate class. Let's use a common example from the documentation: school with students (1:m).
Question 2: Is it common to hold LiveData<List> in my repository?
In my recycler view, I could use this list to display (for example) all schools with their number of students. Therefore I might not need the LiveData allSchools anymore.
Question 3: What is the best way to implement a query that returns a student together with the school he attends? I could implement a (relationship) class StudentWithSchool and keep LiveData of it in my repository.
private LiveData<List<StudentWithSchool>> allStudentsWithSchool;

public StudentWithSchool getStudentWithSchoolByStudentId(int id) {
    for (StudentWithSchool s : allStudentsWithSchool.getValue()) {
        if (s.getId() == id) {
            return s;
        }
    }
    return null;
}
It would be really helpful if somebody can explain to me how to implement the above examples correctly. Thank you, guys!
I have a TestDTO class which holds the two input values from the user.
The next step is to fetch several values from the database; let's say I am fetching ten String values that are required to execute further business logic.
I wanted to know the best way to hold that data (in terms of memory usage and performance):
Add 10 more fields in the existing TestDTO class and set database values at run time
Use java.util.collection (List/Map/..)
Create another DTO/Bean class for 10 String values
If you want modularity in your code, the 3rd option is better, but for simplicity you could use a HashMap, like:
Map<String, String> map = new HashMap<>();
map.put("string1", value);
.....
and so on.
This post can be useful for you : https://forums.oracle.com/thread/1153857
If TestDTO and the newly fetched values come from the same table in the database, then they should be in the same class; otherwise the new values should ideally go into another DTO. I do not know your exact scenario, so given these constraints the 2nd option goes out of the window, and the choice between options 1 and 3 will depend on your scenario. Preferably, always hold values from a single table in one object.
I need help in the following design. I expose a web method that does an SQL select from a database. The problem is that the number of records can be huge and I don't want to return all of the records in a single call.
So I can think of these options (to return the result in pages):
1) Provide a method with parameters so that the client requests recordStart and recordEnd each time.
2) Modify the method to accept a result-set size X and somehow understand that each request is not a new one but a continuation, returning the next X records. To figure this out, some kind of token could be associated with each client, but the problem is that I am not sure how long this token should be kept before being disposed of, so that an incoming request can be treated either as a first request or as a continuation of a previous one.
So which design should I go for and how would I solve any relevant problems I mention?
Are there better ways to deal with these problems?
From my point of view:
each request may be encapsulated in a request object that basically holds an offset and a page size;
each response may be encapsulated in a response object that basically holds a result list and a total; alternatively, you could also hold the request object used to build the response.
Your interface for performing the selection on the database will look similar to:
public PageResponse getPage(PageRequest pageRequest);
This approach makes it easy to extend your paging method. Imagine that in a few months you need to add sorting to it: without this approach, you would have to change every invocation. With it, you just extend the PageRequest object and provide a default sort; nothing breaks, and you can customize the sort only in the invocations that really need it.
Within this method you will need two different database selections:
one to retrieve the page of results (the list held by the response and accessed through the resultList property); this can be done using each database's specific feature for limiting the result set (TOP for Sybase, LIMIT for MySQL and PostgreSQL, ROWNUM for Oracle; this varies from one database to another);
another to get the total number of selected records without paging, needed in order to page through big data sets.
A good reference for your problem would be Spring Data, they have Page and PageRequest that is more or less what you need. Maybe you could use their API to implement your solution.
Practically, your request object could look like:
public class PageRequest {
private int offset;
private int pageSize;
// getters and setters and convenience constructors with the given fields
}
public class PageResponse {
private List<?> resultList;
private int total;
// getters and setters and convenience constructors with the given fields
}
Of course you could play a bit with generics too, so that the response holds the type you requested, facilitating use of the response object, like:
public <T> PageResponse<T> getPage(PageRequest<T> pageRequest);
having the objects for Request and Response like:
public class PageRequest<T> {
private int offset;
private int pageSize;
// getters and setters and convenience constructors with the given fields
}
public class PageResponse<T> {
private List<T> resultList;
private int total;
// getters and setters and convenience constructors with the given fields
}
The better and most widely used approach is 1).
In this case there is no coupling between the SQL procedure and the client code (just like programming against interfaces in OOP).
It is also simpler and less error-prone than looking up a hash to find the token so you can fetch the current position. You would also have to remember this hash somewhere (memory for speed, or disk for durability).
As you can see, the second method is complicated, and if you think it through, it forces a lot more decisions than your first method.
Also, the first method gives you more freedom in designing the application because it behaves like OOP encapsulation. In the future, you could also add another piece of code that remembers the current position for you: the client would call this code, and this code would call the database, passing the results to the client.
The bottom line, go with 1).
If you want to return the total number of results, I would go for a combination of both options:
New query (no identifier given) :
First do a row count of the full result set (select count(*) from ...) without the order by clause.
Here you could return an error if the resultset is really too big [optional]
Get the page data (ordering is mandatory, either defined by you or the user of your api)
Generate an identifier and persist it with the result count [and optionally the query]
Return page data, total result count and identifier of the query
Next queries (identifier given) :
Get the persisted result count
Get the page data (ordering is mandatory, either defined by you or the user of your api)
Return page data, total result count and identifier of the query
The best would be to persist the query (without paging parts) along with the identifier, this way you don't even have to create it another time with next calls.
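That identifier bookkeeping can be sketched roughly as follows (the class and method names are hypothetical, and as noted above a real version would also need an expiry/cleanup policy and would ideally persist the query alongside the count):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Stores the expensive count(*) result under a generated identifier so
// follow-up page requests can reuse it instead of recounting.
class QueryTokenRegistry {
    private final Map<String, Integer> totalsByToken = new ConcurrentHashMap<>();

    // First request (no identifier given): persist the freshly computed
    // total and hand back the new identifier.
    String register(int totalCount) {
        String token = UUID.randomUUID().toString();
        totalsByToken.put(token, totalCount);
        return token;
    }

    // Next requests (identifier given): reuse the persisted total.
    // A null result means the token is unknown or has expired.
    Integer lookupTotal(String token) {
        return totalsByToken.get(token);
    }
}
```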
I have a performance problem with a Hibernate implementation that is far too costly.
I will try to explain my current implementation, which must be improved, using pseudo classes.
Let's say I have the following POJO classes (the Entity classes are Hibernate-annotated "copies").
Country.java and CountryEntity.java
City.java and CityEntity.java
Inhabitant.java and InhabitantEntity.java
And I want to add a city to a country and save/persist it in the database, the new city arrives fully populated as a POJO.
Current code
CountryEntity countryEntity = CountryDao.fetch(someId);
Country country = CountryConverter(countryEntity);
country.getCities().add(newCity);
countryEntity = CountryEntityConverter(country);
CountryDao.save(countryEntity);
This results in a major performance problem. Let's say I have 200 cities, each with 10,000 inhabitants.
To add a single new city, the converters will run 200 x 10,000 = 2,000,000 inhabitantEntity --> inhabitant --> inhabitantEntity conversions.
This puts a tremendous load on the server, as new cities are added often.
It also feels unnecessary to convert all cities in the country just to persist and connect another one.
I am thinking of creating a light converter that doesn't convert all the fields, only the ones I need for some business logic during the addition of the city; the rest would be left unchanged. I don't know if Hibernate is good enough to handle this scenario.
For example, if I save an entity with a lot of null fields and a cities list containing only the one new city, can I tell Hibernate to merge this with the DB?
Or is there a different approach I can take to solve the performance problem while keeping the POJOs and entities separate?
Some code below shows my current "slow" implementation.
Country.Java (pseudo code)
private fields
private List<City> cities;
City.Java (pseudo code)
private fields
private List<Inhabitant> inhabitants;
Inhabitant.Java (pseudo code)
private fields
Currently I fetch a CountryEntity through a Dao java class.
Then I have converter classes (entities --> POJOs) that set all fields and initialize all lists.
I also have similar converter classes for the other direction (POJOs --> entities).
CountryConverter(countryEntity)
    Country country = new Country()
    country.setField(countryEntity.getField())
    loop through cityEntities
        country.getCities().add(CityConverter(cityEntity))
    return country
CityConverter(cityEntity)
    City city = new City()
    city.setField(cityEntity.getField())
    loop through inhabitantEntities
        city.getInhabitants().add(InhabitantConverter(inhabitantEntity))
    return city
InhabitantConverter(inhabitantEntity)
    Inhabitant inhabitant = new Inhabitant()
    inhabitant.setField(inhabitantEntity.getField())
    return inhabitant
Thanks in advance /Farmor
I suspect what might be happening is that you don't have an index column on the association, so Hibernate is deleting and then inserting the child collection, as opposed to just adding to or deleting discrete objects to and from the child association.
If that is what's going on, you could try adding an #IndexColumn annotation to the get method for the child association. That will then allow Hibernate to perform discrete inserts, updates, and deletes on association records, as opposed to having to delete and then re-insert. You would then be able to insert the new city and its new inhabitants without having to rebuild everything.
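In the pseudo model from the question, that might look something like the mapping sketch below (column names are assumptions; note also that on current Hibernate versions the standard JPA @OrderColumn plays the same role as the legacy Hibernate-specific @IndexColumn):

```java
@Entity
public class CountryEntity {

    @Id
    private Long id;

    // With an index column on the association, Hibernate knows each
    // element's position, so adding one city produces a discrete INSERT
    // instead of deleting and re-inserting the whole collection.
    @OneToMany(cascade = CascadeType.ALL)
    @JoinColumn(name = "country_id")      // assumed FK column name
    @IndexColumn(name = "city_position")  // legacy; @OrderColumn in JPA 2+
    private List<CityEntity> cities;

    // getters and setters omitted
}
```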