Use pagination on query on multiple indices in hibernate search

Use pagination on query on multiple indices in hibernate search - java

we are implementing a global search that we can use to query all entities in our application, let's say cars, books and movies. Those entities do not share common fields, cars have a manufacturer, books have an author and movies have a director, for example.
In a global search field I'd like to search for all my entities with one query to be able to use pagination. Two aproaches come to my mind when thinking about how to solve this:
Query one index after another and manually merge the result. This means that I have to implement pagination myself.
Add common fields to each item like name and creator (or create an interface as shown here Single return type in Hibernate Search).In this case I can only search for fields in my global search that I map to the common fields.
My question is now: Is there a third (better) way? How would you suggest to implement such a global search on multiple indices?

Query one index after another and manually merge the result. This means that I have to implement pagination myself.
I definitely wouldn't do that, as this will perform very poorly, especially with deep pagination (page 40, etc.).
Add common fields to each item like name and creator (or create an interface as shown here Single return type in Hibernate Search).In this case I can only search for fields in my global search that I map to the common fields.
That's the way. You don't even need a common interface since you can just target multiple fields in the same predicate. The common interface would only help to target all relevant types: you can call .search(MyInterface.class) instead of .search(Arrays.asList(Car.class, Book.class, Movie.class)).
You can still apply predicates to fields that are specific to each type; it's just that fields that appear in more than one type must be consistent (same type, etc.). Also, obviously, if you require that the "manufacturer" (and no other field) matches "james", Books and Movies won't match anymore, since they don't have a manufacturer.
Have you tried it? For example, this should work just fine as long as manufacturer, author and director are all text fields with the same analyzer:
SearchResult<Object> result = searchSession.search( Arrays.asList(
Car.class, Book.class, Movie.class
) )
.where( f -> f.simpleQueryString()
.fields( "manufacturer", "author", "director" )
.matching( "james" ) )
.fetch( 20 );
List<Object> hits = result.hits(); // Result is a mix of Car, Book and Movie.

One approach would be to create a SQL view (SearchEntry?) that combines all of the tables you want to search. This allows you to alias your different column names. It won't be very good for performance but you could also just create one big field that is a concatenation of different searchable fields. Finally, include a "type" field that you tie back to your entity.
Now you can query everything in one go and use the type/id to tie back to a specific entity that the "search" data was initially pulled from.

Related

DynamoDB query on sub field of JSON Object

Is it possible to search for a subfield of a json object in dynamoDB table?
My table:
Item: "item name",
Location: {...},
ItemInformation :
{
ItemName: "itemName",
ProductLine: {
Brand: "Razer",
ManufacturerSource: "Razer"
}
Originally in this table ItemInformation would be a key and searching for an object we would construct the json for the item information and then query with the json string as a key.. Now we need to implement searching by sub fields of that object, which can contain different fields each time, i.e. isDigital: "true".
I notice in the question:
DynamoDB advanced scan - JAVA
The answer would seem to be no and I would have to separate out the fields. But I am curious about why and how the PHP library can query for sub fields on a JSON object in dynamoDB. Is there really no better solution then to store the column as separate fields and then add an index on all fields?

After looking through documentation it is not feasible to implement the search fields as I originally intended. The problem is that while the values are JSON they are stored as string literals so I have to do refactoring to start storing as JSON objects. Additionally I cannot add in columns and index because the search could operate on any number of fields and different items can have different fields, i.e. an Item can have Brand, BatteryInformation, Name. Given that the requirement is that any of these subfields should be searchable its better to do this in Cloud Search or ElasticSearch where I can index and search on arbitrary fields and values within a column of an object.
Since this is a DynamoDB table, I am going to use CloudSearch since it offers easier indexing option and integration for data.

At the moment, the only solution available is to store the column as separate fields. Probably, aws may come up with a solution in future releases.

DDD valueObject and database schema

To end 2014 year I got a simple question I think.
I would like to use "DDD" a bit more, and I'm currently trying to experiment various usecases to learn more about DDD.
My current usecase is the following :
we have a new database schema that is using a classic pattern in our company : modeling our nomenclature table as "id / code / label". I think it's a pretty classic case when using hibernate for example.
But in the OO world things get "complciated" for something this simple when using a API like JDBC or QueryDSL. I need to fetch an object by its code, retrieve its id or load the full object and then set it as a one to one relation in another object.
I wondering :
this kind of nomenclature can be an enum (or a class with String cosnatnts depending on the developer). in DDD terms, it is my ValueObject
the id /code / label in the database is not i18n friendly (it's not a prerequisite) so I don't see its advantages. Except when the table can be updated dynamically and the usecase is "pick something in a combobox loaded from this table and build a relation with another object : but that's all because if you have business rules that must be applied you need to know the new code etc etc).
My questions are :
do you often use the id / ocde / label pattern in your database model.
how do your model your nomenclature data ? (country is perhaps not the best example :) but no matter what how do you model it ? without thinking much I would say database table for country; but for some status : "valid, waiting validation, rejected" ?
do you model your valueObjects using this pattern ?
or do you use lots of enum and only store their toString (or ordinal) in the database ?
In the Java OO objects world, I'm currently thinking that it is easier to manipulate enum that objects loaded from the database. I need to build repositories to load them for example. And it will be so simple to use them as enums. I'm searching some recomfort here or perhaps am I missing something so obvious ?
thanks
see you in 2015 !
Update 1 :
We can create a "Budget" and the first one is mark as Initial and the next ones are marked as "Corrective" (with a increment). For example, we can have a list of Budgets :"Initial Budget", "Corrective budget #1", "Corrective budget #2".
For this we have this database design : a Budget Table, a Version Budge with a foreign key between the two. the Version budget only contains an ID, a CODE and a LABEL.
Personnaly, I would like to remove this table. I don't see the advantages of this structure. And from the OO perspective, when I'm creating a budget I can query the databse to see if I need to create an Inital or Corrective budget (using a count query) then I can set the right enum to my new budget. But with the current design I need to query the database using the CODE that I want, select the ID and set the ID. So yes, it's really database oriented. Where is the DDD part ? a ValueObject is something that describe, quantify something. In my case seems good to me. A Version describe the current status of my Budget. I can comapre two versions just but checking their code, they don't have lifecycle (I don't want this one in particular).
How to you handle this type of usecases ?
It's only a simple example because I found that if you ask a database admin he would surely said that all seems good : using primary key, modeling relations, enforing constraints, using foreign key and avoid data duplication.
Thanks again Mike and Doctor for their comments.

I will hook in in your country example. In most cases, country will be a value object. There is nothing that will reference a country entity and that should know that if the values of the country changes it is still the same country. In fact, the country could be represented as an enum, and some nasty resource lookup functions that translate the Iso3 into a usefull display text. What we do is, we define it as a value object class with iso3, displayname and some other static information. Now out of this value object we define a kind of "power enum" (I still miss a standard term here). The class implementing the country value object gets a private constructor and static properties for each of its values (for each country) and explicit cast operators from and to int. Now you can treat it just like a normal enum of your programing language. The advantage to a normal enum beside having more property fields is, that it also can have methods (of course query methods, that don't change the state of the object). You can even use polymorphism (some countries with different behaviour than others). You could also load the content of the enums from a database table (without the statics then and a static lookupByIso3 method instead).
This you could make with some other "enum like" value objects, too. Imagine Currencies (it could have conversion methods that are implemented polymorphic). The handling of the daily exchange rates is a different topic though.
If the set of values is not fixed (for example another value object candidate like postal adress) then it is not a value object enum, but a standard value object that could be instantiated with the values you want.
To decide if you can live with something as a value object, you can use the following question: Do you want copy semantic, or reference semantic? If you ever change a property of the object, should all places where you used it update, too, or should they stay as they are? If the latter, than the "changed" object is a new and different value object. Another question would be, if you need to track changes to an object realizing that it remains the "same" despite of changing values. And if you have a value object, where you only want specific instances to exist, it is a kind of enum described above.
Does that somehow help you?

Best practice design pattern for defining "types" in a database with potential multi language requirement?

My question more specificity is this:
I want users on multiple front ends to see the "Type" of a database row. Let's say for ease that I have a person table and the types can be Student, Teacher, Parent etc.
The specific program would be java with hibernate, however I doubt that's important for the question, but let's say my data is modelled in to Entity beans and a Person "type" field is an enum that contains my 3 options, ideally I want my Person object to have a getType() method that my front end can use to display the type, and also I need a way for my front end to know the potential types.
With the enum method I have this functionality but what I don't have is the ability to easily add new types without re-compiling.
So next thought is that I put my types in to a config file and simply story them in the database as strings. my getType() method works, but now my front end has to load a config file to get the potential types AND now there's nothing to keep them in sync, I could remove a type from my config file and the type in the database would point to nothing. I don't like this either.
Final thought is that I create a PersonTypes database table, this table has a number for type_id and a string defining the type. This is OK, and if the foreign key is set up I can't delete types that I'm using, my front end will need to get sight of potential types, I guess the best way is to provide a service that will use the hibernate layer to do this.
The problem with this method is that my types are all in English in the database, and I want my application to support multiple languages (eventually) so I need some sort of properties file to store the labels for the types. so do I have a PersonType table the purely contains integers and then a properties file that describes the label per integer? That seems backwards?
Is there a common design pattern to achieve this kind of behaviour? Or can anyone suggest a good way to do this?
Regards,
Glen x

I would go with the last approach that you have described. Having the type information in separate table should be good enought and it will let you use all the benefits of SQL for managing additional constraints (types will be probably Unique and foreign keys checks will assure you that you won't introduce any misbehaviour while you delete some records).
When each type will have i18n value defined in property files, then you are safe. If the type is removed - this value will not be used. If you want, you can change properties files as runtime.
The last approach I can think of would be to store i18n strings along with type information in PersonType. This is acceptable for small amount of languages, altough might be concidered an antipattern. But it would allow you having such method:
public String getName(PersonType type, Locale loc) {
if (loc.equals(Locale.EN)) {
return type.getEnglishName();
} else if (loc.equals(Locale.DE)){
return type.getGermanName();
} else {
return type.getDefaultName();
}
}

Internationalizing dynamic values is always difficult. Your last method for storing the types is the right one.
If you want to be able to i18n them, you can use resource bundles as properties files in your app. This forces you to modify the properties files and redeploy and restart the app each time a new type is added. You can also fall back to the English string stored in database if the type is not found in the resource bundle.
Or you can implement a custom ResourceBundle class that fetches its keys and values from the database directly, and have an additional PersonTypeI18n table which contains the translations for all the locales you want to support.

You can use following practices:
Use singleton design pattern
Use cashing framework such as EhCashe for cashe type of person and reload when need.

Extending JPA entity data at runtime

I need to allow client users to extend the data contained by a JPA entity at runtime. In other words I need to add a virtual column to the entity table at runtime. This virtual column will only be applicable to certain data rows and there could possibly be quite a few of these virtual columns. As such I don't want to create an actual additional column in the database, but rather I want to make use of additional entities that represent these virtual columns.
As an example, consider the following situation. I have a Company entity which has a field labelled Owner, which contains a reference to the Owner of the Company. At runtime a client user decides that all Companies that belong to a specific Owner should have the extra field labelled ContactDetails.
My preliminary design uses two additional entities to accomplish this. The first basically represents the virtual column and contains information such as the field name and type of value expected. The other represents the actual data and connects an entity row to a virtual column. For example, the first entity might contain the data "ContactDetails" while the second entity contains say "555-5555."
Is this the right way to go about doing this? Is there a better alternative? Also, what would be the easiest way to automatically load this data when the original entity is loaded? I want my DAO call to return the entity together with its extensions.
EDIT: I changed the example from a field labelled Type which could be a Partner or a Customer to the present version as it was confusing.

Perhaps a simpler alternative could be to add a CLOB column to each Company and store the extensions as an XML. There is a different set of tradeoffs here compared to your solution but as long as the extra data doesn't need to be SQL accessible (no indexes, fkeys and so on) it will probably be simple than what you do now.
It also means that if you have some fancy logic regarding the extra data you would need to implement it differently. For example if you need a list of all possible extension types you would have to maintain it separately. Or if you need searching capabilities (find customer by phone number) you will require lucene or similar solution.
I can elaborate more if you are interested.
EDIT:
To enable searching you would want something like lucene which is a great engine for doing free text search on arbitrary data. There is also hibernate-search which integrates lucene directly with hibernate using annotations and such - I haven't used it but I heard good things about it.
For fetching/writing/accessing data you are basically dealing with XML so any XML technique should apply. The best approach really depends on the actual content and how it is going to be used. I would suggest looking into XPath for data access, and maybe look into defining your own hibernate usertype so that all the access is encapsulated into a class and not just plain String.

I've run into more problems than I hoped I would and as such I decided to dumb down the requirements for my first iteration. I'm currently trying to allow such Extensions only on the entire Company entity, in other words, I'm dropping the whole Owner requirement. So the problem could be rephrased as "How can I add virtual columns (entries in another entity that act like an additional column) to an entity at runtime?"
My current implementation is as follow (irrelevant parts filtered out):
#Entity
class Company {
// The set of Extension definitions, for example "Location"
#Transient
public Set<Extension> getExtensions { .. }
// The actual entry, for example "Atlanta"
#OneToMany(fetch = FetchType.EAGER)
#JoinColumn(name = "companyId")
public Set<ExtensionEntry> getExtensionEntries { .. }
}
#Entity
class Extension {
public String getLabel() { .. }
public ValueType getValueType() { .. } // String, Boolean, Date, etc.
}
#Entity
class ExtensionEntry {
#ManyToOne(fetch = FetchType.EAGER)
#JoinColumn(name = "extensionId")
public Extension getExtension() { .. }
#ManyToOne(fetch = FetchType.LAZY)
#JoinColumn(name = "companyId", insertable = false, updatable = false)
public Company getCompany() { .. }
public String getValueAsString() { .. }
}
The implementation as is allows me to load a Company entity and Hibernate will ensure that all its ExtensionEntries are also loaded and that I can access the Extensions corresponding to those ExtensionEntries. In other words, if I wanted to, for example, display this additional information on a web page, I could access all of the required information as follow:
Company company = findCompany();
for (ExtensionEntry extensionEntry : company.getExtensionEntries()) {
String label = extensionEntry.getExtension().getLabel();
String value = extensionEntry.getValueAsString();
}
There are a number of problems with this, however. Firstly, when using FetchType.EAGER with an #OneToMany, Hibernate uses an outer join and as such will return duplicate Companies (one for each ExtensionEntry). This can be solved by using Criteria.DISTINCT_ROOT_ENTITY, but that in turn will cause errors in my pagination and as such is an unacceptable answer. The alternative is to change the FetchType to LAZY, but that means that I will always "manually" have to load ExtensionEntries. As far as I understand, if, for example, I loaded a List of 100 Companies, I'd have to loop over and query each of those, generating a 100 SQL statements which isn't acceptable performance-wise.
The other problem which I have is that ideally I'd like to load all the Extensions whenever a Company is loaded. With that I mean that I'd like that #Transient getter named getExtensions() to return all the Extensions for any Company. The problem here is that there is no foreign key relation between Company and Extension, as Extension isn't applicable to any single Company instance, but rather to all of them. Currently I can get past that with code like I present below, but this will not work when accessing referenced entities (if for example I have an entity Employee which has a reference to Company, the Company which I retrieve through employee.getCompany() won't have the Extensions loaded):
List<Company> companies = findAllCompanies();
List<Extension> extensions = findAllExtensions();
for (Company company : companies) {
// Extensions are the same for all Companies, but I need them client side
company.setExtensions(extensions);
}
So that's were I'm at currently, and I have no idea how to proceed in order to get past these problems. I'm thinking that my entire design might be flawed, but I'm unsure of how else to try and approach it.
Any and all ideas and suggestions are welcome!

The example with Company, Partner, and Customer is actually good application for polymorphism which is supported by means of inheritance with JPA: you will have one the following 3 strategies to choose from: single table, table per class, and joined. Your description sounds more like joined strategy but not necessarily.
You may also consider just one-to-one( or zero) relationship instead. Then you will need to have such relationship for each value of your virtual column since its values represent different entities. Hence, you'll have a relationship with Partner entity and another relationship with Customer entity and either, both or none can be null.

Use pattern decorator and hide your entity inside decoratorClass bye

Using EAV pattern is IMHO bad choice, because of performance problems and problems with reporting (many joins). Digging for solution I've found something else here: http://www.infoq.com/articles/hibernate-custom-fields

Does Hibernate's Criteria API still not support nested relations

I'd like to use Hibernate's Criteria API for precisely what everybody says is probably its most likely use case, applying complex search criteria. Problem is, the table that I want to query against is not composed entirely of primitive values, but partially from other objects, and I need to query against those object's id's.
I found this article from 2 years ago that suggests it's not possible. Here's how I tried it to no avail, there are other aspect of Hibernate where I know of where this sort of dot notation is supported within string literals to indicate object nesting.
if (!lookupBean.getCompanyInput().equals("")) {
criteria.add(Restrictions.like("company.company", lookupBean.getCompanyInput() + "%"));
}
EDIT:
Here's my correctly factored code for accomplishing what I was trying above, using the suggestion from the first answer below; note that I am even using an additional createCriteria call to order on an attribute in yet another associated object/table:
if (!lookupBean.getCompanyValue().equals("")) {
criteria.createCriteria("company").add(
Restrictions.like("company", lookupBean.getCompanyValue() + "%"));
}
List<TrailerDetail> tdList =
criteria.createCriteria("location").addOrder(Order.asc("location")).list();

Not entirely sure I follow your example, but it's certainly possible to specify filter conditions on an associated entity, simply by nesting Criteria objects to form a tree. For example, if I have an entity called Order with a many-to-one relationship to a User entity, I can find all orders for a user named Fred with a query like this:
List<Order> orders = session.createCriteria(Order.class)
.createCriteria("user")
.add(eq("name", "fred"))
.list();
If you're talking about an entity that has a relationship to itself, that should work as well. You can also replace "name" with "id" if you need to filter on the ID of an associated object.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.