I have a Product entity and table and would like the database design to allow finding a product by keywords in addition to its name, much like using a thesaurus. For example, the product named "HDR-TD20V" should also be found by the keywords "camcorder", "camera", "video camera", etc. Note that the same mechanism can be used to locate the same record from different input languages, e.g. searching for "camara de video" (Spanish) or "videokamera" (German) should also find the same record.
Assuming that I am using Hibernate Search, i.e. Lucene, I have the following two design choices:
1. De-normalized approach: the Product table has a keywords column that contains comma-separated keywords for that product. This clearly violates the First Normal Form ("... the value of each attribute contains only a single value from that domain."). However, it would integrate nicely with Hibernate Search.
2. Normalized approach: define a Keyword entity table, i.e. Keyword(id,keyword,languageId), and the many-to-many association ProductKeyword(productId,keywordId). The integration with Hibernate Search is then no longer so intuitive, unless e.g. I create a materialized view, i.e. select * from Product p, Keyword k, ProductKeyword pk where p.id=pk.productId and k.id=pk.keywordId, and index this materialized view.
I would of course prefer choice 2, but I am not sure how Hibernate Search would optimally cover this use case.
Something like this should work:
@Indexed
public class Product {
    @Id
    private long id;
    @ManyToMany
    @IndexedEmbedded
    Set<Keyword> keywords;
    // ...
}
public class Keyword {
    @Id
    private long id;
    // only needed if you want a bidirectional relation
    @ManyToMany
    @ContainedIn
    Set<Product> products;
    // ...
}
I am leaving out options for lazy loading etc. What exactly the JPA mapping looks like depends on the use case.
I'm designing a solution for dealing with a complex structure (user-related data with lots of relations) in a simpler and possibly more efficient way than fetching all the related data from the DB. The only data I really need in my use case is contained in the non-relational fields of the 'main' entity.
So far I have extracted the basic fields from the 'main' class (call it OldMain) into another class (an abstract class Extracted), annotated it with @MappedSuperclass, and created two classes that extend it: Basic (an empty class, since Extracted gives it all the data I need, mapped to table 'X') and Extended (also mapped to table 'X', but with all the extra relations). It basically works, but the code structure looks odd and makes me wonder whether this is a proper way of dealing with such a problem.
I also tried lazy initialization on the relational fields (which I initially guessed would serve well here), but I wasn't able to get it to work as I wanted with Jackson (only the non-lazy fields in the JSON, without fetching the lazy related data): either it couldn't be serialized, or it fired several dozen additional relation queries.
Another thing I stumbled upon in a tutorial was building a DTO from the 'OldMain' entity so as not to touch the lazy fields, but I haven't tried it yet, as I started with the @MappedSuperclass approach.
@Table(name = "X")
@MappedSuperclass
public abstract class Extracted {
    // all the non-relational fields from OldMain
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;
    private String name;
    private String surname;
    private String userName;
    private String email;
}
@Table(name = "X")
@Entity
public class Basic extends Extracted {
    // empty
}
@Table(name = "X")
@Entity
public class Extended extends Extracted {
    // all relational fields from OldMain, no data fields
}
Also, the general question: are there any good practices for when you need only a subset of a bigger entity?
There is no obligation for a JPA entity to map all the columns of the corresponding database table. That is, given a table my_entity with columns col1, col2, col3, the entity mapped to this table could map only col1 and col2 and ignore col3. Given that, and the fact that you only need the non-relational attributes, you could use your Extracted class directly with the attributes you need and ignore the fact that other relational fields exist. Furthermore, if all the relational fields are nullable, you would even be able to persist new instances of the Extracted class. And Jackson would only (un)marshal the attributes declared in Extracted.
Otherwise, I suggest continuing with the approach you already have and defining new entity classes that extend your Extracted class with the required attributes. I don't see how the "code structure looks odd", other than having a Basic class that adds no attributes to Extracted. You could easily make Extracted non-abstract, use it directly, and get rid of Basic.
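The DTO idea mentioned in the question can be sketched in plain Java as follows. This is only an illustration under stated assumptions: OldMain here is a stand-in without JPA annotations, and BasicDto, its from method and the relations placeholder field are hypothetical names that do not appear in the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the full entity; in the real application this would be the JPA-mapped OldMain.
class OldMain {
    final Integer id;
    final String name;
    final String surname;
    final String userName;
    final String email;
    // Placeholder for the lazily loaded relations; the mapper below never reads it.
    final List<Object> relations = new ArrayList<>();

    OldMain(Integer id, String name, String surname, String userName, String email) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.userName = userName;
        this.email = email;
    }
}

// Hypothetical DTO holding only the non-relational fields that should be serialized.
class BasicDto {
    final Integer id;
    final String name;
    final String surname;
    final String userName;
    final String email;

    private BasicDto(Integer id, String name, String surname, String userName, String email) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.userName = userName;
        this.email = email;
    }

    // Copies the scalar fields only, so no lazy relation is ever touched.
    static BasicDto from(OldMain entity) {
        return new BasicDto(entity.id, entity.name, entity.surname, entity.userName, entity.email);
    }
}

class DtoDemo {
    public static void main(String[] args) {
        OldMain entity = new OldMain(1, "Ann", "Lee", "alee", "ann@example.com");
        BasicDto dto = BasicDto.from(entity);
        System.out.println(dto.userName + " <" + dto.email + ">");
    }
}
```

Because the DTO has no reference back to the entity, a serializer that walks its fields cannot trigger lazy loading. The cost is one extra class and a mapping step per entity.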
I have following three entity classes.
@Entity
public class User {
    @Id
    @Column(nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
}
@Entity
public class LanguageProficiencyLevel {
    @Id
    @Column(nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String name; // A1, A2, B1 ... etc.
}
@Entity
public class Language {
    @Id
    @Column(nullable = false)
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    private String name; // English, Chinese, etc.
}
Currently in the database, I have around 20 languages saved in Language table and 6 language proficiency levels A1, A2, B1, B2, C1, C2 saved in LanguageProficiencyLevel table.
Now I have the following relationship among the entity classes.
A User can know more than one language, each at one proficiency level, and a language at a given proficiency level is known by many users.
So, for example, a user may know English at proficiency C1, and the same user may also know Spanish at proficiency B1.
I understand that User and Language have a many-to-many relation. But I don't understand how to relate LanguageProficiencyLevel to User or Language.
Also, how should I save this in the database? My idea is to make one join table (LanguageSkill) with the columns user_id, language_id and languageProficiencyLevel_id, and a row will be inserted into this table when a user is created. I am not sure if this is the way to implement it. Please give me an idea of how to do this and what the configuration should be.
User and Language will have many to many relation as you said. And, Language and LanguageProficiencyLevel will have one to one relation.
So, you have to create a mapping table that has a many-to-one relation to each of the 3 tables.
Refer to this link to create a mapping table with multiple columns.
Your relations between entity objects would be like:
A user can have multiple languages and a language can have multiple users. So it's a many-to-many relationship between User and Language.
A language can have only one proficiency at a time but a proficiency can have multiple languages. So LanguageProficiency to Language would be a one-to-many relation.
Relation between user and language proficiency is also many-to-many.
Here is a link how you can go about your database design for many-to-many relations.
How to implement a many-to-many relationship in PostgreSQL?
After creating the database design you can probably use a reverse engineering tool (https://www.javacodegeeks.com/2013/10/step-by-step-auto-code-generation-for-pojo-domain-java-classes-and-hbm-using-eclipse-hibernate-plugin.html) to create the Hibernate POJO classes. I would recommend using a tool for this rather than writing the classes by hand, to avoid unnecessary issues.
So now if you look carefully... your Entity classes of User, Language and LanguageProficiency would be something like this.
Hope this is useful.
You should absolutely create another table LanguageSkill, like you said.
Language and Proficiency are so-called base data - they will have comparatively few entries and will be independent of users. Neither of them should be mapped into User.
A User should then have a @OneToMany relation to LanguageSkill, which represents his knowledge of a particular language. LanguageSkill has a @ManyToOne to both Language and Proficiency (and User).
Skipping LanguageSkill would result in data duplication in your schema, or at least in a schema that is hard to read with all the join tables.
Also, it would mix concerns - data that is relatively stable (Language, Proficiency) and data that will change often (a person's knowledge of a language).
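A minimal plain-Java sketch of the object graph described above might look like this. The JPA annotations are shown only as comments so that the sketch compiles without a JPA provider on the classpath. The addSkill helper and the field names are assumptions for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Base data: few rows, independent of users.
class Language {
    final String name;
    Language(String name) { this.name = name; }
}

// Base data: A1, A2, B1, B2, C1, C2.
class Proficiency {
    final String name;
    Proficiency(String name) { this.name = name; }
}

// Association entity: one row per (user, language) pair, carrying the proficiency.
// With JPA this would be an @Entity with a @ManyToOne on each of the three references.
class LanguageSkill {
    final User user;
    final Language language;
    final Proficiency proficiency;

    LanguageSkill(User user, Language language, Proficiency proficiency) {
        this.user = user;
        this.language = language;
        this.proficiency = proficiency;
    }
}

class User {
    // With JPA: @OneToMany(mappedBy = "user")
    final List<LanguageSkill> skills = new ArrayList<>();

    // Hypothetical helper that keeps both sides of the relation consistent.
    LanguageSkill addSkill(Language language, Proficiency proficiency) {
        LanguageSkill skill = new LanguageSkill(this, language, proficiency);
        skills.add(skill);
        return skill;
    }
}

class LanguageSkillDemo {
    public static void main(String[] args) {
        User user = new User();
        user.addSkill(new Language("English"), new Proficiency("C1"));
        user.addSkill(new Language("Spanish"), new Proficiency("B1"));
        for (LanguageSkill s : user.skills) {
            System.out.println(s.language.name + ": " + s.proficiency.name);
        }
    }
}
```

Note that neither Language nor Proficiency holds a reference to User, which matches the point that base data stays independent of users.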
In my app I have different Users and Items, so each user can pick many items.
In the tutorial I learned about the @ManyToMany annotation.
@Entity
public class Item extends Model {
    ...
    @ManyToMany(cascade = CascadeType.REMOVE)
    public List<User> users = new ArrayList<User>();
But a second option I can think of is to define a separate class for the User-to-Item relation so that I can add additional information such as date and time.
@Entity
public class ItemUserRel extends Model {
    @Id
    public Long id;
    public User user;
    public Item item;
    // additional information
    public Date date;
    ...
Which of the two options is the better design, and why?
I faced a similar issue a while ago. I also had to deal with a model User and the model Group. My requirements were:
A user can have n readable and n writable Groups. These permissions must be stored in a third table (not in the User table and not in the Group table), along with additional properties like authorisedBy and authorisedOn. So @ManyToMany did not work for me, because I had no real control over it. The additional properties also make it hard to map via JPA.
Perhaps other designs are possible, but I (still) think that introducing a new class UserGroup is best. This class has a @ManyToOne relation to a single User.
I ended up defining these three models:
User
Group - General information about the group model
UserGroup - Containing additional fields like: permissions, authorisedBy, authorisedOn etc.
On my User model, I would have getter getUserGroups() but also getPersonalGroup() which is basically one (personal) instance of Group in getUserGroups() but where the createdBy and authorisedBy is the same user.
I found this design much more maintainable and clearer. It also helped me create a comfortable user interface where the administrator can manage and change permissions for UserGroups.
Perhaps more useful information
Mapping many-to-many association table with extra column(s)
How Do I Create Many to Many Hibernate Mapping for Additional Property from the Join Table?
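The three-model design from the answer above can be sketched in plain Java roughly as follows. This is an illustration under assumptions, not the answerer's actual code: the Permission enum, the grant helper and the createdBy field on Group are hypothetical, and the JPA annotations are omitted so the sketch compiles on its own.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical permission type; the original only says "readable" and "writable".
enum Permission { READ, WRITE }

class Group {
    final String name;
    final User createdBy;

    Group(String name, User createdBy) {
        this.name = name;
        this.createdBy = createdBy;
    }
}

// Association class carrying the extra columns that a plain @ManyToMany cannot hold.
class UserGroup {
    final User user;
    final Group group;
    final Permission permission;
    final User authorisedBy;
    final LocalDate authorisedOn;

    UserGroup(User user, Group group, Permission permission, User authorisedBy, LocalDate authorisedOn) {
        this.user = user;
        this.group = group;
        this.permission = permission;
        this.authorisedBy = authorisedBy;
        this.authorisedOn = authorisedOn;
    }
}

class User {
    final String name;
    private final List<UserGroup> userGroups = new ArrayList<>();

    User(String name) { this.name = name; }

    List<UserGroup> getUserGroups() { return userGroups; }

    // Hypothetical helper that records who authorised the membership and when.
    void grant(Group group, Permission permission, User authorisedBy, LocalDate authorisedOn) {
        userGroups.add(new UserGroup(this, group, permission, authorisedBy, authorisedOn));
    }

    // The personal group is the one this user both created and authorised.
    Optional<Group> getPersonalGroup() {
        return userGroups.stream()
                .filter(ug -> ug.group.createdBy == this && ug.authorisedBy == this)
                .map(ug -> ug.group)
                .findFirst();
    }
}
```

The same shape would apply to the Item/User question: an ItemUserRel class holding the user, the item and the date plays the role of UserGroup here.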
I'm currently coming (back) up to speed with EJB and while I was away it changed drastically (so far for the better). However, I've come across a concept that I am struggling with and would like to understand better as it seems to be used in our (where I work, not me and all the voices in my head) code quite a bit.
Here's the example I've found in a book. It's part of an example showing how to use the @EmbeddedId annotation:
@Entity
public class Employee implements java.io.Serializable
{
    @EmbeddedId
    @AttributeOverrides({
        @AttributeOverride(name="lastName", column=@Column(name="LAST_NAME")),
        @AttributeOverride(name="ssn", column=@Column(name="SSN"))
    })
    private EmbeddedEmployeePK pk;
    ...
}
The EmbeddedEmployeePK class is a fairly straightforward @Embeddable class that defines a pair of @Columns: lastName and ssn.
Oh, and I lifted this example from O'Reilly's Enterprise JavaBeans 3.1 by Rubinger & Burke.
Thanks in advance for any help you can give me.
It's saying that the attributes that make up the embedded id may have predefined (through explicit or implicit mappings) column names. By using the @AttributeOverride you're saying "ignore what other information you have with regard to what column it is stored in, and use the one I specify here".
In the UserDetails class, I have overridden homeAddress and officeAddress, both of type Address.
This single Address POJO serves two tables in the DB.
DB:
Table1         Table2
STREET_NAME    HOME_STREET_NAME
CITY_NAME      HOME_CITY_NAME
STATE_NAME     HOME_STATE_NAME
PIN_CODE       HOME_PIN_CODE
The EmbeddedEmployeePK class is a fairly straightforward @Embeddable class that defines a pair of @Columns: lastName and ssn.
Not quite - EmbeddedEmployeePK defines a pair of properties, which are then mapped to columns. The @AttributeOverride annotations allow you to override the columns to which the embedded class's properties are mapped.
The use case for this is when the embeddable class is used in different situations where its column names differ, and some mechanism is required to let you change those column mappings. For example, say you have an entity which contains two separate instances of the same embeddable - they can't both map to the same column names.
JPA tries to map field names to column names in a datasource, so what you're seeing here is the translation between the name of a field variable to the name of a column in a database.
You can override also other column properties (not just names).
Let's assume that you want to change the length of SSN based on who is embedding your component. You can define an @AttributeOverride for the column like this:
@AttributeOverrides({
    @AttributeOverride(name = "ssn", column = @Column(name = "SSN", length = 11))
})
private EmbeddedEmployeePK pk;
See "2.2.2.4. Embedded objects (aka components)" in the Hibernate Annotations documentation.
To preserve other @Column properties (such as name and nullable), repeat them on the overriding column exactly as you specified them on the original column.
I need to allow client users to extend the data contained by a JPA entity at runtime. In other words I need to add a virtual column to the entity table at runtime. This virtual column will only be applicable to certain data rows and there could possibly be quite a few of these virtual columns. As such I don't want to create an actual additional column in the database, but rather I want to make use of additional entities that represent these virtual columns.
As an example, consider the following situation. I have a Company entity which has a field labelled Owner, which contains a reference to the Owner of the Company. At runtime a client user decides that all Companies that belong to a specific Owner should have the extra field labelled ContactDetails.
My preliminary design uses two additional entities to accomplish this. The first basically represents the virtual column and contains information such as the field name and type of value expected. The other represents the actual data and connects an entity row to a virtual column. For example, the first entity might contain the data "ContactDetails" while the second entity contains say "555-5555."
Is this the right way to go about doing this? Is there a better alternative? Also, what would be the easiest way to automatically load this data when the original entity is loaded? I want my DAO call to return the entity together with its extensions.
EDIT: I changed the example from a field labelled Type which could be a Partner or a Customer to the present version as it was confusing.
Perhaps a simpler alternative could be to add a CLOB column to each Company and store the extensions as XML. There is a different set of tradeoffs here compared to your solution, but as long as the extra data doesn't need to be SQL accessible (no indexes, foreign keys and so on) it will probably be simpler than what you do now.
It also means that if you have some fancy logic regarding the extra data, you would need to implement it differently. For example, if you need a list of all possible extension types, you would have to maintain it separately. Or if you need searching capabilities (find customer by phone number), you will require Lucene or a similar solution.
I can elaborate more if you are interested.
EDIT:
To enable searching you would want something like lucene which is a great engine for doing free text search on arbitrary data. There is also hibernate-search which integrates lucene directly with hibernate using annotations and such - I haven't used it but I heard good things about it.
For fetching/writing/accessing data you are basically dealing with XML so any XML technique should apply. The best approach really depends on the actual content and how it is going to be used. I would suggest looking into XPath for data access, and maybe look into defining your own hibernate usertype so that all the access is encapsulated into a class and not just plain String.
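As a rough sketch of the XPath suggestion, the class below wraps the XML text that would be stored in the CLOB column and reads one extension value with the JDK's javax.xml.xpath API. The XML layout (an extensions root with field elements carrying a name attribute) and the CompanyExtensions class are assumptions made for this example, not a format prescribed by the answer.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical wrapper around the XML stored in the Company's CLOB column.
class CompanyExtensions {
    private final String xml;

    CompanyExtensions(String xml) {
        this.xml = xml;
    }

    // Returns the text of the named extension field, or an empty string if it is absent.
    // Assumes fieldName is trusted and contains no quote characters.
    String get(String fieldName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expression = "/extensions/field[@name='" + fieldName + "']";
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not read extension " + fieldName, e);
        }
    }
}

class CompanyExtensionsDemo {
    public static void main(String[] args) {
        String clob = "<extensions>"
                + "<field name=\"ContactDetails\">555-5555</field>"
                + "</extensions>";
        CompanyExtensions extensions = new CompanyExtensions(clob);
        System.out.println(extensions.get("ContactDetails"));
    }
}
```

A custom Hibernate user type, as suggested above, could hand out an object like this instead of a plain String, so that callers never parse the XML themselves.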
I've run into more problems than I hoped I would and as such I decided to dumb down the requirements for my first iteration. I'm currently trying to allow such Extensions only on the entire Company entity, in other words, I'm dropping the whole Owner requirement. So the problem could be rephrased as "How can I add virtual columns (entries in another entity that act like an additional column) to an entity at runtime?"
My current implementation is as follows (irrelevant parts filtered out):
@Entity
class Company {
    // The set of Extension definitions, for example "Location"
    @Transient
    public Set<Extension> getExtensions() { .. }
    // The actual entry, for example "Atlanta"
    @OneToMany(fetch = FetchType.EAGER)
    @JoinColumn(name = "companyId")
    public Set<ExtensionEntry> getExtensionEntries() { .. }
}
@Entity
class Extension {
    public String getLabel() { .. }
    public ValueType getValueType() { .. } // String, Boolean, Date, etc.
}
@Entity
class ExtensionEntry {
    @ManyToOne(fetch = FetchType.EAGER)
    @JoinColumn(name = "extensionId")
    public Extension getExtension() { .. }
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "companyId", insertable = false, updatable = false)
    public Company getCompany() { .. }
    public String getValueAsString() { .. }
}
The implementation as is allows me to load a Company entity, and Hibernate will ensure that all its ExtensionEntries are also loaded and that I can access the Extensions corresponding to those ExtensionEntries. In other words, if I wanted to, for example, display this additional information on a web page, I could access all of the required information as follows:
Company company = findCompany();
for (ExtensionEntry extensionEntry : company.getExtensionEntries()) {
String label = extensionEntry.getExtension().getLabel();
String value = extensionEntry.getValueAsString();
}
There are a number of problems with this, however. Firstly, when using FetchType.EAGER with a @OneToMany, Hibernate uses an outer join and as such will return duplicate Companies (one for each ExtensionEntry). This can be solved by using Criteria.DISTINCT_ROOT_ENTITY, but that in turn causes errors in my pagination and as such is an unacceptable answer. The alternative is to change the FetchType to LAZY, but that means that I will always have to load the ExtensionEntries "manually". As far as I understand, if, for example, I loaded a List of 100 Companies, I'd have to loop over it and query for each of them, generating 100 SQL statements, which isn't acceptable performance-wise.
The other problem I have is that ideally I'd like to load all the Extensions whenever a Company is loaded. By that I mean that I'd like the @Transient getter named getExtensions() to return all the Extensions for any Company. The problem here is that there is no foreign key relation between Company and Extension, as an Extension isn't applicable to any single Company instance, but rather to all of them. Currently I can get past that with code like that presented below, but this will not work when accessing referenced entities (if, for example, I have an entity Employee which has a reference to Company, the Company which I retrieve through employee.getCompany() won't have the Extensions loaded):
List<Company> companies = findAllCompanies();
List<Extension> extensions = findAllExtensions();
for (Company company : companies) {
// Extensions are the same for all Companies, but I need them client side
company.setExtensions(extensions);
}
So that's where I'm at currently, and I have no idea how to proceed in order to get past these problems. I'm thinking that my entire design might be flawed, but I'm unsure of how else to approach it.
Any and all ideas and suggestions are welcome!
The example with Company, Partner, and Customer is actually a good application for polymorphism, which JPA supports by means of inheritance. You have the following three strategies to choose from: single table, table per class, and joined. Your description sounds more like the joined strategy, but not necessarily.
You may also consider just a one-to-one (or zero) relationship instead. Then you will need such a relationship for each value of your virtual column, since its values represent different entities. Hence, you'll have a relationship with the Partner entity and another relationship with the Customer entity, and either, both or none can be null.
Use the decorator pattern and hide your entity inside a decorator class.
Using the EAV pattern is IMHO a bad choice because of performance problems and problems with reporting (many joins). Digging for a solution, I found something else here: http://www.infoq.com/articles/hibernate-custom-fields