I'm struggling to identify the right annotations to use to map a star schema with Spring Boot JPA.
Let's consider 3 tables:
DIM_One (1)--------(1..n) FACT (1..n) ------------ (1) DIM_Two
DIM_One and DIM_Two both have an id that is their primary key.
FACT's primary key is the combination of (DIM_One_pk, DIM_Two_pk)
For now, the annotations in my DIM tables are similar to :
@Table(name="DIM_One")
@Entity
@Getter
@ToString
public class One {
    @Id
    @Column(name = "dim_one_id")
    private UUID id;
    //...
}
As for the FACT table, I have :
@Entity
@Table(name = "FACT")
@ToString
@Getter
public class Fact {
    @EmbeddedId
    private FactId id;
    //...
}
with the corresponding FactId class :
@Embeddable
@Getter
@EqualsAndHashCode
public class FactId implements Serializable {
    private One one;
    private Two two;
}
I feel a bit lost with the right annotations I would need to use to make it correspond to the cardinality:
DIM_One (1)--------(1..n) FACT (1..n) ------------ (1) Dim_Two
Furthermore, should it actually be mapped as OneToMany or OneToOne ?
Your diagram shows a (1..n)---(1) relationship on each side, so map it that way: many-to-one from FACT to each dimension (one-to-many when seen from the dimension), not one-to-one.
Other than that, you need to think about how you want to use this:
If loading a fact, do you want to load the associated dimension entries? This leads to the decision between eager and lazy loading.
Do you want to be able to navigate from fact to dimension or the other way round? Or both? This leads to the decision about directionality.
If you persist, delete ... a fact, should the dimensions take part in that operation? This leads to the cascade configuration.
Note: While in principle this should work without major problems, since a star schema is still just a bunch of tables, it sounds like a really bad idea.
Star schemas are used for large amounts of data and are highly denormalised in order to optimise for reads and aggregations.
This means updates typically hit anywhere from a few hundred rows to many thousands, possibly millions.
JPA is not built for this kind of operation and will perform horribly compared to specifically tailored SQL statements.
On the read side you'll constantly operate with aggregate functions and probably window functions with non-trivial expressions. JPQL, the query language of JPA, is again not built for this and will severely limit your options.
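If the mapping is attempted anyway, one detail is worth sketching: an embedded id normally holds the key values themselves rather than the dimension entities, and JPA expects the id class to be Serializable with value-based equals and hashCode. The class below is a plain-Java variant of FactId under those assumptions. The field names oneId and twoId are hypothetical, and the JPA annotations are only mentioned in comments so that the snippet compiles without a persistence provider on the classpath.

```java
import java.io.Serializable;
import java.util.Objects;
import java.util.UUID;

// In a real mapping this class would carry @Embeddable; the Fact entity would
// then typically declare @ManyToOne fields for One and Two and tie them to
// these key parts with @MapsId("oneId") and @MapsId("twoId").
class FactId implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumed names: each field holds the primary key of one dimension row.
    private UUID oneId;
    private UUID twoId;

    // JPA providers need a no-argument constructor for embeddable classes.
    protected FactId() {
    }

    FactId(UUID oneId, UUID twoId) {
        this.oneId = Objects.requireNonNull(oneId, "oneId");
        this.twoId = Objects.requireNonNull(twoId, "twoId");
    }

    public UUID getOneId() {
        return oneId;
    }

    public UUID getTwoId() {
        return twoId;
    }

    // Two keys are equal when both dimension ids match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof FactId)) {
            return false;
        }
        FactId that = (FactId) other;
        return Objects.equals(oneId, that.oneId) && Objects.equals(twoId, that.twoId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(oneId, twoId);
    }
}
```

Keeping only the ids in the key means a Fact can be looked up by key without first loading either dimension row.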
I am looking for an answer on whether this is possible in Hibernate. What I am trying to achieve is that a field is inserted only if it exists in a particular table; otherwise the field in the @Entity class should simply be ignored.
I want this because a new field is going to be introduced in one of the tables we use, and many dependent components currently insert data into that table. I don't want a big-bang release. I want something that doesn't impact the older version and still works once the upgrade happens and the new column is introduced.
For example -
@Entity
@Table(name = "EMPLOYEE_RECORDS")
public class Employee
{
    @Id
    @Column(name = "employee_id")
    private Integer employeeId;
    @Column(name = "employee_name")
    private String employeeName;
    @Column(name="address")
    private String address;
}
What if I only want to insert address field into DB only when column(address) exists in the table EMPLOYEE_RECORDS. Please forgive me if this is something obvious, as I am not very proficient in Hibernate.
Also let me explain what have I thought of (But not sure if it will also work) -
1. Create 2 different @Entity classes. Try to insert, and if the insertion fails, switch at runtime to the @Entity without address.
2. Check with a simple query whether the column exists in the table; if it does not, use the @Entity without address, otherwise use the one with address.
I'm very confused about the scenario. It seems like there are deeper issues regarding the decoupling of components in your system.
Nevertheless, you can add the column in the database without declaring the field in the Hibernate entity. On the other hand, there is no way to have an optional field in a Hibernate entity: either a field is mapped or it is not.
I'm designing a solution for dealing with a complex structure (user-related stuff with lots of relations) in a simpler and possibly more efficient way than fetching all the related data from the DB. The only data I really need in my use case is basically contained within the non-relational fields of the 'main' entity.
For now I have extracted the basic fields from the 'main' class (let it be class OldMain) into another class (let's call it abstract class Extracted), used @MappedSuperclass and created 2 classes that extend it: Basic (an empty class, since Extracted gives it all the data I need, mapped to table 'X') and Extended (also mapped to table 'X', but with all the extra relations). It basically works, but the code structure looks odd and makes me wonder whether that's a proper way of dealing with such a problem.
I also tried lazy initialization on the relational fields (which I guessed at the beginning would serve well here), but I wasn't able to get it to work as I wanted with Jackson (only non-lazy fields in the JSON, without fetching the lazy related data): either it couldn't be serialized or it fired another several dozen relation queries.
Another thing I stumbled upon in a tutorial was making a DTO from the 'OldMain' entity so as not to touch the lazy fields, but I haven't tried it yet as I started with the @MappedSuperclass way.
@Table(name = "X")
@MappedSuperclass
public abstract class Extracted {
    //all the non-relational fields from OldMain
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Integer id;
    private String name;
    private String surname;
    private String userName;
    private String email;
}
@Table(name = "X")
@Entity
public class Basic extends Extracted {
    //empty
}
@Table(name = "X")
@Entity
public class Extended extends Extracted {
    //all relational fields from OldMain, no data fields
}
Also, the general question is: are there any good practices for dealing with the need to use only a subset of a bigger entity?
There is no obligation for a JPA Entity to map all existing columns of the corresponding table in the database. That is, given a table my_entity with columns col1, col2, col3, the Entity mapped to this table could map only col1 and col2 and ignore col3. That being said, and given that you only need the non-relational attributes, you could directly use your Extracted class with the attributes you need and ignore the fact that other relational fields exist. Furthermore, if all the relational fields are nullable, you would even be able to persist new instances of the Extracted class. And Jackson would only (un)marshal the attributes declared in the Extracted class.
Otherwise, I suggest following the approach you are already taking and defining new Entity classes that extend your Extracted class with the required attributes. I don't see how the "code structure looks odd", other than having a Basic class that adds no attributes to Extracted: you could easily make Extracted a non-abstract entity, use it directly, and get rid of Basic.
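The DTO route mentioned in the question is another way to expose only a subset. Below is a minimal plain-Java sketch of it, not a definitive implementation: UserSummary and its from factory are hypothetical names, and OldMain is a stand-in that models only the scalar fields listed in Extracted. The copy reads nothing but scalar getters, so a serializer that is handed the DTO never reaches the lazy relations.

```java
// Stand-in for the real OldMain entity: only the scalar fields listed in
// Extracted are modelled here, and the relations are left out entirely.
class OldMain {
    private final Integer id;
    private final String name;
    private final String surname;
    private final String userName;
    private final String email;

    OldMain(Integer id, String name, String surname, String userName, String email) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.userName = userName;
        this.email = email;
    }

    public Integer getId() { return id; }
    public String getName() { return name; }
    public String getSurname() { return surname; }
    public String getUserName() { return userName; }
    public String getEmail() { return email; }
}

// Hypothetical DTO: an immutable copy of the scalar fields only.
final class UserSummary {
    private final Integer id;
    private final String name;
    private final String surname;
    private final String userName;
    private final String email;

    private UserSummary(Integer id, String name, String surname, String userName, String email) {
        this.id = id;
        this.name = name;
        this.surname = surname;
        this.userName = userName;
        this.email = email;
    }

    // Copies scalar values only, so no relation getter is ever invoked.
    static UserSummary from(OldMain source) {
        return new UserSummary(source.getId(), source.getName(), source.getSurname(),
                source.getUserName(), source.getEmail());
    }

    public Integer getId() { return id; }
    public String getName() { return name; }
    public String getSurname() { return surname; }
    public String getUserName() { return userName; }
    public String getEmail() { return email; }
}
```

Compared with a second entity on the same table, the DTO costs one extra class but keeps the persistence model unchanged; whether that trade-off is worth it depends on how many such views are needed.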
I am currently building a persistence layer for a number of data classes that I can NOT change. These classes have no Id property/field which makes them ill suited for being used in ORM.
A best-case scenario for me would have been some sort of auto-generated ids that exist only inside the database and relate the objects to each other. Sadly, this does not seem to be possible using the JPA APIs.
Since the above approach did not work out, I decided to try simple wrapper @Entity objects like so:
@Entity
public class ThirdPartyObjectWrapper {
    @Id private long id;
    @Embedded private ThirdPartyObject myThirdPartyObject;
}
This approach works out nicely in the database, but I am having problems getting the object out of the wrapper and into its place inside another third party object.
public class AnotherThirdPartyObject {
    private ThirdPartyObject object; //Actually in a Many-To-One-Relationship
}
Because they are third party objects I'm mapping them through the orm.xml file defining the relationships there. At this point in time the relationship mapping looks like so:
<many-to-one name="object"
target-entity="ThirdPartyObjectWrapper"/>
But with this setup Hibernate tries to insert the ThirdPartyObjectWrapper.id into the AnotherThirdPartyObject.object field, which obviously fails.
My question now is:
Is what I am trying even possible?
I'm using Hibernate 4.2.3 and I have a class similar to the following:
@Entity
@DynamicInsert
@DynamicUpdate
@SelectBeforeUpdate
public class Test {
    @Id
    private BigInteger theId;
    @Lob
    @Basic(fetch = FetchType.LAZY)
    @JsonIgnore
    private Blob data;
    @Lob
    @Basic(fetch = FetchType.LAZY)
    @JsonIgnore
    private Blob otherData;
    // Getters and setters....
}
The SQL that this generates for an update includes the data column, even though it hasn't changed. (To be precise, what I do is get the object, detach it, read the data and use that to generate otherData, set that and then call saveOrUpdate on the session.)
Can anyone explain why this would happen? Does this functionality work with Blobs? I've searched for documentation but found none.
PS I'm not using @DynamicUpdate for performance reasons. I know that it would be questionable to use it from that standpoint.
The safest and most portable (between different databases and JPA providers) way to achieve real lazy loading of Lobs is to create an artificial lazy one-to-one association between the original entity and a new one to which you move the Lob.
This approach is suitable for other kinds of optimizations as well, for example when I want to enable second-level caching of a complex entity, but a few columns of the entity are updated frequently. Then I extract those columns to a separate non-second-level-cacheable entity.
However, keep in mind general pitfalls specific to one-to-one associations. Basically, either map it with a mandatory (optional = false) one-to-one association with @PrimaryKeyJoinColumn or make sure the foreign key is in the entity (table) which declares the lazy association (in this case the original entity from which the Lob is moved out). Otherwise, the association could be effectively eager, thus defeating the purpose of introducing it.
Preliminary Info
I'm currently trying to integrate Hibernate with my team at work. We primarily do Java web development, creating webapps that provide data to clients. Our old approach involves calling stored procedures with JDBC (on top of Oracle boxes) and storing their results in beans. However, I've heard a lot about the benefits of integrating Hibernate into a development environment like ours so I'm attempting to move away from our old habits. Note: I'm using the Hibernate JPA annotation approach due to simplicity for team adoption's sake.
Specific Problem
The specific issue I'm having currently is using Hibernate with normalized tables. We have a lot of schemas structured like so:
StateCodes (integer state_code, varchar state_name)
Businesses (integer business_id, varchar business_name, integer state_code)
I want to be able to have a single @Entity that has all of the "Businesses" fields, except instead of "state_code" it has "state_name". To my understanding, Hibernate treats @Entity classes as tables. The @OneToMany, @OneToOne, @ManyToOne annotations create relationships between entities, but this is a very simplistic, dictionary-like lookup and I feel like it doesn't apply here (or might be overkill).
One approach I've seen is
@Formula("(select state_name from StateCodes where Businesses.state_code = state_code)")
private String stateCode;
But, given Hibernate's perk of "avoiding writing raw SQL", this seems like bad practice. Not to mention, I'm extremely confused about how Hibernate will then treat this field. Does it get saved on a save operation? It's just defined as a query, not a column, after all.
So what is the best way to accomplish this?
I do not see any reason not to use the standard JPA mappings in this case. Short of creating a database view and mapping an entity to that (or using the non-JPA-compliant @Formula), you will have to map as below.
Unless you are providing a means for the State to be changed, you do not need to expose the State entity to the outside world: JPA providers do not need getters/setters to be present. Nor do you need to map the Businesses on the State side:
@Entity
@Table(name = "Businesses")
public class Business {
    //define id and other fields
    @ManyToOne
    @JoinColumn(name = "state_code")
    private State state;
    // Expose only the name; the State entity itself stays internal.
    public String getStateName() {
        return state.getStateName();
    }
}
@Entity
@Table(name="StateCodes")
public class State {
    //define id and other fields.
    @Column(name = "state_name")
    private String stateName;
    public String getStateName() {
        return stateName;
    }
}