Given the following domain model, I want to load all Answers including their Values and their respective sub-children and put them in an AnswerDTO to then convert to JSON. I have a working solution, but it suffers from the N+1 problem, which I want to get rid of by using an ad-hoc @EntityGraph. All associations are configured LAZY.
@Query("SELECT a FROM Answer a")
@EntityGraph(attributePaths = {"value"})
public List<Answer> findAll();
Using an ad-hoc @EntityGraph on the repository method, I can ensure that the values are pre-fetched to prevent N+1 on the Answer->Value association. While my result is fine, there is another N+1 problem, caused by lazy loading of the selected association of the MCValues.
Using this
@EntityGraph(attributePaths = {"value.selected"})
fails, because the selected field is of course only part of some of the Value entities:
Unable to locate Attribute with the the given name [selected] on this ManagedType [x.model.Value];
How can I tell JPA to only try fetching the selected association if the value is an MCValue? I need something like optionalAttributePaths.
You can only use an EntityGraph if the association attribute is part of the superclass and by that also part of all subclasses. Otherwise, the EntityGraph will always fail with the Exception that you currently get.
The best way to avoid your N+1 select issue is to split your query into 2 queries:
The 1st query fetches the MCValue entities using an EntityGraph to fetch the association mapped by the selected attribute. After that query, these entities are then stored in Hibernate's 1st level cache / the persistence context. Hibernate will use them when it processes the result of the 2nd query.
@Query("SELECT m FROM MCValue m") // add WHERE clause as needed ...
@EntityGraph(attributePaths = {"selected"})
public List<MCValue> findAll();
The 2nd query then fetches the Answer entity and uses an EntityGraph to also fetch the associated Value entities. For each Value entity, Hibernate will instantiate the specific subclass and check if the 1st level cache already contains an object for that class and primary key combination. If that's the case, Hibernate uses the object from the 1st level cache instead of the data returned by the query.
@Query("SELECT a FROM Answer a")
@EntityGraph(attributePaths = {"value"})
public List<Answer> findAll();
Because we already fetched all MCValue entities with the associated selected entities, we now get Answer entities with an initialized value association. And if the association contains an MCValue entity, its selected association will also be initialized.
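The cache lookup described above can be pictured with a toy model. This is only a sketch of the idea in plain Java, not Hibernate's implementation: the class FirstLevelCacheSketch, its resolve method, and the simplified MCValue stand-in are all hypothetical names made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class FirstLevelCacheSketch {

    // Hypothetical stand-in for an entity; not the real MCValue mapping.
    static class MCValue {
        final long id;
        final boolean selectedInitialized;

        MCValue(long id, boolean selectedInitialized) {
            this.id = id;
            this.selectedInitialized = selectedInitialized;
        }
    }

    // Maps "class name + primary key" to the instance already loaded in this session.
    private final Map<String, Object> loaded = new HashMap<>();

    /** Returns the already-loaded instance if present; otherwise builds and registers one. */
    <T> T resolve(Class<T> type, Object id, Supplier<T> fromQueryRow) {
        String key = type.getName() + "#" + id;
        Object existing = loaded.get(key);
        if (existing != null) {
            return type.cast(existing); // data from the later query row is ignored
        }
        T created = fromQueryRow.get();
        loaded.put(key, created);
        return created;
    }

    public static void main(String[] args) {
        FirstLevelCacheSketch session = new FirstLevelCacheSketch();
        // 1st query: the MCValue is loaded together with its selected association.
        MCValue first = session.resolve(MCValue.class, 1L, () -> new MCValue(1L, true));
        // 2nd query: the same row arrives again via Answer.value, without selected.
        MCValue second = session.resolve(MCValue.class, 1L, () -> new MCValue(1L, false));
        System.out.println(first == second);
        System.out.println(second.selectedInitialized);
    }
}
```

In this model the order matters in the same way as in the answer: the instance registered by the first query is the one the second query hands back, so both queries have to run in the same persistence context.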
I don't know what Spring-Data is doing there, but to do that you usually have to use the TREAT operator to access the sub-association, and the implementation of that operator is quite buggy.
Hibernate supports implicit subtype property access, which is what you would need here, but apparently Spring-Data can't handle this properly. I can recommend that you take a look at Blaze-Persistence Entity-Views, a library that works on top of JPA and allows you to map arbitrary structures against your entity model. You can map your DTO model in a type-safe way, including the inheritance structure. Entity views for your use case could look like this:
@EntityView(Answer.class)
interface AnswerDTO {
    @IdMapping
    Long getId();
    ValueDTO getValue();
}
@EntityView(Value.class)
@EntityViewInheritance
interface ValueDTO {
    @IdMapping
    Long getId();
}
@EntityView(TextValue.class)
interface TextValueDTO extends ValueDTO {
    String getText();
}
@EntityView(RatingValue.class)
interface RatingValueDTO extends ValueDTO {
    int getRating();
}
@EntityView(MCValue.class)
interface MCValueDTO extends ValueDTO {
    @Mapping("selected.id")
    Set<Long> getOption();
}
With the spring data integration provided by Blaze-Persistence you can define a repository like this and directly use the result
@Transactional(readOnly = true)
interface AnswerRepository extends Repository<Answer, Long> {
List<AnswerDTO> findAll();
}
It will generate an HQL query that selects just what you mapped in the AnswerDTO, which is something like the following.
SELECT
a.id,
v.id,
TYPE(v),
CASE WHEN TYPE(v) = TextValue THEN v.text END,
CASE WHEN TYPE(v) = RatingValue THEN v.rating END,
CASE WHEN TYPE(v) = MCValue THEN s.id END
FROM Answer a
LEFT JOIN a.value v
LEFT JOIN v.selected s
My latest project used GraphQL (a first for me) and we had a big issue with N+1 queries and with optimizing the queries so that they only join tables when they are required. I have found Cosium/spring-data-jpa-entity-graph irreplaceable. It extends JpaRepository and adds methods to pass in an entity graph to the query. You can then build dynamic entity graphs at runtime to add in left joins for only the data you need.
Our data flow looks something like this:
Receive GraphQL request
Parse GraphQL request and convert to list of entity graph nodes in the query
Create entity graph from the discovered nodes and pass into the repository for execution
To avoid including invalid nodes in the entity graph (for example __typename from GraphQL), I created a utility class which handles the entity graph generation. The calling class passes in the class it is generating the graph for, and the utility validates each node in the graph against the metamodel maintained by the ORM. If a node is not in the model, it is removed from the list of graph nodes. (This check needs to be recursive and check each child as well.)
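A minimal sketch of such a recursive check, under loud assumptions: it uses plain Java reflection on fields instead of the JPA metamodel the answer mentions, and the class GraphPathFilter, its validPaths method, and the tiny Answer/Value/Option model are hypothetical names invented for this illustration, not the author's utility.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class GraphPathFilter {

    // Hypothetical sample model, only for illustration.
    static class Option { String label; }
    static class Value { Long id; List<Option> selected; }
    static class Answer { Long id; Value value; }

    /** Keeps only the dotted paths whose every segment is a field of the model. */
    static List<String> validPaths(Class<?> root, List<String> requestedPaths) {
        List<String> valid = new ArrayList<>();
        for (String path : requestedPaths) {
            if (isValid(root, path.split("\\."), 0)) {
                valid.add(path);
            }
        }
        return valid;
    }

    // Recursive step: resolve one segment, then descend into that field's type.
    private static boolean isValid(Class<?> type, String[] segments, int index) {
        if (index == segments.length) {
            return true;
        }
        Field field = findField(type, segments[index]);
        return field != null && isValid(elementType(field), segments, index + 1);
    }

    // Looks for the field on the class and its superclasses.
    private static Field findField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredField(name);
            } catch (NoSuchFieldException e) {
                // not declared here, try the superclass
            }
        }
        return null;
    }

    // For collection-valued fields, continue with the element type.
    private static Class<?> elementType(Field field) {
        if (Collection.class.isAssignableFrom(field.getType())
                && field.getGenericType() instanceof ParameterizedType) {
            Type arg = ((ParameterizedType) field.getGenericType()).getActualTypeArguments()[0];
            if (arg instanceof Class) {
                return (Class<?>) arg;
            }
        }
        return field.getType();
    }

    public static void main(String[] args) {
        List<String> requested = Arrays.asList("value", "value.selected", "__typename");
        System.out.println(validPaths(Answer.class, requested));
    }
}
```

The surviving paths could then be handed to whatever builds the entity graph. A version based on the JPA metamodel would follow the same recursion but look up attributes on the managed types instead of reflecting on fields.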
Before finding this I had tried projections and every other alternative recommended in the Spring JPA / Hibernate docs, but nothing seemed to solve the problem elegantly, or at least not without a ton of extra code.
Edited after your comment:
My apologies, I hadn't understood your issue in the first round: it occurs on startup of Spring Data, not only when you try to call findAll().
The full example can be pulled from my GitHub:
https://github.com/bdzzaid/stackoverflow-java/blob/master/jpa-hibernate/
You can easily reproduce and fix your issue inside this project.
Effectively, Spring Data and Hibernate are not capable of determining the "selected" graph by default, and you need to specify how to collect the selected option.
So first, you have to declare the NamedEntityGraphs of the class Answer.
As you can see, there are two NamedEntityGraphs for the attribute value of the class Answer:
The first is for all Values without a specific relationship to load.
The second is for the specific Multichoice value. If you remove this one, you reproduce the exception.
Second, answerRepository.findAll() needs to run in a transactional context if you want to fetch LAZY data.
@Entity
@Table(name = "answer")
@NamedEntityGraphs({
    @NamedEntityGraph(
        name = "graph.Answer",
        attributeNodes = @NamedAttributeNode(value = "value")
    ),
    @NamedEntityGraph(
        name = "graph.AnswerMultichoice",
        attributeNodes = @NamedAttributeNode(value = "value"),
        subgraphs = {
            @NamedSubgraph(
                name = "graph.AnswerMultichoice.selected",
                attributeNodes = {
                    @NamedAttributeNode("selected")
                }
            )
        }
    )
}
)
public class Answer
{
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(updatable = false, nullable = false)
    private int id;
    @OneToOne(cascade = CascadeType.ALL)
    @JoinColumn(name = "value_id", referencedColumnName = "id")
    private Value value;
    // ..
}
Problem
To make my code cleaner I want to introduce a generic repository that each repository could extend, and thereby reduce the code I have to have in each of them. The problem is that the ids differ from class to class. In one (see example below) it would be id, in another randomNumber, and in yet another it may even be an @EmbeddedId. I want to have a derived (or non-derived) query in the repository that gets one entity by id.
Preferred solution
I Imagine having something like:
public interface IUniversalRepository<T, K>{
    @Query("select t from #{#entityName} t where @id = ?1")
public T findById(K id);
}
Example Code
(that does not work because attribute id cannot be found on Settings)
public interface IUniversalRepository<T, K> {
    // should return the object with the given id, regardless of the column name
    public T findById(K id);
}
// two example classes with different @Id fields
public class TaxRate {
    @Id
    @Column()
    private Integer id;
    ...
}
public class Settings {
    @Id
    @Column() // cannot rename this column because it has to be named exactly as it is for backup reasons
    private String randomNumber;
    ...
}
// the Repository would be used like this
public interface TaxRateRepository extends IUniversalRepository<TaxRate, Integer> {
}
public interface SettingsRepository extends IUniversalRepository<Settings, String> {
}
Happy for suggestions.
Retrieving JPA entities via an "id query" is not as good an idea as you might think. The main problem is that it is much slower, especially when you hit the same entity multiple times within a transaction: if the flush mode is set to AUTO (which is actually the reasonable default), Hibernate needs to perform dirty checking and flush changes to the database before executing a JPQL query. Moreover, Hibernate doesn't guarantee that entities retrieved via an "id query" are not stale: if the entity was already present in the persistence context, Hibernate basically ignores the DB data.
The best way to retrieve entities by id is to call the EntityManager#find(java.lang.Class<T>, java.lang.Object) method, which in turn backs the CrudRepository#findById method. So your findByIdAndType(K id, String type) should actually look like:
default Optional<T> findByIdAndType(K id, String type) {
return findById(id)
.filter(e -> Objects.equals(e.getType(), type));
}
However, the desire to place some kind of id placeholder in a JPQL query is not so bad: one of its applications could be preserving order stability in queries with pagination. I would suggest that you file a corresponding CR with the spring-data project.
I'm using a legacy database. In my example, we retrieve a product which has some characteristics. In the DB, we can find a product table, a characteristic table and a join table for the many-to-many association.
The only field I need is the label of the characteristics. So my Product entity will contain a list of characteristics as Strings. I would like to avoid creating too many entities, so as not to overload my source code. Let's see the example:
@Entity
@Table(name = "product")
public class Product implements Serializable {
    @Id
    @Column(name = "id")
    private Long id;
    // all field of Product entity
    @ElementCollection(targetClass = String.class)
    @Formula(value = "(SELECT characteristic.label FROM a jointable JOIN b characteristic ON jointable.characteristic_id = characteristic.id WHERE jointable.product_id = id)")
    private Set<String> characteristics = new HashSet<>();
    // Getter / setter
}
To represent my characteristics, I tried to combine @Formula and @ElementCollection. As you can see, the names of the tables (a and b in the query) do not match my representation of this data.
But, when I try to load a product, I get an error like "PRODUCT_CHARACTERISTICS table not found".
Here the generated SQL query executed by hibernate :
SELECT product0_.id AS id1_14_0_,
-- Other fields
characteri10_.product_id AS product_1_15_1__,
(SELECT characteristic.label
FROM a jointable JOIN b characteristic ON jointable.characteristic_id = characteristic.id
WHERE jointable.product_id = id) AS formula6_1__,
FROM product product0_
-- Other Joins
LEFT OUTER JOIN product_characteristics characteri10_ ON product0_.cdprd = characteri10_.product_cdprd
WHERE product0_.id = ?;
In the FROM part, we can see the reference to the product_characteristics table (which does not exist in the database).
So, my main question is the following: how can I get the list of characteristics as an entity attribute? Can I achieve this with @Formula?
Edit
In other words, I would like to load only one attribute from a many-to-many mapping. I found an example here, but it works only with the id (which can be found in the join table).
I assume that what you want to achieve here is reducing the amount of data that is fetched for a use case. You can leave your many-to-many mapping as it is, since you will need DTOs for this and I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure(domain model) the way you like and map attributes(getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(Product.class)
public interface ProductDto {
    @IdMapping
    Long getId();
    String getName();
    @Mapping("characteristics.label")
    Set<String> getCharacteristicLabels();
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
ProductDto a = entityViewManager.find(entityManager, ProductDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
Page<ProductDto> findAll(Pageable pageable);
The best part is, it will only fetch the state that is actually necessary!
I would like someone to explain to me why Hibernate is issuing one extra SQL statement in my straightforward case. Basically I have this object:
@Entity
class ConfigurationTechLog (
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    val id: Long?,
    val configurationId: Long,
    val type: String,
    val value: String?
) {
    @JsonIgnore
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "configurationId", insertable = false, updatable = false)
    val configuration: Configuration? = null
}
So as you can see, nothing special there. And when I execute this query:
@Query(value = "SELECT c FROM ConfigurationTechLog c where c.id = 10")
fun findById10() : Set<ConfigurationTechLog>
In my console I see this:
Hibernate:
/* SELECT
c
FROM
ConfigurationTechLog c
where
c.id = 10 */ select
configurat0_.id as id1_2_,
configurat0_.configuration_id as configur2_2_,
configurat0_.type as type3_2_,
configurat0_.value as value4_2_
from
configuration_tech_log configurat0_
where
configurat0_.id=10
Hibernate:
select
configurat0_.id as id1_0_0_,
configurat0_.branch_code as branch_c2_0_0_,
configurat0_.country as country3_0_0_,
configurat0_.merchant_name as merchant4_0_0_,
configurat0_.merchant_number as merchant5_0_0_,
configurat0_.org as org6_0_0_,
configurat0_.outlet_id as outlet_i7_0_0_,
configurat0_.platform_merchant_account_name as platform8_0_0_,
configurat0_.store_type as store_ty9_0_0_,
configurat0_.terminal_count as termina10_0_0_
from
configuration configurat0_
where
configurat0_.id=?
Can someone please explain to me what is happening here? Where is this second query coming from?
I assume you are using a Kotlin data class. A Kotlin data class generates toString, hashCode and equals methods that use all the member fields. So if you use the returned values in your code in a way that results in a call to any of these methods, that may cause this issue.
BTW, using Kotlin data classes is against the basic requirements for a JPA entity, as data classes are final classes having final members.
In order to make an association lazy, Hibernate has to create a proxy instance instead of using the real object, i.e. it needs to create an instance of dynamically generated subclass of the association class.
Since in Kotlin all classes are final by default, Hibernate cannot subclass it so it has to create the real object and initialize the association right away. In order to verify this, try declaring the Configuration class as open.
To solve this without the need to explicitly declare all entities open, it is easier to do it via the kotlin-allopen compiler plugin.
This link can be useful for understanding the (common) N+1 problem.
Let me give you an example:
I have three Courses and each of them has related Students.
I would like to perform a "SELECT * FROM Courses". This is the first query, the one I want (the "+1"). But in the background, in order to get the Students of each Course returned by that select, Hibernate will execute three more queries, one for each course (the "N"; here three Courses come back from the select). In the end I will see 4 queries in the Hibernate logs.
Considering the example above, this is probably what happens in your case: you execute the first query that you want, getting Configuration Id = 10, but afterwards Hibernate takes the entity related to this Configuration, so a new query is executed to get this related entity.
This problem is specific to relationships (of course) and LAZY fetch. It is not a problem that you have caused but a Hibernate performance issue with LAZY fetch; consider it a sort of bug or a default behaviour.
To solve this kind of problem (I don't know whether they apply in your case) I know three ways:
EAGER fetch type (but not the best option)
Query with JOIN FETCH between Courses and Students
Creating an EntityGraph object that represents the Course, with a SubGraph that represents the Students added to the EntityGraph
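To make the query counting from the Courses example concrete, here is a small simulation in plain Java. It does not use Hibernate at all: FakeDb, lazyLoadAll and joinFetchAll are hypothetical names, and the "database" is just an in-memory map that counts how often it is asked something.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NPlusOneDemo {

    // In-memory stand-in for a database that counts the "queries" it serves.
    static class FakeDb {
        private final Map<String, List<String>> studentsByCourse = new LinkedHashMap<>();
        int queryCount = 0;

        FakeDb() {
            studentsByCourse.put("Math", Arrays.asList("Ann", "Bob"));
            studentsByCourse.put("Physics", Arrays.asList("Cid"));
            studentsByCourse.put("History", Arrays.asList("Dee", "Eve"));
        }

        // Corresponds to "SELECT * FROM Courses".
        List<String> selectCourses() {
            queryCount++;
            return new ArrayList<>(studentsByCourse.keySet());
        }

        // Corresponds to the per-course lookup triggered by lazy loading.
        List<String> selectStudentsOf(String course) {
            queryCount++;
            return studentsByCourse.get(course);
        }

        // Corresponds to one query that joins Courses and Students.
        Map<String, List<String>> selectCoursesJoinStudents() {
            queryCount++;
            return new LinkedHashMap<>(studentsByCourse);
        }
    }

    /** Lazy style: one query for the courses, then one more per course. */
    static int lazyLoadAll(FakeDb db) {
        for (String course : db.selectCourses()) {
            db.selectStudentsOf(course);
        }
        return db.queryCount;
    }

    /** Join-fetch style: courses and students arrive in a single query. */
    static int joinFetchAll(FakeDb db) {
        db.selectCoursesJoinStudents();
        return db.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyLoadAll(new FakeDb()) + " queries");
        System.out.println("join: " + joinFetchAll(new FakeDb()) + " queries");
    }
}
```

With three courses the lazy variant counts 1 + 3 queries, which matches the four statements described in the example, while the joined variant counts one.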
Looking at your question, it seems like an expected behavior.
Since you've set up configuration to fetch lazily with @ManyToOne(fetch = FetchType.LAZY), the first SQL statement just queries the other fields. When you try to access the configuration object, Hibernate queries the DB again. That's what lazy fetching is. If you'd like Hibernate to use joins and fetch all values at once, try setting @ManyToOne(fetch = FetchType.EAGER).
What is a best practice to store 'large' data, represented by List in Java, in database?
I'm considering 3 variants:
Use '@OneToMany' to store data in a separate table.
Serialize data and store it in the parent table.
Store data as files (naming conventions? same as id?).
To be more specific
'Large' data entities:
class SingleSleeper{
private Double startPositionOnLeft;
private Double endPositionOnLeft;
private Double startPositionOnRight;
private Double endPositionOnRight;
....
}
class RutEntry{
private Double width;
private Double position;
...
}
There are about 50 instances of SingleSleeper class and about 25000 instances of RutEntry class in one parent instance. Parent instances are generated about 40 times every day.
I'm using EclipseLink JPA 2.1 and Derby.
Addition
Most of all I'm interested in the best readability in Java. But I'm afraid that database speed will decrease significantly if I store too much data in the database. The overwhelming majority of requests will select all instances of the SingleSleeper or RutEntry classes of a particular parent entity. I'm not interested in supporting different database types, but I can move to another database if needed.
I think I would do neither of your variants.
I would add a ManyToOne to the child entities (which is somehow the opposite of your first variant):
public class SingleSleeper {
    @ManyToOne(optional = false, fetch = FetchType.LAZY)
    private ParentEntity parent;
    ...
}
public class RutEntry {
    @ManyToOne(optional = false, fetch = FetchType.LAZY)
    private ParentEntity parent;
}
This ensures that you have a mapping and that you never load all 25000 entities for a parent object, if you don't need them (the lazy fetch ensures that you even don't need to load the parent entity).
You can create a OneToMany in the parent object with a mappedBy link, if you really want to. For example because you always need all child objects in the parent entity:
class ParentEntity {
    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY)
    Collection<SingleSleeper> singleSleepers;
    @OneToMany(mappedBy = "parent", fetch = FetchType.LAZY)
    Collection<RutEntry> rutEntries;
}
But I don't know how EclipseLink works here. For Hibernate you need at least an additional BatchSize annotation to indicate that it should load as many child entities as possible at once. It can't fetch both collections together with the parent instance (e.g. by defining both as FetchType.EAGER), as only one is allowed to be fetched eagerly (and otherwise you would have 25000 * 50 result rows in the result set of the corresponding SQL select statement).
The best way to load all child entities for a parent entity is to load them separately, either using JPQL (easier to read, faster to write) or the Criteria API (typesafe, but you need a metamodel):
ParentEntity parent = entityManager.find(ParentEntity.class, id);
// JPQL: ":parent" is a named parameter bound below
List<SingleSleeper> singleSleepers = entityManager.createQuery(
    "SELECT s FROM SingleSleeper s WHERE s.parent = :parent", SingleSleeper.class
).setParameter("parent", parent).getResultList();
// Or Criteria API:
CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaQuery<SingleSleeper> query = criteriaBuilder.createQuery(SingleSleeper.class);
Root<SingleSleeper> s = query.from(SingleSleeper.class);
query.select(s).where(criteriaBuilder.equal(s.get(SingleSleeper_.parent), parent));
List<SingleSleeper> singleSleepers = entityManager.createQuery(query).getResultList();
You have three advantages of that approach:
Still easy to read - if you put the loading into its own method.
You are flexible to decide when to load the 25050 children.
You can load a subset of the children as well (by modifying the result of createQuery with Query.setFirstResult and Query.setMaxResults).
I would like to know if it is possible to configure dynamic @Where clauses with JPA annotations. Let's say I have a relationship between two classes: company (parent) and invoices (child).
If I have an annotation declaration like this (pseudo code):
@ManyToOne
@JoinColumn(name = "ID_MEN_PRE", referencedColumnName = "ID_MEN")
@Where(clause="invoice_month=:month")
private Testinvoices invoice;
What I would like to do now is to pass the "month" value into the where clause. The result should contain only the Testinvoices for the given date.
Is there a way to do it?
No, however Filters - see further below - can be parameterized.
However it looks to me like you are attempting to model a relationship when actually a query would be a better solution.
Or, create a view filtered by date at the database Level and map invoices to that view rather than a table.
17.1. Hibernate filters
Hibernate3 has the ability to pre-define filter criteria and attach
those filters at both a class level and a collection level. A filter
criteria allows you to define a restriction clause similar to the
existing "where" attribute available on the class and various
collection elements. These filter conditions, however, can be
parameterized. The application can then decide at runtime whether
certain filters should be enabled and what their parameter values
should be. Filters can be used like database views, but they are
parameterized inside the application.
This is not possible in JPA.
For same effect you can use criteria. Please refer to example below
Criteria criteria = session.createCriteria(Employees.class);
if (subsidiaryId != null) {
criteria.add(Restrictions.eq("subsidiaryId", subsidiaryId));
}
if (employeeId != null) {
criteria.add(Restrictions.eq("employeeId", employeeId));
}
if (lastName != null) {
criteria.add(
Restrictions.eq("lastName", lastName).ignoreCase()
);
}
It is not possible to pass a parameter at runtime, but it is possible to use a fixed value like this:
@ManyToOne
@JoinColumn(name = "ID_MEN_PRE", referencedColumnName = "ID_MEN")
@Where(clause="invoice_month='january'")
private Testinvoices invoice;