Imagine an Event entity that references a Status entity:
@Entity
@Table(name = "event")
public class Event
{
    @Id
    @Column(name = "id", nullable = false)
    private long id;
    ...
    @ManyToOne
    @JoinColumn(name = "status_code", nullable = false)
    private Status status;
}
@Entity
@Table(name = "status")
public class Status
{
    @Id
    @Column(name = "code", nullable = false)
    private String code;
    @Column(name = "label", nullable = false, updatable = false)
    private String label;
}
Status is mapped to a small table 'status'. Status is a typical reference data / lookup Entity.
code   label
-----  --------------
CRD    Created
ITD    Initiated
PSD    Paused
CCD    Cancelled
ABD    Aborted
I'm not sure if it is a good idea to model Status as an Entity. It feels more like an enumeration of constants...
By mapping Status as an Entity, I can use Status objects in Java code, and the Status values are equally present in the database. This is good for reporting.
On the other hand, if I want to set a particular Status on an Event, I can't simply assign the constant status I have in mind. I have to look up the right entity first:
event.setStatus(entityManager.find(Status.class, "CRD"))
Can I avoid the above code fragment? I'm afraid of a performance penalty, and it looks very heavy...
Do I have to tweak things with read-only attributes?
Can I prefetch these lookup entities and use them as constants?
Did I miss a crucial JPA feature?
...?
All opinions / suggestions / recommendations are welcome!
Thank you!
J.
You could use entityManager.getReference(Status.class, "CRD"), which might not fetch the entity from the database if it is only used to set a foreign key.
Can I avoid the above code fragment? I'm afraid of a performance penalty, and it looks very heavy...
Well, you could use an enum instead. I don't really see why you don't actually.
But if you really want to use an entity, then it would be a perfect candidate for 2nd level caching and this would solve your performance concern.
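The enum alternative mentioned above could look roughly like the following. This is only a sketch: the name EventStatus and the fromCode helper are made up for illustration, and the codes and labels are copied from the status table in the question. With JPA 2.1 or later, an AttributeConverter could presumably translate between the status_code column and this enum, but that wiring is not shown here.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical enum mirroring the rows of the 'status' table from the question.
enum EventStatus {
    CREATED("CRD", "Created"),
    INITIATED("ITD", "Initiated"),
    PAUSED("PSD", "Paused"),
    CANCELLED("CCD", "Cancelled"),
    ABORTED("ABD", "Aborted");

    // Index built once so that a lookup by database code is a single map access.
    private static final Map<String, EventStatus> BY_CODE =
            Arrays.stream(values())
                  .collect(Collectors.toMap(EventStatus::code, Function.identity()));

    private final String code;
    private final String label;

    EventStatus(String code, String label) {
        this.code = code;
        this.label = label;
    }

    String code() {
        return code;
    }

    String label() {
        return label;
    }

    // Translates a database code such as "CRD" into the matching constant.
    static EventStatus fromCode(String code) {
        EventStatus status = BY_CODE.get(code);
        if (status == null) {
            throw new IllegalArgumentException("Unknown status code: " + code);
        }
        return status;
    }
}
```

With something like this, setting a status becomes a plain assignment of a constant (for example EventStatus.CREATED) and no lookup query is needed. The trade-off is that the labels then live in Java code rather than in a table, which matters if reports read them from the database.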
I need to load the Post entities along with the PostVote entity that represents the vote cast by a specific user (The currently logged in user). These are the two entities:
Post
@Entity
public class Post implements Serializable {
    public enum Type {TEXT, IMG}
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    protected Integer id;
    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "section_id")
    protected Section section;
    @ManyToOne(fetch = FetchType.LAZY, optional = false)
    @JoinColumn(name = "author_id")
    protected User author;
    @Column(length = 255, nullable = false)
    protected String title;
    @Column(columnDefinition = "TEXT", nullable = false)
    protected String content;
    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    protected Type type;
    @CreationTimestamp
    @Column(nullable = false, updatable = false, insertable = false)
    protected Instant creationDate;
    /*accessor methods*/
}
PostVote
@Entity
public class PostVote implements Serializable {
    @Embeddable
    public static class Id implements Serializable{
        @Column(name = "user_id", nullable = false)
        protected int userId;
        @Column(name = "post_id", nullable = false)
        protected int postId;
        /* hashcode, equals, getters, 2 args constructor */
    }
    @EmbeddedId
    protected Id id;
    @ManyToOne(optional = false)
    @MapsId("postId")
    protected Post post;
    @ManyToOne(optional = false)
    @MapsId("userId")
    protected User user;
    @Column(nullable = false)
    protected Short vote;
    /* accessor methods */
}
All the associations are unidirectional @*ToOne. The reason I don't use @OneToMany is that the collections are too large and need proper paging before being accessed: not adding the @*ToMany association to my entities prevents anyone from naively doing something like for (PostVote pv : post.getPostVotes()).
For the problem I'm facing right now I've come up with various solutions, but none of them looks fully convincing to me.
1° solution
I could represent the @OneToMany association as a Map that can only be accessed by key. This way there is no issue caused by iterating over the collection.
@Entity
public class Post implements Serializable {
    [...]
    @OneToMany(mappedBy = "post")
    @MapKeyJoinColumn(name = "user_id", insertable = false, updatable = false, nullable = false)
    protected Map<User, PostVote> votesMap;
    public PostVote getVote(User user){
        return votesMap.get(user);
    }
    [...]
}
This solution looks very cool and close enough to DDD principles (I guess?). However, calling post.getVote(user) on each post would still cause an N+1 selects problem. If there were a way to efficiently prefetch some specific PostVotes for subsequent accesses in the session, that would be great. (Maybe, for example, calling from Post p left join fetch PostVote pv on p = pv.post and pv.user = :user and then storing the result in the L1 cache. Or maybe something that involves EntityGraph.)
2° solution
A simplistic solution could be the following:
public class PostVoteRepository extends AbstractRepository<PostVote, PostVote.Id> {
    public PostVoteRepository() {
        super(PostVote.class);
    }
    public Map<Post, PostVote> findByUser(User user, List<Post> posts){
        // A single user is compared with '='; only the list of posts needs 'in'.
        return em.createQuery("from PostVote pv where pv.user = :user and pv.post in :posts", PostVote.class)
            .setParameter("user", user)
            .setParameter("posts", posts)
            .getResultList().stream().collect(Collectors.toMap(
                res -> res.getPost(),
                res -> res
            ));
    }
}
The service layer takes responsibility for calling PostRepository#fetchPosts(...) and then PostVoteRepository#findByUser(...), and then combines the results in a DTO to send to the presentation layer above.
This is the solution I'm currently using. However, I doubt that an IN clause with ~50 parameters is a good idea. Also, having a separate Repository class for PostVote may be a bit overkill and defeat the purpose of an ORM.
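The "mixing" step in the service layer could be sketched as below. This is an illustration in plain Java, not the actual service code: PostRow, VoteRow, PostWithVote and VoteMerger are hypothetical, simplified stand-ins for the entities and the DTO, and the two input lists are assumed to come from the two repository calls described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-ins for the Post and PostVote entities.
final class PostRow {
    final int id;
    final String title;

    PostRow(int id, String title) {
        this.id = id;
        this.title = title;
    }
}

final class VoteRow {
    final int postId;
    final short vote;

    VoteRow(int postId, short vote) {
        this.postId = postId;
        this.vote = vote;
    }
}

// DTO for the presentation layer; vote is null when the user has not voted on the post.
final class PostWithVote {
    final PostRow post;
    final VoteRow vote;

    PostWithVote(PostRow post, VoteRow vote) {
        this.post = post;
        this.vote = vote;
    }
}

final class VoteMerger {
    private VoteMerger() {
    }

    // Pairs every post with the current user's vote, keeping the order of the posts.
    static List<PostWithVote> merge(List<PostRow> posts, List<VoteRow> userVotes) {
        Map<Integer, VoteRow> voteByPostId = new HashMap<>();
        for (VoteRow v : userVotes) {
            voteByPostId.put(v.postId, v);
        }
        List<PostWithVote> merged = new ArrayList<>(posts.size());
        for (PostRow p : posts) {
            merged.add(new PostWithVote(p, voteByPostId.get(p.id)));
        }
        return merged;
    }
}
```

Keying the map by post ID rather than by the entity itself avoids depending on the entity's equals/hashCode implementation, which is a design choice of this sketch and not something the question prescribes.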
3° solution
I haven't tested it so it might have an incorrect syntax, but the idea is to wrap the Post and PostVote entity in a VotedPost DTO.
public class VotedPost{
    private Post post;
    private PostVote postVote;
    public VotedPost(Post post, PostVote postVote){
        this.post = post;
        this.postVote = postVote;
    }
    //getters
}
I obtain the object with a query like this:
select new my.pkg.VotedPost(p, pv) from Post p
left join PostVote pv on pv.post = p and pv.user = :user
This gives me more type safety than the solutions based on Object[] or Tuple query results. It looks like a better alternative than solution 2, but adopting solution 1 in an efficient way would be the best.
What is, generally, the best approach in problems like this? I'm using Hibernate as JPA implementation.
I could imagine the standard bidirectional association using @OneToMany being a maintainable yet performant solution.
To mitigate N+1 selects, one could use e.g.:
@EntityGraph, to specify which associated data is to be loaded (e.g. one user with all of its posts and all associated votes in one single select query)
Hibernate's @BatchSize, e.g. to load votes for multiple posts at once when iterating over all posts of a user, instead of issuing one query for each post's collection of votes
When it comes to preventing users from accessing data in less performant ways, I'd argue that it should be up to the API to document possible performance impacts and offer performant alternatives for different use cases.
(As a user of an API, one can always find ways to implement things in the least performant fashion :)
I'm currently implementing a doc page with a like button.
The like button is associated with a certain user account. When you press like, it stays liked for that user (similar to a YouTube video).
My entities and DTOs are below:
Doc.java:
@Entity(name = "Doc")
@Table(name = "doc")
@Data
public class Doc {
    // Unrelated code redacted for clarity
    @ManyToMany(cascade = {
        CascadeType.MERGE,
        CascadeType.PERSIST
    })
    @JoinTable(
        name = "doc_user_dislike",
        joinColumns = @JoinColumn(name = "doc_id"),
        inverseJoinColumns = @JoinColumn(name = "user_id")
    )
    private Set<UserWebsite> dislikedUsers;
    @ManyToMany(cascade = {
        CascadeType.MERGE,
        CascadeType.PERSIST
    })
    @JoinTable(
        name = "doc_user_like",
        joinColumns = @JoinColumn(name = "doc_id"),
        inverseJoinColumns = @JoinColumn(name = "user_id")
    )
    private Set<UserWebsite> likedUsers;
}
User.java:
@Entity
@Table(name = "user_website")
@Data
public class UserWebsite {
    // Unrelated code redacted for clarity
    @ManyToMany(mappedBy = "likedUsers")
    private Set<Doc> likedDocs;
    @ManyToMany(mappedBy = "dislikedUsers")
    private Set<Doc> dislikedDocs;
}
DocDetailsDTO.java (This will be sent to client).
@Data
public class DocDetailsDTO {
    private Long id;
    private Boolean isDisliked;
    private Boolean isLiked;
}
I have a few candidate solutions:
1. Add a field called isLiked to Doc.java, annotated with @Formula combined with @Transient, and perform queries against the DB.
2. Have another API that accepts a list of DocIDs and a UserID from the client, then returns the list of DocIDs that the UserID liked.
3. Check whether the UserID exists in the likedUsers list (not very efficient, and sometimes not feasible, since I have to initialize that big lazy-loaded list).
The question is: what is the most efficient way to retrieve the liked/disliked status for many docs at once (more than 10 but at most 100 docs per request) for about a thousand concurrent users (1000 CCU)? Are the above solutions already optimal?
Any help is appreciated. Thanks for taking the time to read through the question.
Solution 1 (@Formula): If I understand the problem correctly, this approach is not correct. You want to determine whether a given user likes the specified documents, so the formula would need a user id parameter, which you have no way to pass to the formula. Even if @Formula could somehow be used, it leads to the N+1 problem (an extra query per document). Plus, you use managed entities, which means extra dirty checking at the end.
Solution 2 (separate API): This one is optimal in my opinion: one query, capable of using a projection (no managed entities).
Solution 3 (checking likedUsers): As you noticed, this will kill your application and database. Plus, again, you use managed entities, which means extra dirty checking at the end. Definitely don't use this one.
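To illustrate the second option, here is a small sketch of how the response could be assembled once the liked and disliked document IDs for the user are known. It assumes those two ID sets come from projection queries over the doc_user_like and doc_user_dislike join tables (restricted to the requested doc IDs and the user ID); the queries themselves are not shown. DocStatus and DocStatusAssembler are hypothetical names, with DocStatus standing in for DocDetailsDTO without Lombok.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Plain stand-in for DocDetailsDTO from the question (hypothetical, Lombok omitted).
final class DocStatus {
    final long id;
    final boolean liked;
    final boolean disliked;

    DocStatus(long id, boolean liked, boolean disliked) {
        this.id = id;
        this.liked = liked;
        this.disliked = disliked;
    }
}

final class DocStatusAssembler {
    private DocStatusAssembler() {
    }

    // Builds one DTO per requested doc ID from the two ID sets; no entities are loaded.
    static List<DocStatus> assemble(List<Long> requestedDocIds,
                                    Set<Long> likedIds,
                                    Set<Long> dislikedIds) {
        List<DocStatus> result = new ArrayList<>(requestedDocIds.size());
        for (Long id : requestedDocIds) {
            result.add(new DocStatus(id, likedIds.contains(id), dislikedIds.contains(id)));
        }
        return result;
    }
}
```

Because only IDs are selected, the persistence context holds no managed Doc or UserWebsite instances for this request, so there is nothing to dirty-check at flush time.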
We are trying to save many children in a short amount of time, and Hibernate keeps throwing OptimisticLockException.
Here is a simple example of that case:
University
    id
    name
    audit_version
Student
    id
    name
    university_id
    audit_version
where university_id can be null.
The Java objects look like this:
@Entity
@Table(name = "university")
@DynamicUpdate
@Data
@Accessors(chain = true)
@EqualsAndHashCode(callSuper = true)
public class University {
    @Id
    @SequenceGenerator(name = "university_id_sequence_generator", sequenceName = "university_id_sequence", allocationSize = 1)
    @GeneratedValue(strategy = SEQUENCE, generator = "university_id_sequence_generator")
    @EqualsAndHashCode.Exclude
    private Long id;
    @Column(name = "name")
    private String name;
    @Version
    @Column(name = "audit_version")
    @EqualsAndHashCode.Exclude
    private Long auditVersion;
    @OptimisticLock(excluded = true)
    @OneToMany(mappedBy = "student")
    @ToString.Exclude
    private List<Student> student;
}
@Entity
@Table(name = "student")
@DynamicUpdate
@Data
@Accessors(chain = true)
@EqualsAndHashCode(callSuper = true)
public class Student {
    @Id
    @SequenceGenerator(name = "student_id_sequence_generator", sequenceName = "student_id_sequence", allocationSize = 1)
    @GeneratedValue(strategy = SEQUENCE, generator = "student_id_sequence_generator")
    @EqualsAndHashCode.Exclude
    private Long id;
    @Column(name = "name")
    private String name;
    @Version
    @Column(name = "audit_version")
    @EqualsAndHashCode.Exclude
    private Long auditVersion;
    @OptimisticLock(excluded = true)
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "university_id")
    @ToString.Exclude
    private University university;
}
It seems that when we assign the university and then save the Student, if we do this more than 4 times in a short amount of time, we get the OptimisticLockException.
It seems Hibernate issues a version update on the University table even though the University didn't change at the DB level.
UPDATE: code that saves the student
Optional<University> universityInDB = universidyRepository.findById(universtityId);
universityInDB.ifPresent(university -> student.setUniversity(university));
Optional<Student> optionalExistingStudent = studentRepository.findById(student);
if (optionalExistingStudent.isPresent()) {
    Student existingStudent = optionalExistingStudent.get();
    if (!student.equals(existingStudent)) {
        copyContentProperties(student, existingStudent);
        studentToReturn = studentRepository.save(existingStudent);
    } else {
        studentToReturn = existingStudent;
    }
} else {
    studentToReturn = studentRepository.save(student);
}
private static final String[] IGNORE_PROPERTIES = {"id", "createdOn", "updatedOn", "auditVersion"};
public void copyContentProperties(Object source, Object target) {
    BeanUtils.copyProperties(source, target, Arrays.asList(IGNORE_PROPERTIES));
}
We tried the following:
@OptimisticLock(excluded = true)
Doesn't work; it still throws the optimistic lock exception.
@JoinColumn(name = "university_id", updatable=false)
Only works on an update, since we don't save it on the update.
@JoinColumn(name = "university_id", insertable=false)
Works, but the relation isn't saved and university_id is always null.
Change the cascade behaviour.
The only value that seemed to make sense was Cascade.DETACH, but it gives org.springframework.dao.InvalidDataAccessApiUsageException: org.hibernate.TransientPropertyValueException: object references an unsaved transient instance - save the transient instance before flushing.
Other solutions we thought of but are not sure which to pick:
Give the client a 409 (Conflict) error
After the 409, the client must retry the POST. For an object sent via the queue, the queue will retry that entry later. We don't want our clients to have to manage this error.
Retry after an OptimisticLockException
It's not clean, since we are already doing it when the entry comes from the queue, but it might be the best solution so far.
Make the parent the owner of the relationship
This might be fine if there are not many relations, but we have cases that might go into the hundreds or even the thousands, which would make the object too big to be sent on a queue or via a REST call.
Pessimistic lock
Our whole DB currently uses optimistic locking, and we have managed to prevent these optimistic locking failures so far. We don't want to change our whole locking strategy just because of this case. Maybe we could force pessimistic locking for that subset of the model, but I haven't looked into whether it can be done.
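For reference, the "retry after an OptimisticLockException" option listed above could be sketched as follows. This is a generic illustration, not Spring or Hibernate code: VersionConflictException is a hypothetical stand-in for whatever exception the persistence layer throws, and OptimisticRetry is an invented helper. It assumes that each attempt runs in its own transaction and reloads the entity, because retrying with stale state would simply fail again.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the persistence layer's optimistic locking failure.
class VersionConflictException extends RuntimeException {
    VersionConflictException(String message) {
        super(message);
    }
}

final class OptimisticRetry {
    private OptimisticRetry() {
    }

    // Runs the action up to maxAttempts times, retrying only on version conflicts.
    // Any other exception propagates immediately.
    static <T> T withRetry(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        VersionConflictException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (VersionConflictException e) {
                // Remember the failure; the action is expected to reload fresh state next time.
                last = e;
            }
        }
        // All attempts failed with a conflict, so surface the last one to the caller.
        throw last;
    }
}
```

A caller would wrap the whole "load, modify, save" unit in the supplier, for example OptimisticRetry.withRetry(3, () -> saveStudent(student)), where saveStudent is an assumed method. A bounded attempt count keeps a persistent conflict from looping forever.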
You do NOT need to load the University unless you actually need it.
Do this:
University universityProxy = universidyRepository.getOne(universityId);
student.setUniversity(universityProxy);
In order to assign a University you don't have to load a University entity into the context. Because technically, you just need to save a student record with a proper foreign key (university_id). So when you have a university_id, you can create a Hibernate proxy using the repository method getOne().
Explanation
Hibernate is pretty complex under the hood. **When you load an entity into the context, it creates a snapshot copy of its fields and tracks whether you change any of them**. It does much more... So I guess this solution is the simplest one, and it should help (unless you change the `university` object somewhere else in the scope of the same session). It's hard to say more when other parts of the code are hidden.
Potential issues
wrong @OneToMany mapping
@OneToMany(mappedBy = "student") // should be (mappedBy = "university")
@ToString.Exclude
private List<Student> student;
the collection should be initialized. Hibernate uses its own implementations of collections, and you should not set the fields manually. Only call methods like add(), remove(), or clear()
private List<Student> student; // should be ... = new ArrayList<>();
Overall, some places are not clear, like studentRepository.findById(student);. So if you want a correct answer, it's better to be clear in your question.
If you enable your query logs from Hibernate, it would be worthwhile to see the queries that your ORM is performing. You'll likely realize that your ORM is doing too much.
In your application properties or config file enable hibernate.show_sql=true
I wouldn't be surprised if your single update to a Student becomes an update to a University which becomes an update to all of its containing Students. Everything gets a version bump.
ORM and entity mappings are for strategically retrieving data. They should not be used to actually define object relationships.
You'll want to visit strategies and design your entities based on how they are used in their REST endpoints.
You specified in your question that you are trying to save a Student but you're noticing that the University also gets updated along with every Student update.
Likely there would never be a time when a Student should update a University.
Keep your entities lean!
You can structure your entities in such a way that they support this unidirectional relationship. I removed some of the annotations just to demonstrate the structure. You will want to keep in mind that when creating entities, you are writing them for how they are retrieved...
public class University {
    @Id
    private Long id;
    private String name;
    private Long auditVersion;
    @OneToMany
    private List<Student> student;
}
public class Student {
    @Id
    private Long id;
    private String name;
    private Long auditVersion;
    private Long universityId;
}
This will ensure that updates to the student remain targeted and clean. You are simply assigning a university id to the student, thereby establishing that relationship.
You typically want to respect LockExceptions. Retrying upon a LockException is simply bullying your database into submission and will cause more headaches as your application scales.
You always have the option to work with lean entities and create custom response or message objects that would zip the results together.
ORMs are not to be used to create shortcuts
The cost of a SELECT on an indexed/foreign key is roughly the same as the cost of fetching the rows joined... you only introduce a little extra network latency. A second trip to the database is not always a bad idea. (Oftentimes, this is exactly how Hibernate fetches your entities.)
You won't have to write queries, but you will still need to understand the retrieval and update strategies.
You're sacrificing database performance and introducing complexity for a convenient .getChild() method. You'll find that you resolve more performance/locking issues by removing annotations, not adding them.
I'm currently a little stuck on this and can't see it clearly.
So I hope one of you has a good idea to help me.
The important code at the moment:
@Entity
@Table(name = "T_NOTA_RECIPIENT")
public class NotaRecipient extends PersistentEntity {
    @Id
    @Column(name = "NOTA_RECIPIENT_SID")
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer id;
    @Column(name = "STATUS", insertable = true, updatable = true)
    @Enumerated(EnumType.STRING)
    private Status status = Status.NEW;
    @ManyToOne
    @JoinColumn(name = "NOTA_SID", referencedColumnName = "NOTA_SID", nullable = false)
    private Nota nota;
    @ManyToOne
    @JoinColumn(name = "CREATOR_OFFICE_SID", referencedColumnName = "OFFICE_SID", nullable = false)
    private Office creator;
    @OneToMany(fetch = FetchType.EAGER, mappedBy = "notaRecipient")
    private Set<FollowUp> followUps;
    ...
}
Now, I don't actually want to load all the FollowUps that are in the DB, just the ones belonging to the current user.
But the problem is that I want to include the FollowUp so that I can run a paged database query.
We use Hibernate, Spring Data and Query DSL with BooleanBuilder to "refine" our search.
I was thinking of using @Formula, but it needs to be a constant String, so I can't include the current userId in it.
A second solution could be to mark the FollowUp as @Transient, fetch it from the DB myself, and set it in my service.
The problem here is that I then can't filter or order by it.
@Formula doesn't have much documentation, so is it possible to make a @Transient user and use that in the @Formula?
I asked some colleagues, but they couldn't help me.
So it's time to ask here.
I can get the current user in the API, so that's no problem.
Does anybody have alternative solutions?
You can define a mapping with an expression:
@JoinColumnOrFormula(formula=@JoinFormula(value="(SELECT f.id
FROM follow_up_table f
WHERE f.nota_id=id
and f.user_id={USER_ID})",
referencedColumnName="..."))
And then add a Hibernate interceptor and change the SQL on the fly, replacing {USER_ID} with the real value in the onPrepareStatement method:
/**
 * Called when sql string is being prepared.
 * @param sql sql to be prepared
 * @return original or modified sql
 */
public String onPrepareStatement(String sql);
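The placeholder replacement could be sketched roughly as below. To stay self-contained, this deliberately does not implement Hibernate's interceptor interface; it only mirrors the shape of the onPrepareStatement hook. CurrentUser and UserIdSqlRewriter are hypothetical names, and the thread-local holder is an assumption about how the current user ID is made available (for example, set by a request filter). Newer Hibernate versions expose a similar hook through StatementInspector, so the wiring would differ there.

```java
// Holds the current user's ID for the duration of a request (hypothetical helper).
final class CurrentUser {
    private static final ThreadLocal<Long> USER_ID = new ThreadLocal<>();

    private CurrentUser() {
    }

    static void set(long userId) {
        USER_ID.set(userId);
    }

    static void clear() {
        USER_ID.remove();
    }

    static Long get() {
        return USER_ID.get();
    }
}

// Mirrors the shape of the onPrepareStatement hook without depending on Hibernate.
final class UserIdSqlRewriter {
    static final String PLACEHOLDER = "{USER_ID}";

    // Returns the SQL unchanged unless it contains the placeholder.
    String onPrepareStatement(String sql) {
        if (!sql.contains(PLACEHOLDER)) {
            return sql;
        }
        Long userId = CurrentUser.get();
        if (userId == null) {
            throw new IllegalStateException("No current user bound to this thread");
        }
        // The value is a numeric ID, so the substituted text can only be digits (and a sign).
        return sql.replace(PLACEHOLDER, Long.toString(userId));
    }
}
```

Restricting the substituted value to a numeric type matters here: the text is spliced directly into the SQL string rather than bound as a parameter, so a free-form string would open the door to SQL injection.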
I have 2 entities User and Profile where one user has one profile.
The User entity mapping is pretty clear and looks like this:
@Entity
public class User {
    @Id
    private String id;
    @Column(unique = true, nullable = false)
    private String email;
    @Column(unique = true, nullable = false)
    private String name;
}
So the question is about the Profile entity mapping. The tricky thing here is that Profile includes the user's email (not the entire User entity), but the email shouldn't be either updated or stored by Profile, so it is a read-only attribute from the foreign User entity.
I used the following Profile entity mapping to get the User's email:
@Entity
public class Profile {
    @Id
    private String userId;
    @PrimaryKeyJoinColumn
    private User user;
    @Basic
    private String firstName;
    @Basic
    private String lastName;
    // ...
    public String getEmail() {
        return user.getEmail();
    }
}
So I decided to join the entire entity and delegate the work to it.
As far as I understand, it is impossible to use @JoinColumn together with @Column like this:
@OneToOne
@JoinColumn(name = "userId", insertable = false, updatable = false)
@Column(name = "email")
private String email;
I am also not sure about using @SecondaryTable, as it seems to be designed for a different purpose.
Is there a better approach for getting a foreign entity's field using JPA mappings?
JPA Backend: EclipseLink 2.6.2
That's not really what JPA was designed to do. Getting the email by just calling user.getEmail() is the cleanest option you have.
You shouldn't worry too much about loading the entire user; the way I see it, it's a single join, and JPA should do it for you. The performance impact should be minimal. (You can simply not expose the internal user object, so as not to impact your object design too much. When using JPA, you're always limiting your OO design options, though.)
If you were using Hibernate, the story would be different. Then you could use the @Formula annotation. It would not be more performant, though. EclipseLink has nothing like it.