I'm making the switch away from ORMs in Java, and I was wondering about the best way of dealing with one-to-many and many-to-one relationships in a non-ORM setting.
In my Customer.java class I have:
private Long id;
private String name;
private Date dob;
//About 10 more fields
private List<Pet> pets;
In Pet.java I have:
private Long id;
private String name;
private Customer owner;
My database table for Pet looks like this:
id BIGSERIAL PRIMARY KEY,
name VARCHAR(20),
owner_id BIGINT REFERENCES...
Now I realize that if I run a query that joins the two tables, I get a "flat" data structure returned which contains the fields for both Customer and Pet as well as any foreign keys. What is the common/most efficient way to treat the data in this scenario?
a. Rebuild the object graph manually by calling customer.setName(resultSet.getString("name"))...?
b. Use the returned data as is by converting it to a Map<String, Object>?
The data flow is: Data is read from the database -> rendered to JSON for use by an AngularJS front end -> modified data is sent back to the server for validation -> domain logic applied -> saved to database.
If you want to read both Customer and Pet in a single query for better performance, you can do something like this:
List<Customer> customers = new ArrayList<>();
String sql = "SELECT c.id AS cust_id" +
        ", c.name AS cust_name" +
        ", c.dob AS cust_dob" +
        ", p.id AS pet_id" +
        ", p.name AS pet_name" +
        " FROM Customer c" +
        " LEFT JOIN Pet p ON p.owner_id = c.id" +
        " WHERE c.name LIKE ?" +
        " ORDER BY c.id";
try (PreparedStatement stmt = conn.prepareStatement(sql)) {
    stmt.setString(1, "%DOE%");
    try (ResultSet rs = stmt.executeQuery()) {
        Customer customer = null;
        while (rs.next()) {
            long custId = rs.getLong("cust_id");
            // rows are ordered by customer id, so a new id means a new customer
            if (customer == null || customer.getId() != custId) {
                customer = new Customer();
                customer.setId(custId);
                customer.setName(rs.getString("cust_name"));
                customer.setDob(rs.getDate("cust_dob"));
                customers.add(customer);
            }
            long petId = rs.getLong("pet_id");
            // LEFT JOIN: pet columns are NULL for customers without pets
            if (!rs.wasNull()) {
                Pet pet = new Pet();
                pet.setId(petId);
                pet.setName(rs.getString("pet_name"));
                pet.setOwner(customer);
                customer.addPet(pet);
            }
        }
    }
}
The best options at this time are:
Spring JDBC (it offers ORM-like conveniences such as bean-to-object mapping)
iBatis (now MyBatis): it lets you write SQL queries manually; although it is an ORM, it is a thin layer
Write your own DAO layer implementation.
In all these cases you write your own SQL queries, and mostly they will result in join queries. By the way, the example you have given does not contain nested objects.
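For a feel of the Spring JDBC option, here is a minimal sketch against the Customer/Pet schema from the question (JdbcTemplate and BeanPropertyRowMapper are Spring classes; the setPets/getId accessors are assumed to exist on the beans):
JdbcTemplate jdbc = new JdbcTemplate(dataSource);
// BeanPropertyRowMapper maps columns to same-named bean properties
List<Customer> customers = jdbc.query(
        "SELECT id, name, dob FROM Customer WHERE name LIKE ?",
        new BeanPropertyRowMapper<>(Customer.class),
        "%DOE%");
// one extra query per customer - fine for short lists, otherwise
// prefer the single-join approach shown above
for (Customer c : customers) {
    c.setPets(jdbc.query(
            "SELECT id, name FROM Pet WHERE owner_id = ?",
            new BeanPropertyRowMapper<>(Pet.class),
            c.getId()));
}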
I have these entities (this is an example because I can't share the real entity names):
@Entity
public class User { @Id private BigDecimal id; private String name; private Color favouriteColor; }
@Entity
public class Color { @Id private Long colorId; private String colorName; }
In the tables I have this data:
USER
ID|NAME|FavColor
1 |John| 1
2 |Sarah| 2
3 |Mike| 1
COLOR
1|Red
2|Blue
Now I want to make a query that recovers all my user data without selecting the Color entity, only its ids.
#Query("new myDto(u.iduser,u.username,u.favcolor) from user u where favcolor in :listcolors")
This produces a query over the two tables; I want a single query, because I don't need the Color entities, only the ids.
--
Another option I am testing is an implementation with a native query, like this:
final List<MyDTO> result = new ArrayList<>();
Query q = entityManager.createNativeQuery("SELECT " +
        " USER_ID, " +
        " USER_NAME, " +
        " FAV_COLOR " +
        "FROM USER " +
        "WHERE FAV_COLOR IN (?)");
q.setParameter(1, colors.toString().replace("[", "").replace("]", ""));
final List<Object[]> resultList = q.getResultList();
for (Object[] objects : resultList) {
    MyDTO dto = new MyDTO();
    dto.userId(objects[0] != null ? ((BigDecimal) objects[0]).longValue() : null);
    dto.userName((String) objects[1]);
    dto.favColor(objects[2] != null ? ((BigDecimal) objects[2]).longValue() : null);
    result.add(dto);
}
return result;
In this case, I am getting an error (ORA-01722: invalid number). I don't know what to try next. Any ideas? Thanks.
I am guessing you have issues with the generated SQL and your use of the inner join: when you reference u.favcolor in the select clause, you are telling JPA to perform an inner join from User to Color based on the favcolor relationship. As favcolor is a Color reference, you are going to get the full Color row, whereas your native query implies you just want the foreign key value. If all you want is the FK/ID value from Color, the query should be:
"SELECT new myDto(u.iduser, u.username, color.id) FROM user u join u.favcolor color WHERE color.id in :listcolors"
This still might perform an inner join from user to color, but it should be in a single statement.
If you want to ensure you avoid the join:
Use EclipseLink's COLUMN JPQL extension to access the foreign key column directly. Something like:
"SELECT new myDto(u.iduser, u.username, COLUMN('FAV_COLOR', u) FROM user u WHERE COLUMN('FAV_COLOR', u) in :listcolors"
Use EclipseLink native query key functionality to access the "FAV_COLOR" foreign key column in the USER table directly for your JPQL queries. This requires a descriptor customizer to access, but allows you to use the foreign key value in JPQL queries directly without having to map it, and without the COLUMN mechanism tying your JPQL queries to a particular database table detail. This would allow a query of the form:
"SELECT new myDto(u.iduser, u.username, u.favColorVal FROM user u join u.favcolor color WHERE u.favColorVal in :listcolors"
Just map the FAV_COLOR as a basic mapping, in addition to the existing favColor reference mapping (or replacing it if you want):
@Basic
@Column(name = "FAV_COLOR", updatable = false, insertable = false)
BigDecimal favColorId;
This then allows you to use the query "SELECT new myDto(u.iduser, u.username, u.favColorId) FROM user u WHERE u.favColorId in :listcolors" to the same effect, but you can also just return the User instance (marking favColor as lazy and not serializable), as it will have the same data anyway.
In the same code that I used in this question, I would like to obtain the "composite ManyToMany POJO"s with their "related objects list" sorted by a number column in the junction table.
My DAO's SQL STATEMENT
#Query("SELECT * FROM teacher " +
"INNER JOIN teacherscourses AS tc ON teacher.t_id = tc.teacher_id " +
"INNER JOIN course AS c ON c.c_id = tc.course_id " +
"WHERE tc.course_id = :course_id " +
"ORDER BY teacher.t_id ASC, tc.course_order ASC"
)
public abstract List<TeacherWithCourses> getTeachersByCourseId(short course_id);
The generated DAO method obtains the list of TeacherWithCourses objects as expected, and these objects' courses list property gets the related objects as expected; but the ORDER BY clause doesn't seem to affect how the courses list is sorted at all. I expected each internal list in these objects (List<Course> courses) to be sorted according to the junction table's tc.course_order number, but the obtained objects seem to come sorted arbitrarily.
The "composite" POJO
public class TeacherWithCourses implements Serializable {
    @Embedded public Teacher teacher;
    @Relation(
            parentColumn = "t_id",
            entity = Course.class,
            entityColumn = "c_id",
            associateBy = @Junction(
                    value = TeachersCourses.class,
                    parentColumn = "teacher_id",
                    entityColumn = "course_id"
            )
    )
    public List<Course> courses;
}
Is this kind of sorting possible through SQL query, or must I implement it some other way?
[EDIT] The Junction Entity
I store the ordering criteria inside an extra column here:
@Entity(primaryKeys = {"teacher_id", "course_id"})
public class TeachersCourses implements Serializable {
    @ColumnInfo(name = "teacher_id")
    public int teacherId;
    @ColumnInfo(name = "course_id")
    public short courseId;
    @ColumnInfo(index = true, name = "course_order")
    public short courseOrder;
}
but the obtained objects seem to come sorted arbitrarily.
This is because of how @Relation works. It does not include/consider the child objects from the query; it uses the query just to extract and build the parent. It then gets, irrespective of the supplied query, ALL of the children for the parent via a query that is built according to the attributes of the @Relation.
As such, to filter and/or order the children you need to supplement @Relation by overwriting the list/array of children.
For convenience, and based upon my previous answer (so using AltCourse with changed column names), here's an example of one way that is easy to implement (but not the most efficient).
First the additional @Dao components :-
#Query("SELECT altcourse.* FROM altcourse " +
"JOIN teacherscourses ON teacherscourses.course_id = altcourse.courseid " +
"JOIN teacher ON teacher.id = teacherscourses.teacher_id " +
"WHERE teacher.id=:id ORDER BY teacherscourses.course_order ASC")
public abstract List<AltCourse> getSortedCoursesForATeacher(int id);
#Transaction
public List<AltTeacherWithCourses> getAltTeachersByCourseIdSorted(short course_id) {
List<AltTeacherWithCourses> rv = getAltTeachersByCourseId(course_id);
for (AltTeacherWithCourses twc: rv) {
twc.courses = getSortedCoursesForATeacher(twc.teacher.id);
}
return rv;
}
obviously change anything starting with Alt accordingly
So a query that extracts the courses in the correct order, and
A method that gets the original teachersWithCourses where the Courses are not sorted as expected and replaces the list of courses with the sorted courses as extracted by the additional query.
Alt versions have been used in the above for testing, with the two tables containing the data shown in the screenshots (omitted here).
Running :-
for (Course c : dao.getAllCourses()) {
    for (AltTeacherWithCourses tbc : dao.getAltTeachersByCourseIdSorted(c.id)) {
        Log.d("TEACHER_SORTC", "Teacher is " + tbc.teacher.name + " Courses = " + tbc.courses.size());
        for (AltCourse course : tbc.courses) {
            Log.d("COURSE_SORTC", "\tCourse is " + course.coursename);
        }
    }
}
Results in :-
2021-11-11 15:26:01.559 D/TEACHER_SORTC: Teacher is Teacher1 Courses = 3
2021-11-11 15:26:01.559 D/COURSE_SORTC: Course is AltCourse3
2021-11-11 15:26:01.559 D/COURSE_SORTC: Course is AltCourse2
2021-11-11 15:26:01.559 D/COURSE_SORTC: Course is AltCourse1
2021-11-11 15:26:01.565 D/TEACHER_SORTC: Teacher is Teacher1 Courses = 3
2021-11-11 15:26:01.565 D/COURSE_SORTC: Course is AltCourse3
2021-11-11 15:26:01.565 D/COURSE_SORTC: Course is AltCourse2
2021-11-11 15:26:01.565 D/COURSE_SORTC: Course is AltCourse1
2021-11-11 15:26:01.568 D/TEACHER_SORTC: Teacher is Teacher1 Courses = 3
2021-11-11 15:26:01.568 D/COURSE_SORTC: Course is AltCourse3
2021-11-11 15:26:01.569 D/COURSE_SORTC: Course is AltCourse2
2021-11-11 15:26:01.569 D/COURSE_SORTC: Course is AltCourse1
2021-11-11 15:26:01.569 D/TEACHER_SORTC: Teacher is Teacher2 Courses = 1
2021-11-11 15:26:01.569 D/COURSE_SORTC: Course is AltCourse3
Additional
An alternative, more efficient (SQLite-wise) approach is to use a more complex query along with a POJO that uses @Embedded for Teacher and Course.
Thus you extract objects that each hold a Teacher and a Course, and then you build the TeachersWithCourses list from the extract.
This does not need the additional queries that are run when using @Relation, and thus there is no need for an @Transaction.
Note the example uses the original Teacher and Course without unique names, so it would have to be modified accordingly.
First the TeacherCourse POJO :-
class TeacherCourse {
    @Embedded
    Teacher teacher;
    @Embedded(prefix = "course")
    Course course;
}
The prefix is used to circumvent duplicate column names.
The query :-
#Query("WITH teacher_in_course AS (" +
"SELECT teacher.id " +
"FROM teacher " +
"JOIN teacherscourses ON teacher.id = teacherscourses.teacher_id " +
"WHERE course_id = :courseId" +
")" +
"SELECT course.id AS courseid, course.name as coursename, teacher.* " +
"FROM course " +
"JOIN teacherscourses ON course.id = teacherscourses.course_id " +
"JOIN teacher ON teacher.id = teacherscourses.teacher_id " +
"WHERE teacher.id IN (SELECT * FROM teacher_in_course) " +
"ORDER BY teacher.id ASC, course_order ASC" +
";")
public abstract List<TeacherCourse> getTeachersCoursesSortedFromCourseId(short courseId);
The CTE (Common Table Expression) teacher_in_course retrieves the list of teachers who have the specified courseId. This is used in the actual query to get those teachers and ALL courses for each of them, ordered accordingly. As such there is no need for @Transaction, as all the data is extracted in the single query.
However, a TeachersWithCourses list needs to be built from the list of TeacherCourse objects e.g. :-
public List<TeacherWithCourses> buildTeacherWithCoursesListFromTeacherCourseList(List<TeacherCourse> teacherCourseList) {
    ArrayList<TeacherWithCourses> rv = new ArrayList<>();
    boolean afterFirst = false;
    TeacherWithCourses currentTWC = new TeacherWithCourses();
    currentTWC.teacher = new Teacher(-1, ""); // sentinel so the first row starts a new teacher
    ArrayList<Course> currentCourseList = new ArrayList<>();
    for (TeacherCourse tc : teacherCourseList) {
        // rows arrive ordered by teacher, so a change of id closes the previous teacher
        if (currentTWC.teacher.id != tc.teacher.id) {
            if (afterFirst) {
                currentTWC.courses = currentCourseList;
                rv.add(currentTWC);
                currentCourseList = new ArrayList<>();
                currentTWC = new TeacherWithCourses();
            }
            currentTWC.teacher = tc.teacher;
        }
        currentCourseList.add(tc.course);
        afterFirst = true;
    }
    // flush the last teacher (if the input was not empty)
    if (afterFirst) {
        currentTWC.courses = currentCourseList;
        rv.add(currentTWC);
    }
    return rv;
}
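Putting the query and the builder together, usage could look like this (a sketch; courseId is whatever course you are filtering by):
List<TeacherWithCourses> teachers =
        buildTeacherWithCoursesListFromTeacherCourseList(
                dao.getTeachersCoursesSortedFromCourseId(courseId));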
From the looks of it, you have two options.
Here is an article explaining both methods: https://newbedev.com/using-room-s-relation-with-order-by
You can sort your List after it's been pulled from the database, using Collections.sort (see the sketch below). This will remove a lot of complexity from your query, and is probably the best option.
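A rough sketch of this option (note the assumption: the sort key must be available on Course itself, e.g. a copy of the junction table's course_order, which the original Course entity does not carry):
List<TeacherWithCourses> teachers = dao.getTeachersByCourseId(courseId);
for (TeacherWithCourses twc : teachers) {
    // courseOrder on Course is hypothetical here; sort by whatever
    // sortable field your Course actually exposes
    twc.courses.sort(Comparator.comparingInt(c -> c.courseOrder));
}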
or
You can split your query into 3 parts and manually create a List of TeacherWithCourses: a query pulls a List of Teacher, and then each teacher is used to pull their Courses in the order you desire. This is more queries, but they can be done under a single @Transaction, so there will be few felt repercussions on the database side. But be careful: this can create a lot of objects fast.
I have an issue with mapping retrieved data via JDBi3 using a PostgreSQL query in my DAO interface.
In my Dropwizard app I have a Book DTO class which has a Many-To-Many relation with the Author and Category DTO classes, and I have a problem with mapping the queried rows onto the BookDTO class. Here are the code snippets of the DTO classes:
class BookDTO {
    private Long bookId;
    // other fields are left for code brevity
    private List<Long> authors;
    private List<Long> categories;
    // empty constructor + constructor with all fields excluding Lists + getters + setters
}

class AuthorDTO {
    private Long authorId;
    // other fields are left for code brevity
    private List<Long> books;
    // empty constructor + constructor with all fields excluding List + getters + setters
}

class CategoryDTO {
    private Long categoryId;
    // other fields are left for code brevity
    private List<Long> books;
    // empty constructor + constructor with all fields excluding List + getters + setters
}
...and since I am using JDBi3 DAO interfaces for performing CRUD operations, this is what my method for querying all books in the database looks like:
@Transaction
@UseRowMapper(BookDTOACMapper.class)
@SqlQuery("SELECT book.book_id AS b_id, book.title, book.price, book.amount, book.is_deleted, author.author_id AS aut_id, category.category_id AS cat_id FROM book " +
        "LEFT JOIN author_book ON book.book_id = author_book.book_id " +
        "LEFT JOIN author ON author_book.author_id = author.author_id " +
        "LEFT JOIN category_book ON book.book_id = category_book.book_id " +
        "LEFT JOIN category ON category_book.category_id = category.category_id ORDER BY b_id ASC, aut_id ASC, cat_id ASC")
List<BookDTO> getAllBooks();
...and this is what the map method of the BookDTOACMapper class looks like:
public class BookDTOACMapper implements RowMapper<BookDTO> {
    @Override
    public BookDTO map(ResultSet rs, StatementContext ctx) throws SQLException {
        final long bookId = rs.getLong("b_id");
        // normally retrieving values by using appropriate rs.getXXX() methods
        Set<Long> authorIds = new HashSet<>();
        Set<Long> categoryIds = new HashSet<>();
        long authorId = rs.getLong("aut_id");
        if (authorId > 0) {
            authorIds.add(authorId);
        }
        long categoryId = rs.getLong("cat_id");
        if (categoryId > 0) {
            categoryIds.add(categoryId);
        }
        while (rs.next()) {
            if (rs.getLong("b_id") != bookId) {
                break;
            } else {
                authorId = rs.getLong("aut_id");
                if (authorId > 0) { authorIds.add(authorId); }
                categoryId = rs.getLong("cat_id");
                if (categoryId > 0) { categoryIds.add(categoryId); }
            }
        }
        final List<Long> authorIdsList = new ArrayList<>(authorIds);
        final List<Long> categoryIdsList = new ArrayList<>(categoryIds);
        return new BookDTO(bookId, title, price, amount, is_deleted, authorIdsList, categoryIdsList);
    }
}
The problem I encounter is that invoking my GET method (defined in a Resource class, which calls the getAllBooks() method from the BookDAO class) displays inconsistent results, while the query itself returns proper results.
Many questions that I've managed to find on Stack Overflow, in the official JDBi3 API docs and on Google Groups concern the One-To-Many relationship and use the @UseRowReducer annotation with a class that implements LinkedHashMapRowReducer<TypeOfEntityIdentifier, EntityName>, but for this case I could not find a way to implement it. Any example/suggestion is welcome. :)
Thank you in advance.
Versions of used tools:
Dropwizard framework 1.3.8
PostgreSQL 11.7
Java8
This will be too long for a comment:
This is basically a debugging question. Why?
while (rs.next()) {
    if (rs.getLong("b_id") != bookId) {
        break;
    } else {
The first if after the while is eating the row following the current book's rows (the first row of the next book). You are skipping the processing there (putting the data into the Java objects) for that row's bookId, authorId, etc. That's why you get
inconsistent results while the query itself returns proper results.
So you need to revisit how you process the data. I see two paths:
Revisit the logic of the processing loop to store the data when stopping the processing for a given bookId. It is possible to achieve this with scrollable ResultSets - i.e. request a scrollable ResultSet and, before the break, call rs.previous(). On the next call to the row mapper the processing will then start from the correct row in the result set (see the sketch after this list).
Use the power of SQL/PostgreSQL and do it properly: https://dba.stackexchange.com/questions/173831/convert-right-side-of-join-of-many-to-many-into-array - aggregate and shape the data in the database. The database is the best tool for this job.
Also take your time and check the other answers of https://dba.stackexchange.com/users/3684/erwin-brandstetter. They give invaluable insights into SQL and PostgreSQL.
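A minimal plain-JDBC sketch of the first path, mirroring the structure of the map(..) method above (this assumes the statement was created with ResultSet.TYPE_SCROLL_INSENSITIVE so that rs.previous() is available; wiring a scrollable ResultSet through JDBi3 needs extra configuration and is not shown):
// rs is positioned on the first row of the current book when map() is called
long bookId = rs.getLong("b_id");
Set<Long> authorIds = new HashSet<>();
authorIds.add(rs.getLong("aut_id"));
while (rs.next()) {
    if (rs.getLong("b_id") != bookId) {
        rs.previous(); // undo the read: the next map() call starts on this row
        break;
    }
    authorIds.add(rs.getLong("aut_id"));
}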
As zloster mentioned in his answer, I've chosen the 2nd option (per this answer for Many-To-Many relationships), which was to edit the PostgreSQL query in the @SqlQuery annotation above the List<BookDTO> getAllBooks(); method. The query now uses the array_agg aggregate function in the SELECT statement to group the results into an ARRAY, and looks like this:
@UseRowMapper(BookDTOACMapper.class)
@SqlQuery("SELECT b.book_id AS b_id, b.title, b.price, b.amount, b.is_deleted, ARRAY_AGG(aut.author_id) AS aut_ids, ARRAY_AGG(cat.category_id) AS cat_ids " +
        "FROM book b " +
        "LEFT JOIN author_book ON author_book.book_id = b.book_id " +
        "LEFT JOIN author aut ON aut.author_id = author_book.author_id " +
        "LEFT JOIN category_book ON category_book.book_id = b.book_id " +
        "LEFT JOIN category cat ON cat.category_id = category_book.category_id " +
        "GROUP BY b_id " +
        "ORDER BY b_id ASC")
List<BookDTO> getAllBooks();
Therefore the map(..) method of the BookDTOACMapper class had to be edited, and now looks like this:
@Override
public BookDTO map(ResultSet rs, StatementContext ctx) throws SQLException {
    final long bookId = rs.getLong("b_id");
    String title = rs.getString("title");
    double price = rs.getDouble("price");
    int amount = rs.getInt("amount");
    boolean is_deleted = rs.getBoolean("is_deleted");
    Set<Long> authorIds = new HashSet<>();
    Set<Long> categoryIds = new HashSet<>();
    /* rs.getArray() retrieves a java.sql.Array; getArray() is then
       invoked on it, returning an array of Object(s) which is cast
       into an array of Long elements */
    Long[] autIds = (Long[]) (rs.getArray("aut_ids").getArray());
    Long[] catIds = (Long[]) (rs.getArray("cat_ids").getArray());
    Collections.addAll(authorIds, autIds);
    Collections.addAll(categoryIds, catIds);
    final List<Long> authorIdsList = new ArrayList<>(authorIds);
    final List<Long> categoryIdsList = new ArrayList<>(categoryIds);
    return new BookDTO(bookId, title, price, amount, is_deleted, authorIdsList, categoryIdsList);
}
Now all results are consistent; the query was verified in pgAdmin 4 (screenshot omitted).
I'm getting a warning in the server log: "firstResult/maxResults specified with collection fetch; applying in memory!". Everything is working fine, but I don't want this warning.
My code is
public employee find(int id) {
    return (employee) getEntityManager().createQuery(QUERY).setParameter("id", id).getSingleResult();
}
My query is
QUERY = "from employee as emp left join fetch emp.salary left join fetch emp.department where emp.id = :id"
Although you are getting valid results, the SQL query fetches all the data, and it's not as efficient as it should be.
So, you have two options.
Fixing the issue with two SQL queries that can fetch entities in read-write mode
The easiest way to fix this issue is to execute two queries:
1. The first query will fetch the root entity identifiers matching the provided filtering criteria.
2. The second query will use the previously extracted root entity identifiers to fetch the parent and the child entities.
This approach is very easy to implement and looks as follows:
List<Long> postIds = entityManager
    .createQuery(
        "select p.id " +
        "from Post p " +
        "where p.title like :titlePattern " +
        "order by p.createdOn", Long.class)
    .setParameter(
        "titlePattern",
        "High-Performance Java Persistence %"
    )
    .setMaxResults(5)
    .getResultList();

List<Post> posts = entityManager
    .createQuery(
        "select distinct p " +
        "from Post p " +
        "left join fetch p.comments " +
        "where p.id in (:postIds) " +
        "order by p.createdOn", Post.class)
    .setParameter("postIds", postIds)
    .setHint(
        "hibernate.query.passDistinctThrough",
        false
    )
    .getResultList();
Fixing the issue with one SQL query that can only fetch entities in read-only mode
The second approach is to use DENSE_RANK over the result set of parent and child entities that match our filtering criteria, and to restrict the output to the first N post entries only.
The SQL query can look as follows:
@NamedNativeQuery(
    name = "PostWithCommentByRank",
    query =
        "SELECT * " +
        "FROM ( " +
        "    SELECT *, dense_rank() OVER (ORDER BY \"p.created_on\", \"p.id\") rank " +
        "    FROM ( " +
        "        SELECT p.id AS \"p.id\", " +
        "               p.created_on AS \"p.created_on\", " +
        "               p.title AS \"p.title\", " +
        "               pc.id AS \"pc.id\", " +
        "               pc.created_on AS \"pc.created_on\", " +
        "               pc.review AS \"pc.review\", " +
        "               pc.post_id AS \"pc.post_id\" " +
        "        FROM post p " +
        "        LEFT JOIN post_comment pc ON p.id = pc.post_id " +
        "        WHERE p.title LIKE :titlePattern " +
        "        ORDER BY p.created_on " +
        "    ) p_pc " +
        ") p_pc_r " +
        "WHERE p_pc_r.rank <= :rank ",
    resultSetMapping = "PostWithCommentByRankMapping"
)
@SqlResultSetMapping(
    name = "PostWithCommentByRankMapping",
    entities = {
        @EntityResult(
            entityClass = Post.class,
            fields = {
                @FieldResult(name = "id", column = "p.id"),
                @FieldResult(name = "createdOn", column = "p.created_on"),
                @FieldResult(name = "title", column = "p.title")
            }
        ),
        @EntityResult(
            entityClass = PostComment.class,
            fields = {
                @FieldResult(name = "id", column = "pc.id"),
                @FieldResult(name = "createdOn", column = "pc.created_on"),
                @FieldResult(name = "review", column = "pc.review"),
                @FieldResult(name = "post", column = "pc.post_id")
            }
        )
    }
)
The @NamedNativeQuery fetches all Post entities matching the provided title along with their associated PostComment child entities. The DENSE_RANK window function is used to assign the rank for each Post and PostComment joined record so that we can later filter just the amount of Post records we are interested in fetching.
The @SqlResultSetMapping provides the mapping between the SQL-level column aliases and the JPA entity properties that need to be populated.
Now, we can execute the PostWithCommentByRank @NamedNativeQuery like this:
List<Post> posts = entityManager
    .createNamedQuery("PostWithCommentByRank")
    .setParameter(
        "titlePattern",
        "High-Performance Java Persistence %"
    )
    .setParameter(
        "rank",
        5
    )
    .unwrap(NativeQuery.class)
    .setResultTransformer(
        new DistinctPostResultTransformer(entityManager)
    )
    .getResultList();
Now, by default, a native SQL query like the PostWithCommentByRank one would fetch the Post and the PostComment in the same JDBC row, so we will end up with an Object[] containing both entities.
However, we want to transform the tabular Object[] array into a tree of parent-child entities, and for this reason, we need to use the Hibernate ResultTransformer.
The DistinctPostResultTransformer looks as follows:
public class DistinctPostResultTransformer
        extends BasicTransformerAdapter {

    private final EntityManager entityManager;

    public DistinctPostResultTransformer(
            EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    @Override
    public List transformList(
            List list) {
        Map<Serializable, Identifiable> identifiableMap =
            new LinkedHashMap<>(list.size());
        for (Object entityArray : list) {
            if (Object[].class.isAssignableFrom(entityArray.getClass())) {
                Post post = null;
                PostComment comment = null;
                Object[] tuples = (Object[]) entityArray;
                for (Object tuple : tuples) {
                    if (tuple instanceof Identifiable) {
                        entityManager.detach(tuple);
                        if (tuple instanceof Post) {
                            post = (Post) tuple;
                        } else if (tuple instanceof PostComment) {
                            comment = (PostComment) tuple;
                        } else {
                            throw new UnsupportedOperationException(
                                "Tuple " + tuple.getClass() + " is not supported!"
                            );
                        }
                    }
                }
                if (post != null) {
                    if (!identifiableMap.containsKey(post.getId())) {
                        identifiableMap.put(post.getId(), post);
                        post.setComments(new ArrayList<>());
                    }
                    if (comment != null) {
                        post.addComment(comment);
                    }
                }
            }
        }
        return new ArrayList<>(identifiableMap.values());
    }
}
The DistinctPostResultTransformer must detach the entities being fetched because we are overwriting the child collection and we don’t want that to be propagated as an entity state transition:
post.setComments(new ArrayList<>());
The reason for this warning is that when a fetch join is used, the order in the result set is defined only by the ID of the selected entity (and not by the join-fetched collection).
If this sorting in memory is causing problems, do not use firstResult/maxResults with JOIN FETCH.
To avoid this WARNING you have to change the call from getSingleResult() to
getResultList().get(0)
This warning tells you that Hibernate is performing in-memory Java pagination. This can cause high JVM memory consumption.
Since a developer can miss this warning, I contributed to Hibernate by adding a flag that allows throwing an exception instead of logging the warning (https://hibernate.atlassian.net/browse/HHH-9965).
The flag is hibernate.query.fail_on_pagination_over_collection_fetch.
I recommend everyone to enable it.
The flag is defined in org.hibernate.cfg.AvailableSettings :
/**
 * Raises an exception when in-memory pagination over collection fetch is about to be performed.
 * Disabled by default. Set to true to enable.
 *
 * @since 5.2.13
 */
String FAIL_ON_PAGINATION_OVER_COLLECTION_FETCH = "hibernate.query.fail_on_pagination_over_collection_fetch";
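For example, a minimal sketch of enabling the flag when bootstrapping plain JPA (the persistence-unit name "my-unit" is hypothetical; Spring Boot users would set the same key under spring.jpa.properties):
// pass the AvailableSettings key as a startup property
Map<String, Object> props = new HashMap<>();
props.put("hibernate.query.fail_on_pagination_over_collection_fetch", "true");
EntityManagerFactory emf =
        Persistence.createEntityManagerFactory("my-unit", props);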
The problem is that you will get a Cartesian product when doing the JOIN. The offset will cut your record set without checking whether you are still on the same root entity.
I guess the emp has many departments, i.e. a One-to-Many relationship. Hibernate will fetch many rows for this query, with the department records fetched alongside. So the order of the result set cannot be decided until it has actually fetched the results into memory. So the pagination will be done in memory.
If you do not want to fetch the departments with emp, but still want to do some query based on the department, you can achieve the result without the warning (and without the ordering in memory). For that you simply have to remove the "fetch" clause, so something like the following:
QUERY = "from employee as emp left join emp.salary sal left join emp.department dep where emp.id = :id and dep.name = 'testing' and sal.salary > 5000 "
As others pointed out, you should generally avoid using "JOIN FETCH" and firstResult/maxResults together.
If your query requires it, you can stream the result to eliminate the warning and avoid a potential OOM exception.
try (Stream<ENTITY> stream = em.createQuery(QUERY, ENTITY.class).getResultStream()) {
    ENTITY first = stream.findFirst().orElse(null); // equivalent to .getSingleResult()
}
// The returned Stream is an IO stream that needs to be closed manually.
I want to insert a list of objects into my db. In a special case I know that the primary key (not auto-generated) is not already there. Since I need to insert a big collection, save(Iterable<Obj> objects) is too slow.
Therefore I am considering a native query, as in: native insert query in hibernate + spring data.
That answer does not say how to insert a collection of objects. Is this possible?
#Query("insert into my_table (date, feature1, feature2, quantity) VALUES <I do not know what to add here>", nativeQuery = true)
void insert(List<Obj> objs);
Of course if you have a better solution overall, that's even better.
I ended up implementing my own repository. The performance of this is really good: 2s instead of the previous 35s to insert 50000 elements. The problem with this code is that it does not prevent SQL injection.
I also tried to build a query using setParameter(1, ...), but somehow JPA takes a long time doing that.
class ObjectRepositoryImpl implements DemandGroupSalesOfDayCustomRepository {

    private static final int INSERT_BATCH_SIZE = 50000;

    @Autowired
    private EntityManager entityManager;

    @Override
    public void blindInsert(List<SomeObject> objects) {
        // partition(...) is a static import (e.g. Guava's Lists.partition)
        partition(objects, INSERT_BATCH_SIZE).forEach(this::insertAll);
    }

    private void insertAll(List<SomeObject> objects) {
        // joining(...) is a static import of Collectors.joining
        String values = objects.stream().map(this::renderSqlForObj).collect(joining(","));
        String insertSQL = "INSERT INTO mytable (date, feature1, feature2, quantity) VALUES ";
        entityManager.createNativeQuery(insertSQL + values).executeUpdate();
        entityManager.flush();
        entityManager.clear();
    }

    private String renderSqlForObj(SomeObject obj) {
        return "('" + obj.getDate() + "','" +
                obj.getFeature1() + "','" +
                obj.getFeature2() + "'," +
                obj.getQuantity() + ")";
    }
}
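To keep the speed while closing the SQL-injection hole, a hedged alternative sketch is to drop down to JDBC batching with a PreparedStatement. This assumes Hibernate as the JPA provider (for Session.doWork) and reuses the SomeObject getters from above:
entityManager.unwrap(Session.class).doWork(connection -> {
    String sql = "INSERT INTO mytable (date, feature1, feature2, quantity) VALUES (?, ?, ?, ?)";
    try (PreparedStatement ps = connection.prepareStatement(sql)) {
        for (SomeObject obj : objects) {
            // parameters are bound, not concatenated, so no injection risk
            ps.setObject(1, obj.getDate());
            ps.setString(2, obj.getFeature1());
            ps.setString(3, obj.getFeature2());
            ps.setObject(4, obj.getQuantity());
            ps.addBatch();
        }
        ps.executeBatch();
    }
});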