I want to design a system that is used by several customers. I need to create a duplicate set of tables for every customer. For example, for a table Order, all order records for customerA go into table Order_A, and customerB's data go into table Order_B. I can tell the customers apart from the session, but how can I get Spring JPA to map the per-customer table data to Java objects?
I know two solutions, but neither is satisfactory.
Use MyBatis, because it supports loading SQL from an XML file and using parameters inside the SQL;
Use org.hibernate.EmptyInterceptor. This is the current implementation in my project. For every entity, I must define a subclass of it. It can modify the SQL before Hibernate executes it.
However, neither is elegant. I would prefer a better solution.
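The rewriting step that such an interceptor performs can be sketched in plain Java. This is only a naive illustration: the class name, the TENANT_TABLES set and the regex approach are assumptions, not taken from the project described above, and hooking it into Hibernate is left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TenantTableRewriter {
    // Hypothetical list of base table names that exist once per customer.
    private static final Set<String> TENANT_TABLES =
            new HashSet<>(Arrays.asList("Order", "Invoice"));

    // Matches whole identifiers, so "Order" does not match inside "OrderItem".
    private static final Pattern IDENTIFIER =
            Pattern.compile("\\b[A-Za-z_][A-Za-z0-9_]*\\b");

    /** Appends "_<customer>" to every per-customer table name found in the SQL. */
    static String rewrite(String sql, String customer) {
        // Only allow simple suffixes, since the value ends up inside the SQL text.
        if (!customer.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("Unexpected customer suffix: " + customer);
        }
        Matcher m = IDENTIFIER.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String word = m.group();
            String replacement = TENANT_TABLES.contains(word) ? word + "_" + customer : word;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewrite("select o.id from Order o where o.id = ?", "A"));
    }
}
```

The match is case-sensitive and knows nothing about SQL syntax, so a string literal containing the word Order, or a capitalized "Order by", would be rewritten as well. A real implementation would need something more robust than a regex.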
I am reading data from a table using Spring JPA.
This entity has one-to-many relationships to six other tables.
All the tables together have 20,000 records in them.
I am using the query below to fetch data from the DB.
SELECT * FROM A WHERE ID IN (SELECT ID FROM B WHERE COL1 = '?')
Table A has relationships to six other tables.
Spring JPA takes around 30 seconds to read this data from the DB.
Any ideas on how to improve the data fetch time here?
I am using native queries here, and I am looking for a query rewrite that will optimize the data fetch time.
Please suggest. Thanks.
You might need to consider the following to identify the root cause:
Check whether you are running into the N+1 query problem. Your query might trigger additional queries for every associated table, one per row returned by the main query. You can check this by setting spring.jpa.show-sql=true
If you do see the N+1 problem, you need to set an appropriate FetchMode; see https://www.baeldung.com/hibernate-fetchmode for a detailed explanation of the different FetchModes.
If it is not an N+1 problem, check the performance of the generated queries with the EXPLAIN command. An IN clause on non-indexed columns usually hurts performance.
So set spring.jpa.show-sql=true, check which queries are generated and run, then debug and optimize your code or query.
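To get a feel for why N+1 matters with six associations, here is a rough back-of-the-envelope sketch. The figure of 1,000 parent rows is made up for illustration; the real numbers come from counting the statements in the show-sql output.

```java
class NPlusOneEstimate {
    /** Queries issued when every association of every parent row is loaded one by one. */
    static long lazyOneByOne(int parentRows, int associations) {
        return 1L + (long) parentRows * associations;
    }

    /** Queries issued when each association is loaded once for all parent rows. */
    static long onePerAssociation(int associations) {
        return 1L + associations;
    }

    public static void main(String[] args) {
        int parentRows = 1000;   // hypothetical number of rows returned from A
        int associations = 6;    // from the question: A relates to 6 other tables
        System.out.println("one by one:          " + lazyOneByOne(parentRows, associations));
        System.out.println("once per association: " + onePerAssociation(associations));
    }
}
```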
I have problems fully loading a very complex object from the DB in a reasonable time and with a reasonable number of queries.
My object has a lot of embedded entities, each entity has references to other entities, those entities reference yet others, and so on (the nesting level is 6).
So, I've created an example to demonstrate what I want:
https://github.com/gladorange/hibernate-lazy-loading
I have User.
User has @OneToMany collections of favorite Oranges, Apples, Grapevines and Peaches. Each Grapevine has a @OneToMany collection of Grapes. Each fruit is another entity with just one String field.
I'm creating a user with 30 favorite fruits of each type, and each grapevine has 10 grapes. So in total I have 421 entities in the DB: 30*4 fruits, 10*30 grapes and one user.
And what I want: I want to load them using no more than 6 SQL queries.
And each query shouldn't produce a big result set (for this example, big means a result set with more than 200 records).
My ideal solution will be the following:
6 requests. The first request returns information about the user, and the size of the result set is 1.
The second request returns information about the Apples for this user, and the size of the result set is 30.
The third, fourth and fifth requests return the same as the second (with result set size = 30), but for Grapevines, Oranges and Peaches.
The sixth request returns the Grapes for ALL grapevines.
This is very simple in SQL world, but I can't achieve such with JPA (Hibernate).
I tried the following approaches:
Use a fetch join, like from User u join fetch u.oranges .... This is awful. The result set is 30*30*30*30 rows and the execution time is 10 seconds. Number of requests = 3. I tried it without grapes; with grapes the result set would be 10 times larger.
Just use lazy loading. This gives the best result in this example (with @Fetch(FetchMode.SUBSELECT) for grapes). But in that case I need to manually iterate over each collection of elements. Also, subselect fetch is too global a setting, so I would like something that works at query level. Result set and time are near ideal: 6 queries and 43 ms.
Loading with an entity graph. The same as fetch join, but it also makes a request for every grape to get its grapevine. The resulting time is better (6 seconds), but still awful. Number of requests > 30.
I tried to cheat JPA with "manual" loading of entities in separate query. Like:
SELECT u FROM User u WHERE u.id = 1;
SELECT a FROM Apple a WHERE a.user_id = 1;
This is a little bit worse than lazy loading, since it requires two queries for each collection: the first query loads the entities manually (I have full control over this query, including loading associated entities), and the second query lazy-loads the same entities again (this one is executed automatically by Hibernate).
Execution time is 52 ms, number of queries = 10 (1 for user, 1 for grapes, 4*2 for the fruit collections).
Actually, the "manual" solution in combination with SUBSELECT fetch allows me to use "simple" fetch joins to load the necessary entities in one query (like @OneToOne entities), so I'm going to use it. But I don't like that I have to perform two queries to load a collection.
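For reference, the row counts behind these numbers can be worked out as follows. This is a small sketch using only the figures from the example (30 fruits per type, 4 fruit types, 10 grapes per grapevine); the class and method names are made up.

```java
class ResultSetSizes {
    static final int FRUITS_PER_TYPE = 30;  // favorites in each collection
    static final int FRUIT_TYPES = 4;       // oranges, apples, grapevines, peaches
    static final int GRAPES_PER_VINE = 10;

    /** Rows returned by a single query that join-fetches all four collections. */
    static long joinFetchRows(boolean withGrapes) {
        long rows = 1;
        for (int i = 0; i < FRUIT_TYPES; i++) {
            rows *= FRUITS_PER_TYPE;        // each extra join multiplies the row count
        }
        return withGrapes ? rows * GRAPES_PER_VINE : rows;
    }

    /** Total rows returned by the six separate queries of the ideal plan. */
    static long separateQueryRows() {
        long user = 1;
        long fruits = (long) FRUIT_TYPES * FRUITS_PER_TYPE;
        long grapes = (long) FRUITS_PER_TYPE * GRAPES_PER_VINE;
        return user + fruits + grapes;
    }

    public static void main(String[] args) {
        System.out.println("join fetch, no grapes:   " + joinFetchRows(false));
        System.out.println("join fetch, with grapes: " + joinFetchRows(true));
        System.out.println("six separate queries:    " + separateQueryRows());
    }
}
```

The join fetch multiplies the collection sizes (810,000 rows, or 8,100,000 with grapes), while separate queries only add them up (421 rows).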
Any suggestions?
I usually cover 99% of such use cases by using batch fetching for both entities and collections. If you process the fetched entities in the same transaction/session in which you read them, then there is nothing additional that you need to do: just navigate to the associations needed by the processing logic, and the generated queries will be close to optimal. If you want to return the fetched entities as detached, then you initialize the associations manually:
User user = entityManager.find(User.class, userId);
Hibernate.initialize(user.getOranges());
Hibernate.initialize(user.getApples());
Hibernate.initialize(user.getGrapevines());
Hibernate.initialize(user.getPeaches());
user.getGrapevines().forEach(grapevine -> Hibernate.initialize(grapevine.getGrapes()));
Note that the last command will not actually execute a query for each grapevine, as multiple grapes collections (up to the specified @BatchSize) are initialized when you initialize the first one. You simply iterate all of them to make sure all are initialized.
This technique resembles your manual approach but is more efficient (queries are not repeated for each collection), and is more readable and maintainable in my opinion (you just call Hibernate.initialize instead of manually writing the same query that Hibernate generates automatically).
I'm going to suggest yet another option on how to lazily fetch collections of Grapes in Grapevine:
@OneToMany
@BatchSize(size = 30)
private List<Grape> grapes = new ArrayList<>();
Instead of doing a sub-select, this one uses in (?, ?, etc) to fetch many collections of Grapes at once. Grapevine IDs are passed in place of the ? placeholders. This is as opposed to querying one List<Grape> collection at a time.
That's just yet another technique to your arsenal.
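The shape of those batched queries can be sketched in plain Java. This only illustrates the idea of splitting owner IDs into IN lists; it is not Hibernate's actual SQL generation, which may choose different chunk sizes, and the table and column names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BatchInSketch {
    /** Splits owner IDs into chunks of at most batchSize, one chunk per query. */
    static List<List<Long>> chunks(List<Long> ownerIds, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> result = new ArrayList<>();
        for (int from = 0; from < ownerIds.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ownerIds.size());
            result.add(new ArrayList<>(ownerIds.subList(from, to)));
        }
        return result;
    }

    /** Renders one batched query; "grape" and "grapevine_id" are hypothetical names. */
    static String sqlFor(int idCount) {
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "select * from grape where grapevine_id in (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<Long> grapevineIds = new ArrayList<>();
        for (long id = 1; id <= 30; id++) {
            grapevineIds.add(id);
        }
        for (List<Long> chunk : chunks(grapevineIds, 10)) {
            System.out.println(sqlFor(chunk.size()) + "  <- " + chunk);
        }
    }
}
```

With 30 grapevines and a batch size of 30, all grape collections fit into a single IN list; with a batch size of 10 the same IDs would be spread over three queries.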
I do not quite understand your demands here. It seems to me you want Hibernate to do something that it's not designed to do, and when it can't, you want a hack-solution that is far from optimal. Why not loosen the restrictions and get something that works? Why do you even have these restrictions in the first place?
Some general pointers:
When using Hibernate/JPA, you do not control the queries. You are not supposed to either (with a few exceptions). How many queries, the order they are executed in, etc, is pretty much beyond your control. If you want complete control of your queries, just skip JPA and use JDBC instead (Spring JDBC for instance.)
Understanding lazy-loading is key to making decisions in this type of situation. Lazy-loaded relations are not fetched when getting the owning entity; instead Hibernate goes back to the database and gets them when they are actually used. This means that lazy-loading pays off if you don't use the attribute every time, but has a penalty the times you actually use it. (Fetch join is used for eager-fetching a lazy relation. Not really meant for use with regular load from the database.)
Query optimization using Hibernate should not be your first line of action. Always start with your database. Is it modelled correctly, with primary keys and foreign keys, normal forms, etc? Do you have search indexes in the proper places (typically on foreign keys)?
Testing for performance on a very limited dataset probably won't give the best results. There will probably be overhead with connections, etc, that is larger than the time spent actually running the queries. Also, there might be random hiccups that cost a few milliseconds, which can make the result misleading.
Small tip from looking at your code: Never provide setters for collections in entities. If actually invoked within a transaction, Hibernate will throw an exception.
tryManualLoading probably does more than you think. First, it fetches the user (with lazy loading), then it fetches each of the fruits, then it fetches the fruits again through lazy-loading. (Unless Hibernate understands that the queries will be the same as when lazy loading.)
You don't actually have to loop through the entire collection in order to trigger lazy-loading. You can call user.getOranges().size() or Hibernate.initialize(user.getOranges()). For the grapevines you would have to iterate to initialize all the grapes, though.
With proper database design, and lazy-loading in the correct places, there shouldn't be a need for anything other than:
em.find(User.class, userId);
And then maybe a join fetch query if a lazy load takes a lot of time.
In my experience, the most important factor for speeding up Hibernate is search indexes in the database.
I'm writing this on the fly on my phone, so forgive the crappy code samples.
I have entities with a many-to-many relationship:
@JoinTable(name = "foo", joinColumns = @JoinColumn(name = "..."), inverseJoinColumns = @JoinColumn(name = "..."))
@ManyToMany
List list = new ArrayList();
I want their data to be retrieved in a paginated way.
I know about setFirstResult and setMaxResults. Is there a way to use this with the mapping? As in, I retrieve the object and get the list filled with contents equal to the amount of records for a single page, with the appropriate offset.
I guess I'm just unclear on the best way to do this. I could just manually use Hibernate criteria to get the effect, but I feel that's bypassing the API. I have this mapping, and I want to see if there's a way to use it in a paginated way.
PS. If this is impractical, just say so. Also, if it is, can I still use the mapping to add new entries to the join table? As in, if the entity is already persisted in the DB but I haven't fetched the many-to-many list, can I add something new to it so that, when it's persisted with cascade all, the new entry is added to the join table without clearing the other entries?
The type of the relationship between entities that are part of your query isn't that important. There are a couple of ways to tackle this.
If your database supports the LIMIT keyword in its queries, you can use it to fetch the data in pages, assuming you sort your data. Note that if your data changes while your user is navigating between pages, you might see some duplicates or miss some records. You'll also be stuck with a rewrite if you move to a database that doesn't have the LIMIT keyword.
If you need to freeze the data at the point of the original query, you need to use a third-party framework or write your own code to fetch a list of IDs for your query, then split up that list and fetch by ID in subsets for pagination. This is more reliable and can be made to work for any database.
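The second approach can be sketched roughly like this. It is a minimal illustration that assumes the ID list has already been fetched once, in display order; the class name is made up, and the per-page "fetch by ID" query is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class IdSnapshotPager {
    private final List<Long> ids;   // IDs captured once, in display order
    private final int pageSize;

    IdSnapshotPager(List<Long> ids, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        // Copy the list so later changes by the caller do not affect the snapshot.
        this.ids = Collections.unmodifiableList(new ArrayList<>(ids));
        this.pageSize = pageSize;
    }

    int pageCount() {
        return (ids.size() + pageSize - 1) / pageSize;
    }

    /** IDs for a zero-based page index; empty if the page is out of range. */
    List<Long> page(int pageIndex) {
        long from = (long) pageIndex * pageSize;
        if (pageIndex < 0 || from >= ids.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, ids.size());
        return ids.subList((int) from, to);
    }

    public static void main(String[] args) {
        IdSnapshotPager pager = new IdSnapshotPager(Arrays.asList(11L, 12L, 13L, 14L, 15L, 16L, 17L), 3);
        for (int i = 0; i < pager.pageCount(); i++) {
            // Each page of IDs would then be loaded with a "where id in (...)" query.
            System.out.println("page " + i + ": " + pager.page(i));
        }
    }
}
```

Because the ID list is fixed at the time of the first query, rows inserted later do not shift the pages, although rows deleted in the meantime would simply be missing when a page is loaded.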
Displaytag is a data paging framework I've used and that I therefore can tell you works well for large datasets. It's also one of the older solutions for this problem and is not part of an extended framework.
http://displaytag.sourceforge.net/11/tut_externalSortAndPage.html
Table sorter is another one I came across. This one uses JQuery and fetches the entire data set in one query, so strictly speaking it doesn't meet your "fetches the data in a paginated way" criteria. (This might not be appropriate for large sets).
http://tablesorter.com/docs/
This tutorial might be helpful:
http://theopentutorials.com/examples/java-ee/jsp/pagination-in-servlet-and-jsp/
If you're already using a framework take a look at whether that framework has tackled pagination:
Spring MVC provides a data pager
http://blog.fawnanddoug.com/2012/05/pagination-with-spring-mvc-spring-data.html
GWT provides a data pager:
http://www.gwtproject.org/javadoc/latest/com/google/gwt/user/cellview/client/SimplePager.html
The following references might be helpful too:
JDBC Pagination
which also points to:
http://java.avdiel.com/Tutorials/JDBCPaging.html
Let's say I have about half a million records in table A. Of course table A is joined with tables B, C, etc. Now I have to fetch the entities from table A that meet my criteria. My criteria consist of about 20-30 rules, e.g. the name in table A has to be like 'something', or the date from table B for the record joined to the table A record with ID=1 should be earlier than today. I see three solutions:
Write a native query and put in the parameters from my criteria. But in this case I will join so many tables that it doesn't seem like a good approach to me.
Fetch all records from table A and then check every rule for each record in Java. But this seems to me the worst possible solution.
Use JPA Criteria. But to be honest I do not know how efficient it is with so many records and so many joined tables. What's more, it seems to me that working with Criteria can be a little irritating when I have so many rules to match.
Maybe there is another (better) solution to my problem that I cannot see right now. I should add that I need the fetched entities stored in a Java collection, because once they are all fetched I have to work with them (e.g. generate a report or make some updates in the DB based on this information).
I hope I have described my problem clearly, and I will be thankful for any tip on how to optimize such a query.
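For what it's worth, the first option does not have to mean one hand-written query per rule combination. Below is a minimal sketch of assembling the WHERE clause and its parameter list from whichever rules are active. The builder class, the join and the column names are all hypothetical; the condition strings are assumed to be constants in the code, never user input.

```java
import java.util.ArrayList;
import java.util.List;

class RuleQueryBuilder {
    private final String baseSelect;
    private final List<String> conditions = new ArrayList<>();
    private final List<Object> parameters = new ArrayList<>();

    RuleQueryBuilder(String baseSelect) {
        this.baseSelect = baseSelect;
    }

    /** Adds a condition with one "?" placeholder, but only when a value was supplied. */
    RuleQueryBuilder addIfPresent(String conditionWithPlaceholder, Object value) {
        if (value != null) {
            conditions.add(conditionWithPlaceholder);
            parameters.add(value);
        }
        return this;
    }

    /** The final SQL text; values are never concatenated into it. */
    String sql() {
        if (conditions.isEmpty()) {
            return baseSelect;
        }
        return baseSelect + " where " + String.join(" and ", conditions);
    }

    /** Parameter values in the same order as the placeholders in sql(). */
    List<Object> parameters() {
        return parameters;
    }

    public static void main(String[] args) {
        RuleQueryBuilder query = new RuleQueryBuilder("select a.* from A a join B b on b.a_id = a.id")
                .addIfPresent("a.name like ?", "%something%")
                .addIfPresent("b.some_date < ?", null);   // rule not active, so it is skipped
        System.out.println(query.sql());
        System.out.println(query.parameters());
    }
}
```

This way the database still does the filtering, and only the rules that are actually in use end up in the query.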