I have a MariaDB database with two schemas: Store1 and Store2. Each schema contains several tables, one of which is "Products".
In my repository, I need to add the following query:
@Query("SELECT Store1.Products.Id, Store1.Products.Label FROM Store1.Products UNION SELECT Store2.Products.Id, Store2.Products.Label FROM Store2.Products")
public Page<Products> getAllProducts(Pageable pageable);
Of course, the query doesn't work as written; it's just an example of what I want to do.
I've briefly read about multi-tenancy, but I don't know whether it fits my use case, and it seems complex for such a simple case.
Is there a simpler way to run this query?
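One way to think about it is as a single native SQL statement that unions the two tables inside a derived table and pages over the result. The sketch below only builds that SQL string in plain Java; `UnionPageSql` and the alias `all_products` are hypothetical names, and the `Id` and `Label` columns are taken from the question. In Spring Data JPA a statement like this would normally go into `@Query(..., nativeQuery = true)` together with a `countQuery`, which is not shown here.

```java
import java.util.List;

// Hypothetical helper (not part of Spring Data): builds one native SQL
// statement that unions Products from several schemas and pages over it.
final class UnionPageSql {

    private UnionPageSql() {}

    static String build(List<String> schemas, int pageNumber, int pageSize) {
        if (schemas.isEmpty()) {
            throw new IllegalArgumentException("at least one schema is required");
        }
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("invalid page request");
        }
        StringBuilder sql = new StringBuilder("SELECT Id, Label FROM (");
        for (int i = 0; i < schemas.size(); i++) {
            String schema = schemas.get(i);
            // Schema names cannot be bound as parameters, so accept plain identifiers only.
            if (!schema.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("unexpected schema name: " + schema);
            }
            if (i > 0) {
                sql.append(" UNION ");
            }
            sql.append("SELECT Id, Label FROM ").append(schema).append(".Products");
        }
        // Zero-based page number, as in Spring's Pageable.
        long offset = (long) pageNumber * pageSize;
        sql.append(") AS all_products ORDER BY Id LIMIT ").append(pageSize)
           .append(" OFFSET ").append(offset);
        return sql.toString();
    }
}
```

Note that `UNION` removes duplicate rows, so a product with the same Id and Label in both stores would appear once; `UNION ALL` keeps both rows.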
Related
I want to design a system that is used by several customers. I need to create a duplicate set of tables for every customer. For example, I have a table Order; all order records for customerA are in table Order_A, and customerB's data is in table Order_B. I can tell the customers apart from the session, but how can I get Spring JPA to map the RDS table data to Java objects?
I know two solutions, but neither is satisfactory.
Use MyBatis, because it supports loading SQL from an XML file and using parameters inside the SQL;
Use org.hibernate.EmptyInterceptor. This is the current implementation in my project. For every entity, I must define a subclass of it. It can rewrite the SQL before Hibernate executes it.
However, neither is elegant. I would prefer a better solution.
Let's say I have about half a million records in table A. Of course, table A is joined with tables B, C, etc. Now I have to fetch the entities from table A that meet my criteria. My criteria consist of about 20-30 rules, e.g. the name in table A has to be like 'something', or the date in table B for the record joined to the table A record with ID=1 has to be earlier than today. I see three solutions:
Write a native query and pass in the parameters from my criteria. But in this case I would join so many tables that it doesn't seem like a good approach to me.
Fetch all records from table A and then check every rule for each record in Java. But this seems to me the worst possible solution.
Use JPA Criteria. But to be honest, I do not know how efficient they are with so many records and so many joined tables. What's more, working with Criteria seems like it could be a little irritating when I have so many rules to match.
Maybe there is another (better) solution to my problem, but I cannot see it now. I should add that I need the fetched entities stored in a Java collection, because once they are all fetched I have to work with them (e.g. generate a report or make some updates in the DB based on this information).
I hope I have described my problem clearly, and I will be thankful for any tip on how to optimize such a query.
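Whichever option is chosen, the rules are usually applied only when a value for them is present, so the query text and its parameters are assembled together. The following is a minimal plain-Java sketch of that idea, which the Criteria API expresses with `Predicate` objects. `RuleQueryBuilder` and the entity and field names in the usage note are hypothetical and only illustrate the shape.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: collects only the rules that have a value and
// keeps the named parameters alongside the generated WHERE clause.
final class RuleQueryBuilder {

    private final List<String> conditions = new ArrayList<>();
    private final Map<String, Object> params = new LinkedHashMap<>();

    // Adds the condition only when a value was supplied for it.
    RuleQueryBuilder addIfPresent(String condition, String paramName, Object value) {
        if (value != null) {
            conditions.add(condition);
            params.put(paramName, value);
        }
        return this;
    }

    // Appends the active conditions, joined with AND, to the base query.
    String toJpql(String baseQuery) {
        if (conditions.isEmpty()) {
            return baseQuery;
        }
        return baseQuery + " WHERE " + String.join(" AND ", conditions);
    }

    // Parameters to bind, in the order the rules were added.
    Map<String, Object> parameters() {
        return params;
    }
}
```

For example, calling `addIfPresent("a.name LIKE :name", "name", "%something%")` and then `toJpql("SELECT a FROM A a JOIN a.b b")` yields a query with a single `WHERE` condition, and rules whose value is `null` are skipped. Values are always bound as parameters rather than concatenated into the query text.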
I am stuck on an issue. I have three tables that are each associated with one table in a one-to-many relationship.
An employee may have one or more degrees.
An employee may have had one or more departments in the past.
An employee may have one or more jobs.
I am trying to fetch results using a named query in such a way that I get all the results from the Degree table and the Department table, but only 5 results from the Jobs table, because I want to apply pagination to the Jobs table.
But all these entities are held in the User table's entity as sets. Secondly, I don't want to change the mapping file, because the same files are used elsewhere and because of some architectural restrictions.
Otherwise, I could use the BatchSize annotation in the mapping file, which I am not willing to do.
The best approach is to write three queries:
userRepository.getDegrees(userId);
userRepository.getDepartments(userId);
userRepository.getJobs(userId, pageIndex);
Spring Data is very useful for pagination, and it also simplifies your data access code.
Hibernate cannot fetch multiple Lists in a single query, and even for Sets, you don't want to run a Cartesian product. So use separate queries instead of a single JPQL query.
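The paging arithmetic behind a call like `getJobs(userId, pageIndex)` can be sketched in plain Java. This is only an in-memory illustration with a hypothetical `JobPaging` helper and a page size of 5 taken from the question; in a real repository the same offset and limit would be pushed down to the database rather than applied to a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: shows which slice of the jobs a page index maps to.
final class JobPaging {

    static final int PAGE_SIZE = 5; // the question asks for 5 jobs per page

    private JobPaging() {}

    // Returns the jobs for a zero-based page index, or an empty list past the end.
    static <T> List<T> page(List<T> allJobs, int pageIndex) {
        if (pageIndex < 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0");
        }
        long from = (long) pageIndex * PAGE_SIZE;
        if (from >= allJobs.size()) {
            return new ArrayList<>();
        }
        int to = (int) Math.min(from + PAGE_SIZE, allJobs.size());
        // Copy the view so the caller gets an independent list.
        return new ArrayList<>(allJobs.subList((int) from, to));
    }
}
```

Degrees and departments are loaded in full by their own queries, so only the jobs lookup needs a page index.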
I am looking for possible optimizations for framework-generated queries.
As far as I understand, the process is the following:
You declare your domain objects as POJOs and add several annotations such as @Entity, @Table, @ManyToOne, etc.
You declare your repositories, e.g. as interfaces.
With (2) you have several options to describe your query, e.g. via method names or @Query.
If I write a query like:
@Query("select t from Order t LEFT join fetch t.orderPositions where t.id = ?1")
Page<Order> findById(Pageable pageable, String id);
a SQL query is generated in which every column of the order is selected, and subsequently every column of the order positions and of the dependent objects/tables.
As if I wrote:
select * from order
So when I need only some information from several joined objects, such a query can be quite expensive and, more interestingly, quite inefficient. I stumbled upon a slow query, and MySQL's EXPLAIN told me that the optimizer could not make use of indices in the generated query, which is bad.
Of course, I am aware that I have to deal with a trade-off: generated SQL isn't as optimal as manually written SQL, but it has the advantage of requiring less boilerplate code.
My question is: what are good strategies to improve queries and query execution?
I have thought of some options myself:
1) Is it possible to define several "entities" for different purposes, such as Order for access to the full characteristics of an order and something like FilteredOrder with fewer columns and no resolution of join columns? Both would reference the same tables, but one would use all of the columns and the other only some.
2) Use @Query(..., nativeQuery = true) with a selection of just the columns I want to use. The advantage would be that I would not duplicate my domain objects and litter my codebase with hundreds of Filtered objects.
What about paging? Is using Pageable in combination with @Query(..., nativeQuery = true) still possible? (I am afraid not.)
3) Last, and in my eyes the "worst"/most boilerplate-heavy solution: use JdbcTemplate and do things at a lower level.
Are there other options that I haven't thought of?
Thank you for any inspiration on that topic :]
Update:
Our current strategy is the following:
1) Where possible, I work with select new.
As far as I have seen, this works for every object (be it an entity or a POJO).
2) In combination with database views, it is possible to take the best of SQL and ORM. For some use cases it might be of interest to have an aggregated result set at hand. Defining this result set as a view makes it easy, from the DB perspective, to inspect the result with a simple select statement.
On the ORM side, this means you can easily define an entity matching this view and get the whole ORM goodness on top, paging included.
One solution is to use DTOs:
@Query("select new FilteredOrder(o.name, o.size, o.cost) from Order o where o.id = ?1")
Page<FilteredOrder> findFilteredOrderById(Pageable pageable, String id);
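For the constructor expression above to resolve, the DTO needs a constructor whose parameters match the selected fields in order and type. A minimal sketch of such a class follows; the field types (`String`, `int`, `BigDecimal`) are assumptions, since the question does not show the Order entity. In a real project this would typically be a public class in its own file, and JPQL generally expects the fully qualified class name after `new`.

```java
import java.math.BigDecimal;

// Hypothetical DTO for the "select new FilteredOrder(o.name, o.size, o.cost)" query.
// Field types are assumptions; they must match the entity's attribute types.
final class FilteredOrder {

    private final String name;
    private final int size;
    private final BigDecimal cost;

    // Parameter order mirrors the order of the expressions in the select clause.
    public FilteredOrder(String name, int size, BigDecimal cost) {
        this.name = name;
        this.size = size;
        this.cost = cost;
    }

    public String getName() {
        return name;
    }

    public int getSize() {
        return size;
    }

    public BigDecimal getCost() {
        return cost;
    }
}
```

Because the DTO is not a managed entity, only the three selected columns are read and no associations are resolved.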
If you need entities for report generation, maybe you should consider using a NoSQL datastore?
Take a look at JPA's lazy fetching strategy. It allows you to select objects without their relations and fetches the relations only when you reference them.
Here are two queries retrieving equivalent data:
SELECT DISTINCT packStatus
FROM PackStatus packStatus JOIN FETCH packStatus.vars, ProcessEntity requestingProcess
WHERE
packStatus.status.code='OGVrquestExec'
AND packStatus.procId=requestingProcess.activityRequestersProcessId
AND requestingProcess.id='1000323733_GU_OGVProc'
SELECT DISTINCT packStatus
FROM PackStatus packStatus JOIN FETCH packStatus.vars
WHERE
packStatus.status.code='OGVrquestExec'
AND packStatus.procId=(SELECT requestingProcess.activityRequestersProcessId FROM ProcessEntity requestingProcess WHERE requestingProcess.id='1000323733_GU_OGVProc')
These queries differ in how requestingProcess is joined with packStatus. In general, which of these two methods is preferable in terms of performance? I'm using JPA 1.2 provided by Hibernate 3.3 on Postgres 8.4.
UPD: I've replaced the fake queries with real queries from my app. Here is the SQL generated by Hibernate for the first and second query. Links to query plans: first, second. The query plans look almost the same. The only difference is the point at which data from the bpms_process table is aggregated into the query result. But I don't know whether it is right to generalize these results. Would the query plans be almost the same for any queries differing only in the joining method? Is it possible to get a big difference in query cost by changing the joining method?
Use EXPLAIN ANALYZE and see.
I won't be surprised if they get turned into the same query plan.
See: https://stackoverflow.com/tags/postgresql-performance/info