I need to use an entity framework (ORM) with my application, and I use table partitions in an Oracle database. With plain JDBC, I am able to select data from a specific partition, but I don't know whether I can do the same with Hibernate or EclipseLink (JPA). If someone knows how to do that, please let me know.
With JDBC, the select statement is usually:
select * from TABLE_NAME partition (PARTITION_NAME) where FIELD_NAME = 'PARAMETER_VALUE';
How can I do the same with Hibernate or JPA?
Please share at least a link for learning sources.
Thanks!!!
Neither JPA nor any other ORM framework supports Oracle partitioned tables natively (at least to my knowledge).
There are different possible solutions though, depending on the nature of your problem:
Refactor your classes so that data that needs to be treated differently in real life belongs in a separate class. This is sometimes called vertical partitioning (the data is split across columns rather than across rows).
Use Oracle partitioned tables underneath and use native SQL queries or stored procedures from JPA. This is just a possible solution (I haven't attempted this).
Use Hibernate Shards. Although the typical use case for Hibernate Shards is not for a single database, it presents a singular view of distributed databases to an application developer.
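As a rough sketch of the native-SQL option above: a partition name is an identifier, so it cannot be bound as a query parameter and has to be concatenated into the statement. The helper below is hypothetical (the class and method names are not part of any library); it only builds the SQL text and rejects anything that is not a plain identifier. It has not been run against an Oracle database.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: builds a partition-extended SELECT for use as a native query. */
final class PartitionSql {
    // Identifiers cannot be bound as parameters, so they are validated
    // against a conservative pattern before being concatenated.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_$#]*");

    private PartitionSql() {}

    /** Returns e.g. "SELECT * FROM T PARTITION (P) WHERE C = ?1". */
    static String selectFromPartition(String table, String partition, String column) {
        return "SELECT * FROM " + check(table)
                + " PARTITION (" + check(partition) + ")"
                + " WHERE " + check(column) + " = ?1";
    }

    private static String check(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Not a plain identifier: " + name);
        }
        return name;
    }
}
```

The resulting string could then be passed to EntityManager.createNativeQuery(sql, SomeEntity.class), with the value bound through setParameter(1, value). SomeEntity stands for whatever entity class maps the table.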
Related:
JPA Performance, Don't Ignore the Database
EclipseLink supports partitioning or sharding with different options.
You can find more about this and examples here:
http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Data_Partitioning
Table partitioning is data organization at the physical level. In a word, partitioning is a poor man's index. Like the latter, it is supposed to be entirely transparent to the user. Queries are normally written against the entire table rather than against a specific partition. It is then the query optimizer's job to decide whether it can leverage a certain partition or index.
Related
I use Spring Data JPA (Hibernate) generated queries for fetching data from my SQL Server database. Now I am getting performance-related issues in my system.
Load findByLoadId(Integer loadId);
This is the query I am using to get data. It returns 25 data cells, but I only use 5 of them.
Can I use a direct native query like
select id, date, createdBy, createdOn, loadName from Load where
loadId = :loadId
If a native query is advisable, then my question is: does an ORM framework reduce performance by fetching unneeded data from the database?
By "data cell" I assume that you are referring to database table columns, and not to records. The answer to your question is yes: ORM frameworks tend to do the equivalent of a SELECT * under the hood, which can result in unwanted information being sent across the network to your application. If the JPA repository interface is behaving this way, you may switch to either an explicit JPA query (e.g. using the @Query annotation) or even a native query. Then, just select the columns you want. The issue here is that ORM frameworks map object templates (e.g. classes) to entire database tables, so the concept of an entity implicitly includes every database column. If you go with the option of selecting only certain columns, you may need to do some juggling on the Java side. Note that if you use a JPA query, your code would still, in theory, be database independent.
The MySQL database used by my application is growing very aggressively, and I am currently in no position to migrate to a NoSQL database.
I have figured out the following areas where I can try splitting the current database into multiple databases:
There are some tables which have static content, i.e. it rarely changes.
There are user tables which store the user data upon interaction, and these change frequently.
Now, if I split the database into two different databases, how will I handle transactions? How will I write the data access layer? Will I have connections to both databases? The application currently uses Spring and Hibernate for the back end. There are calls which join the user tables and the content tables in the current schema.
The architecture follows the current structure:
Controller -> Service -> DAO Layer.
So, if I am willing to refactor the DAO layer which communicates with the database, what approach should I follow? I only know Hibernate ORM, but I would be willing to let it go if there is something better.
Multiple databases on the same server? That approach will probably not improve performance on its own. RAM, fast disks, optimization, partitioning, and correct indexing will have a far greater payback.
If you have multiple databases on one server you can connect to them with a single connection, and simply use the database names with the table names in your SQL. Transactions work fine within a single connection.
Transactions across multiple connections and multiple servers are harder. There's a feature in MySQL called XA transactions to help handle this. But it has plenty of overhead, and is therefore most useful for high-value transactions as in banking.
In the jargon of the trade, adding servers is called "scale-out." The alternative is "scale-up," in which you add more RAM, faster direct-access storage, optimization, and other stuff to a single server to get it to do more.
There are several approaches you can take to the scale-out problem. The classic one is to use MySQL replication to set up a single primary server with multiple load-balanced replica servers. That's probably the path that's most often taken, so you can do it without reinventing a lot of wheels. In this solution you do all your writing to a single instance. Queries that look up data can use multiple read-only load-balanced instances.
http://dev.mysql.com/doc/refman/5.5/en/replication-solutions-scaleout.html
This is a very popular approach where you have a mix of long-running reporting queries and short-running interactive queries. The reporting can be run on dedicated replica servers.
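On the application side, the primary/replica split described above means the data access layer has to pick a server per unit of work. A minimal sketch of that decision, assuming the caller knows whether the work is read-only, might look like the following. ReplicaRouter and its method names are hypothetical, and the JDBC URLs in the usage note are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical router: writes go to the primary, reads rotate over the replicas. */
final class ReplicaRouter {
    private final String primaryUrl;
    private final List<String> replicaUrls;
    private final AtomicInteger next = new AtomicInteger();

    ReplicaRouter(String primaryUrl, List<String> replicaUrls) {
        this.primaryUrl = primaryUrl;
        this.replicaUrls = new ArrayList<>(replicaUrls); // defensive copy
    }

    /** Returns the JDBC URL to use for one unit of work. */
    String urlFor(boolean readOnly) {
        if (!readOnly || replicaUrls.isEmpty()) {
            return primaryUrl; // all writes, and reads when no replica is configured
        }
        // floorMod keeps the index non-negative even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), replicaUrls.size());
        return replicaUrls.get(index);
    }
}
```

For example, new ReplicaRouter("jdbc:mysql://primary/app", replicas).urlFor(true) hands out the replica URLs in round-robin order. Since replicas can lag behind the primary, reads that must see a just-committed write would still need to go to the primary. In a Spring application, AbstractRoutingDataSource is one place where this kind of routing could be plugged in.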
Another approach is multiple-primary-server replication using MySQL Cluster. https://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-replication-multi-master.html
Another approach, if you have money to spend, is to go with a supported MySQL Cluster. Oracle, MariaDB, and Percona have such products on offer.
Scale-out is a big job no matter how you approach it. There's some documented experience from other people who have done it. For example, https://www.facebook.com/note.php?note_id=23844338919
It sounds like you have not thought through how to partition your database.
You should read something about database normalization first.
To split the database, I would export the SQL from the database, then make two new files into which I copy the tables that I want in each database. After that, I would import the two files into their respective databases.
I think this might help you help me: let's say I want to print reports for a user. The user is persisted in the 'user' table, and there is a score table which holds the score for every user_id. Now, my plan is to put the user table in one database and the score table in another, making them two data sources. How can I handle such a scenario?
First, putting the tables in different databases makes no sense to me, and I do not know whether it is possible to write select queries that mix two different databases.
example: SELECT score, name FROM user, score WHERE score > 100 AND (score.user_id = user.user_id);
I don't know if this works with two databases; I think not.
I need to retrieve structural information (metadata) from SQL Server.
Example: I need to list all databases, all procedures, all procedure parameters, all table column names, and the type of each column.
What would be the approach to do that?
Thanks
There are (at least) three ways.
Use the DatabaseMetaData interface. There's a lot in there, so for details see the Javadoc. I believe this is the preferred way, as it is standard across all database engines. So if you switch to a different database engine, it should still work, and what you learn about how to use it will apply whether you are working on this database engine or another.
Do queries against the "information_schema". Basically this is a database with information about the database structure. It has tables listing all the tables, all the columns, etc. This is the second-best choice. The information_schema is more or less standard across databases. Unfortunately, the implementation for MS SQL Server is not the best.
Do queries against the "system tables" and/or "system views". There is a set of tables in MS SQL Server with information about the database structure. It is like information_schema, but it is specific to MS SQL Server and thus includes metadata for things that are not part of standard SQL, like calculated fields and identity columns. It's also more complete and reliable than MS's implementation of information_schema.
For really general stuff, like what tables are in the DB or what columns are in a given table, information_schema and the system tables are equally good and useful. But if you need much of anything else, MS's implementation of the information_schema quickly proves inadequate. I've seen lots of comments on the Web encouraging developers to use the system tables rather than the information_schema. I don't like to use non-standard stuff, as both the code itself and the experience gained are not portable, but in this case, I think they're right: use the system tables.
But DatabaseMetaData is better yet, if it gives you what you need.
Check out the DatabaseMetaData interface.
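A minimal sketch of the DatabaseMetaData route, using only java.sql calls, is shown below. SchemaInspector is a hypothetical class name, and the output format is just an illustration. The object passed in would come from connection.getMetaData() on an open JDBC connection.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reads schema information through JDBC metadata. */
final class SchemaInspector {
    private SchemaInspector() {}

    /** Returns "COLUMN_NAME TYPE_NAME" for each column of the given table. */
    static List<String> describeColumns(DatabaseMetaData meta, String table) throws SQLException {
        List<String> columns = new ArrayList<>();
        // null catalog and schema mean "do not filter on them"; "%" matches every column.
        // The table argument is a LIKE-style pattern, so "_" and "%" act as wildcards.
        try (ResultSet rs = meta.getColumns(null, null, table, "%")) {
            while (rs.next()) {
                columns.add(rs.getString("COLUMN_NAME") + " " + rs.getString("TYPE_NAME"));
            }
        }
        return columns;
    }

    /** Returns the names of the stored procedures visible through the connection. */
    static List<String> listProcedures(DatabaseMetaData meta) throws SQLException {
        List<String> procedures = new ArrayList<>();
        try (ResultSet rs = meta.getProcedures(null, null, "%")) {
            while (rs.next()) {
                procedures.add(rs.getString("PROCEDURE_NAME"));
            }
        }
        return procedures;
    }
}
```

The same interface also offers getCatalogs() for the list of databases and getProcedureColumns(...) for procedure parameters, which follow the same ResultSet pattern.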
I have to design a web application that retrieves data from a single huge table with 40 columns. Select queries return several thousand rows, and updates touch only a few rows/columns.
Can you please tell me whether Hibernate is a feasible choice for fast performance, given that I only have a single table and no joins?
Or should I use a JDBC DAO?
database : sql server 2008
java 7
If you use Hibernate right, there's no problem in fetching an arbitrarily large result set. Just avoid bare from ... queries, which load whole entities (use select ... from ... queries instead), and use ScrollableResults. If you use plain JDBC, you'll be able to get started quicker, because Hibernate needs to be configured first, you need to write the mapping file, etc. Later on, though, Hibernate might pay off, since the code you write will be much simpler. Hibernate is very good at taking the boilerplate out of client code.
If you want to retrieve several thousand records and pagination is not possible, then it might be a performance issue, because Hibernate will create an object for every row and store it in its persistence context. If you create too many objects, it uses up a lot of memory. For this type of operation, JDBC is better. For a similar discussion, see Hibernate performance issues using huge databases
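To illustrate the plain-JDBC alternative mentioned above, here is a hedged sketch of row-by-row processing that never holds more than the current row in application objects. RowStreamer and RowHandler are hypothetical names. Whether the rows are actually streamed from the server, rather than buffered by the driver, depends on the driver and its settings, so the fetch size is only a hint.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper: processes a large result set one row at a time. */
final class RowStreamer {
    /** Callback invoked once per row (hypothetical interface). */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    private RowStreamer() {}

    /** Runs the query, passes each row to the handler, and returns the row count. */
    static long forEachRow(Connection conn, String sql, int fetchSize, RowHandler handler)
            throws SQLException {
        long count = 0;
        try (PreparedStatement ps = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(fetchSize); // a hint; drivers may ignore or reinterpret it
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    handler.handle(rs); // nothing is retained after the callback returns
                    count++;
                }
            }
        }
        return count;
    }
}
```

A caller might write RowStreamer.forEachRow(conn, "select ...", 500, row -> process(row.getString(1))), where process is whatever the application does with each value.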
We have a requirement to delete data in the range of 200K from database everyday. Our application is Java/Java EE based using Oracle DB and Hibernate ORM tool.
We explored various options like
Hibernate batch processing
Stored procedure
Database partitioning
Our DBA suggests that database partitioning is the best way to go, so that we can easily drop and recreate the partitioned table every day. Now the issue is that we have two kinds of data: one which we want to delete every day, and another which we want to keep. Suppose this data is stored in the table "Trade". With partitioning, we would have two "Trade" tables. We already have a Hibernate-based DAO layer to fetch trades from and store trades to the DB. When we decide to partition the database, how can we control which of the two tables the trades go to through Hibernate? Basically, I want the trades that need to be deleted by the end of the day to go into the partitioned table, and the trades I want to keep to go into the main table. Please suggest how this can be done with Hibernate. We may add an additional column to identify the trades to be deleted, but how can we ensure, using Hibernate, that these trades go to the partitioned trade table?
I would appreciate if someone can suggest any better approach in case we are on wrong path.
"When we decide to partition the database, how can we control which of the two tables the trades go to through Hibernate?"
That's what Hibernate Shards is for.
You could use a Hibernate inheritance strategy.
If you know at object creation that it will be deleted by the end of the day, you can create a VolatileTrade that is a subclass of Trade (with no other attribute). Use the 'table per concrete class' strategy (section 9.1.5 of the Hibernate 3.3 reference documentation) for the mapping.
(I think I would use an abstract superclass Trade and two concrete subclasses, PersistentTrade and VolatileTrade, so that if you have some other classes that you know will reference only PersistentTrade (or only VolatileTrade), you can constrain that in your code. If you had used the Trade superclass as the PersistentTrade, you would not be able to enforce that.)
The volatile trades will go in one table and the 'persistent' trades will go in another table.
Be aware that you won't be able to set a foreign key constraint from another table in the database that references any Trade (persistent or volatile).
Then you just have to clear the table when you want.
Be careful to define a locking mechanism so that no other thread will try to write data to the table during the drop and the create (if you use that). That won't be an easy task, and doing it correctly might impact the performance of every operation that inserts data into the table (as each will need to acquire the lock).
Wouldn't it be easier to truncate the table?