What are the benefits of the @TableGenerator technique for generating primary keys?
Why do we use this technique, and how is data fetched using the third table that stores the sequence name and the generator's current value?
From the link: http://en.wikibooks.org/wiki/Java_Persistence/Identity_and_Sequencing#Table_sequencing
There are several strategies for generating unique ids. Some strategies are database agnostic and others make use of built-in databases support.
JPA provides support for several strategies for id generation defined through the GenerationType enum values: TABLE, SEQUENCE and IDENTITY.
The choice of which sequence strategy to use is important as it affects performance, concurrency and portability.
So using table generators frees you from relying on database-specific features, which makes it easier to migrate the database to another provider later on.
The decision should therefore be based on whether you may want to migrate database providers later, how much performance you are willing to sacrifice for that, and so on.
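To make this concrete, here is a minimal sketch of a table generator (using the javax.persistence API; the table and column names below are illustrative assumptions, not requirements):

import javax.persistence.*;

@Entity
public class Order {

    @Id
    // The "third" table (ID_GEN here, an assumed name) holds one row per
    // generator: GEN_NAME stores the sequence name, GEN_VALUE its last value.
    @TableGenerator(name = "orderGen",
                    table = "ID_GEN",
                    pkColumnName = "GEN_NAME",
                    valueColumnName = "GEN_VALUE",
                    pkColumnValue = "order_id",
                    allocationSize = 50) // ids pre-allocated per DB round trip
    @GeneratedValue(strategy = GenerationType.TABLE, generator = "orderGen")
    private Long id;
}

The provider fetches and increments GEN_VALUE itself (typically with SELECT ... FOR UPDATE semantics), so application code never needs to query the generator table directly.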
Related
Do you set foreign keys in your tables even though you use an ORM framework like Hibernate or Doctrine? In my opinion the advantage is that you can navigate the SQL admin view more easily, but I think data integrity is no longer an argument, since you can set cascade settings in the ORM framework's annotations / XML / etc. What do you think?
In my case I will always use foreign keys, because they give me a clean and stable database definition, but I am interested in others' opinions.
No existing ORM (even the most powerful) can be compared to an enterprise-ready relational DBMS like MSSQL, Postgres, MySQL, etc. in maintaining data integrity and consistency (feature-wise or performance-wise). Your database, managed by the RDBMS, is also a free extra entry and extensibility point (with many available APIs) for your data, so it should keep the data consistent during manipulations performed by various clients. In many cases you can think of your non-DB app as a replaceable front-end. A good side effect of being data-centric is that you can use your DB for low-level synchronisation of your app's (or app parts') operations, and change your app's behavior easily (without client recompilation or redeployment) by changing the DB programmability, schema (including relations and constraints), etc.
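For what it's worth, JPA 2.1 even lets you name the database-level constraint so the generated DDL keeps the foreign key; a minimal sketch (entity and constraint names are made up):

import javax.persistence.*;

@Entity
public class PurchaseOrder {

    @Id
    @GeneratedValue
    private Long id;

    // Cascades can live in the ORM, but the FK constraint is still declared,
    // so the database enforces integrity for every client, not just this app.
    @ManyToOne
    @JoinColumn(name = "customer_id",
                foreignKey = @ForeignKey(name = "fk_order_customer"))
    private Customer customer; // Customer is an assumed entity
}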
We are currently in a project with high performance demands when it comes to reads from the database.
We are using JPA (the EclipseLink implementation), currently just because it provides convenient database access and column mapping.
For our queries we are using highly specific SQL queries. We are also using one database (SAP HANA, in-memory), so a language abstraction is not required. The database access is pretty fast, our current bottleneck really is the application server, especially the persistence layer.
The result sets often do not contain entities either, because entities are made up from the context. For us there is no point in using an @Id field like the following, because we don't have fields that are unique (only combinations, and defining an IdClass is too much overhead).
@Entity
public class Item {
    @Id
    public String myField; // String type assumed; no single column is truly unique
    // other fields...
}
This seems to be enforced by JPA if I want to run a typed native query. Is that assumption true? Currently we haven't found a way around the ID mapping.
Are these findings valid?
If not, how can we make our use of JPA more performant (there is significant latency compared to plain JDBC), ideally without defining an @Id (because it is useless in our case) for result types?
If yes, is there another Java library that provides just a minimal layer on top of JDBC, without too much latency, and is more convenient to use than plain JDBC (with column mapping and all that good stuff)?
Thanks!
Use case: We would like to stream historic GPS sensor data from the database. Besides just transforming it to JSON, we also do some transformations/validations, which is why we actually need to build objects. So what we are basically looking for is a convenient way of mapping the fields of SELECT statements to attributes. I hope that makes sense.
There are many articles and blogs about improving EclipseLink/JPA performance that you might look into, such as EclipseLink Performance, JPA Performance Tuning, and Optimizing the EclipseLink Application.
In the end, though, it all depends very much on your specific use case and any future use cases you may have. JPA is designed to make reading and writing on top of JDBC easier and more maintainable, and it adds performance benefits such as caching. If all you are using it for is to read raw data, though, the extra layer might be overhead that isn't adding any value. There isn't much point in having JPA build entities from the result sets, maintain the cache, and watch for changes, only for your application to ignore it all and grab the raw data.
I do not understand why you would have an Item table with a single myField. How is it used by the application and how does it relate to other tables and potential entities?
Such a construct is not the normal use case for relational databases and ORMs, but there are still ways around it in JPA. The data could be used in element collections by other entities, or even just not mapped, and native SQL queries used which are passed straight through the JDBC layer. EclipseLink itself has many mapping types and options above and beyond JPA that might be used depending on your use cases.
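Regarding the typed-native-query problem above: one hedged option (standard JPA 2.1; all names below are illustrative) is a @SqlResultSetMapping with a ConstructorResult, which maps a native query into a plain DTO that needs no @Id because it is never managed:

import javax.persistence.*;
import java.util.List;

// Plain DTO, not an entity, so no @Id is required.
class GpsView {
    final Long sensorId;
    final Double lat, lon;
    GpsView(Long sensorId, Double lat, Double lon) {
        this.sensorId = sensorId; this.lat = lat; this.lon = lon;
    }
}

// The mapping must be declared on some entity class (or in orm.xml).
@SqlResultSetMapping(
    name = "GpsViewMapping",
    classes = @ConstructorResult(
        targetClass = GpsView.class,
        columns = {
            @ColumnResult(name = "sensor_id", type = Long.class),
            @ColumnResult(name = "lat", type = Double.class),
            @ColumnResult(name = "lon", type = Double.class)}))
@Entity
class AnyMappedEntity { @Id Long id; }

class GpsQueries {
    static List<GpsView> load(EntityManager em) {
        @SuppressWarnings("unchecked")
        List<GpsView> rows = em.createNativeQuery(
                "SELECT sensor_id, lat, lon FROM gps_data", "GpsViewMapping")
            .getResultList();
        return rows;
    }
}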
We have a number of objects that have an id of type Long, are stored in MySQL, and use JPA/Hibernate for ORM. We are going to move some of them to Mongo in the future. Is it sensible to create an embeddable class for the id field, e.g. ContentId, and use it throughout the system in place of Long, so that when we move to MongoDB or another NoSQL database without Long ids we only have to change the internal representation of the ContentId class? I can only find references to using @EmbeddedId for composite keys. Is this a sensible thing to do? I don't want to have to go through all the code in a year or so when we change, replacing Long with ObjectId.
MongoDB uses a generated OID as the default Id. You can also define your own using the _id attribute. The OID is basically a UUID, which maps best to a String. I would just use a UUID in MySQL, so you can use the same model on either. MongoDB does not support a composite id, so using a composite id is probably not a good idea.
EclipseLink supports JPA on both MySQL and MongoDB. EclipseLink also supports a @UuidGenerator that works with any database.
http://java-persistence-performance.blogspot.com/2012/04/eclipselink-jpa-supports-mongodb.html
http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/NoSQL
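To illustrate the portable route: a sketch of an application-assigned UUID id that works unchanged as a MySQL VARCHAR column or as MongoDB's string _id (the entity name is an assumption):

import javax.persistence.*;
import java.util.UUID;

@Entity
public class Content {

    @Id
    private String id; // VARCHAR(36) in MySQL, the _id field in MongoDB

    @PrePersist
    void assignId() {
        if (id == null) {
            id = UUID.randomUUID().toString(); // application-assigned, portable
        }
    }
}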
I don't see what an EmbeddedId would give you to gain portability... best to focus on the value generators available and what the datastore supports, and look at how you can have something mappable on both datastores to ease the migration.
DataNucleus JPA obviously supports persistence to MongoDB and has for some time, allowing the full range of identities, whether it is the native MongoDB UUID ("identity" in JPA parlance), String-based (uuid, uuid-hex) or numeric ("table"). This gives portability and you can choose what suits your model best. It also supports persistence to many other types of datastores (RDBMS, Excel, ODF, ODBMS, HBase, AppEngine, LDAP, and others) should you need portability to other datastores too.
Why do most Hibernate applications use sequences for id generation?
Why not use the default GenerationType.AUTO in the @GeneratedValue annotation?
P.S. In my professional career I see everybody using sequences, but I don't understand why they bother with the harder-to-deploy solution (there is always a sequence-creation SQL command in the deployment instructions).
I see several reasons:
The most used database in enterprise apps is probably Oracle, and Oracle doesn't have auto-generated IDs, but sequences.
Sequences allow having the ID before inserting a new row, rather than only after inserting it. This is easier to use and more efficient, because you can batch insert statements at the end of the transaction while still having IDs defined in the middle of the transaction.
Sequences allow using hi/lo algorithms (the default with Hibernate's sequence generation), and thus make only one DB call for several inserts, increasing performance (see the sketch after this list).
AUTO varies between databases, whereas sequence always uses the same strategy.
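A minimal sketch of points 2 and 3 (generator and sequence names are illustrative): allocationSize lets the provider hand out a block of ids per sequence call, and the id is known as soon as persist() runs, before the INSERT is flushed.

import javax.persistence.*;

@Entity
public class Invoice {

    @Id
    // One SELECT INVOICE_SEQ.NEXTVAL covers up to 50 inserts; the matching
    // deployment DDL would be: CREATE SEQUENCE INVOICE_SEQ INCREMENT BY 50
    @SequenceGenerator(name = "invoiceSeq", sequenceName = "INVOICE_SEQ",
                       allocationSize = 50)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "invoiceSeq")
    private Long id;
}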
From the excellent book Pro JPA 2: Mastering the Java Persistence API by Mike Keith and Merrick Schincariol,
Chapter 4: Object-Relational Mapping, section Identifier Generation:
[...] If an application does not care what kind of generation is used by the provider but wants generation to occur, it can specify a strategy of AUTO.

There is a catch to using AUTO, though. The provider gets to pick its own strategy to store the identifiers, but it needs to have some kind of persistent resource in order to do so. For example, if it chooses a table-based strategy, it needs to create a table; if it chooses a sequence-based strategy, it needs to create a sequence. The provider can't always rely on the database connection that it obtains from the server to have permissions to create a table in the database. This is normally a privileged operation that is often restricted to the DBA. There will need to be some kind of creation phase or schema generation to cause the resource to be created before the AUTO strategy is able to function.

The AUTO mode is really a generation strategy for development or prototyping. It works well as a means of getting you up and running more quickly when the database schema is being generated. In any other situation, it would be better to use one of the other generation strategies discussed in the later sections [...]
At least for Oracle: one reason is the ability to track the number of objects in a table (for which a table-specific sequence works, as long as no objects are deleted from the table). Using GenerationType.AUTO means a global sequence number, which results in gaps in the id numbers when there is more than one table in the database.
There are different considerations when choosing an identity generator. The most important ones are performance and portability, but clustering and data migration might also be considerations.
In practice, in the latest Hibernate versions (if not all of them), the SEQUENCE strategy is actually a sequence-based hi/lo, not the pure sequence most people assume.
You can read a pretty detailed post about identity generation strategies on my blog: here
Eyal
My question is about ORM and JDBC technologies: on what criteria would you decide to go for an ORM technology as compared to JDBC, and the other way round?
Thanks.
JDBC
With JDBC, the developer has to write code to map an object model's data representation to a relational data model and its corresponding database schema.
With JDBC, the mapping of Java objects to database tables (and the conversion back) must be handled manually by the developer, line by line.
JDBC supports only native Structured Query Language (SQL). The developer has to find an efficient way to access the database, i.e. choose the most effective query among several that perform the same task.
An application that uses JDBC to handle persistent data (database tables) contains a large amount of database-specific code. The code written to map table data to application objects (and vice versa) actually maps table fields to object properties. When a table or the database changes, it is necessary to change the object structure as well as the code written for the table-to-object/object-to-table mapping.
With JDBC, it is the developer's responsibility to handle the JDBC result set and convert it to Java objects through code in order to use this persistent data in the application. So with JDBC, mapping between Java objects and database tables is done manually, as in the sketch after this list.
With JDBC, caching has to be hand-coded.
With JDBC, there is no built-in check that every user always has up-to-date data; this check has to be added by the developer.
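As a hedged illustration of the manual mapping described above (the table, columns, and classes are assumptions):

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

class Person {
    private long id;
    private String firstName, lastName;
    public void setId(long id) { this.id = id; }
    public void setFirstName(String f) { this.firstName = f; }
    public void setLastName(String l) { this.lastName = l; }
}

public class PersonDao {

    // Every column has to be copied to the object by hand; if the table
    // changes, this code must change too.
    public List<Person> findByLastName(Connection con, String lastName)
            throws SQLException {
        String sql = "SELECT id, first_name, last_name FROM person"
                   + " WHERE last_name = ?";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, lastName);
            try (ResultSet rs = ps.executeQuery()) {
                List<Person> result = new ArrayList<>();
                while (rs.next()) {
                    Person p = new Person();
                    p.setId(rs.getLong("id"));
                    p.setFirstName(rs.getString("first_name"));
                    p.setLastName(rs.getString("last_name"));
                    result.add(p);
                }
                return result;
            }
        }
    }
}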
HIBERNATE
Hibernate is a flexible and powerful ORM solution for mapping Java classes to database tables. Hibernate itself takes care of this mapping using XML files (or annotations), so the developer does not need to write mapping code.
Hibernate provides transparent persistence: the developer does not need to write code explicitly to map database table tuples to application objects while interacting with the RDBMS.
Hibernate provides a powerful query language, Hibernate Query Language (HQL), which is independent of the type of database, is expressed in a familiar SQL-like syntax, and includes full support for polymorphic queries. Hibernate also supports native SQL statements, and it selects an effective way to perform database manipulation tasks for an application.
Hibernate provides the mapping itself. The actual mapping between tables and application objects is done in XML files; if the database or any table changes, only the XML mapping needs to be updated.
Hibernate reduces lines of code by maintaining the object-table mapping itself and returning results to the application as Java objects. It relieves the programmer from manually handling persistent data, reducing development time and maintenance cost.
With transparent persistence, Hibernate keeps a cache in the application's workspace, and relational tuples are moved to this cache as the result of a query. This improves performance if the client application reads the same data many times. It also lets the developer concentrate on business logic rather than on this plumbing code.
Hibernate lets the developer define a version field on an application object; Hibernate then updates the version column of the database table every time the corresponding tuple is updated. If two users retrieve the same tuple, both modify it, and one saves the modified tuple to the database, the version is automatically updated for that tuple. When the other user then tries to save the tuple, Hibernate does not allow it, because that user no longer has up-to-date data (see the sketch below).
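A minimal sketch of that version mechanism (optimistic locking) in JPA/Hibernate terms; the entity name is an assumption:

import javax.persistence.*;

@Entity
public class Account {

    @Id
    @GeneratedValue
    private Long id;

    // Hibernate increments this column on every update; saving with a stale
    // value raises an OptimisticLockException instead of silently losing data.
    @Version
    private int version;
}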
Complexity.
ORM: if your application is domain-driven, the relationships among objects are complex, or you need these objects to define what the app does.
JDBC/SQL: if your application is simple enough to just present data directly from the database, or the relationships between tables are simple enough.
The book "Patterns of enterprise application architecture" by Martin Fowler explains much better the differences between these two types:
See: Domain Model and Transaction Script
I think you forgot to look at "Functional Relational Mapping"
I would sum up by saying:
If you want to focus on the data structures, use an ORM like JPA/Hibernate.
If you want to shed light on the processing itself, take a look at FRM libraries: QueryDSL or jOOQ.
If you need to tune your SQL queries to specific databases, use JDBC and native SQL queries.
The strength of the various "relational mapping" technologies is portability: you ensure your application will run on most ACID databases.
Otherwise, you will have to cope with the differences between the various SQL dialects when you write the SQL queries manually.
Of course you can restrict yourself to the SQL92 standard (and then do some functional programming), or you can reuse some concepts of functional programming with ORM frameworks.
The ORM's strengths are built around a session object, which can act as a bottleneck:
it manages the lifecycle of the objects as long as the underlying database transaction is running.
it maintains a one-to-one mapping between your Java objects and your database rows (and uses an internal cache to avoid duplicate objects).
it automatically detects association updates and the orphan objects to delete.
it handles concurrency issues with optimistic or pessimistic locking.
Nevertheless, its strengths are also its weaknesses:
The session must be able to compare objects, so you need to implement equals/hashCode methods (see the sketch below).
But object equality must be rooted in "business keys" and not the database id (new transient objects have no database id!).
However, some reified concepts have no business equality (an operation, for instance).
A common workaround relies on GUIDs, which tend to upset database administrators.
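A sketch of business-key equality (the email column as business key is an assumption):

import java.util.Objects;
import javax.persistence.*;

@Entity
public class User {

    @Id
    @GeneratedValue
    private Long id; // not used for equality: still null on transient objects

    @Column(unique = true)
    private String email; // assumed business key

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof User)) return false;
        return Objects.equals(email, ((User) o).email);
    }

    @Override
    public int hashCode() {
        return Objects.hash(email);
    }
}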
The session must track relationship changes, but its mapping rules push the use of collections that are unsuitable for business algorithms.
Sometimes you would like to use a HashMap, but the ORM will require the key to be another "rich domain object" instead of a light one...
Then you have to implement object equality on the rich domain object acting as a key...
But you can't, because this object has no counterpart in the business world.
So you fall back to a simple list that you have to iterate over (with resulting performance issues).
The ORM APIs are sometimes unsuitable for real-world use.
For instance, real-world web applications try to enforce session isolation by adding some WHERE clauses when you fetch data...
Then Session.get(id) doesn't suffice and you need to turn to a more complex DSL (HQL, the Criteria API) or go back to native SQL; see the sketch below.
The database objects conflict with other objects dedicated to other frameworks (like OXM frameworks, i.e. Object/XML Mapping).
For instance, suppose your REST services use the Jackson library to serialize a business object, and the Jackson-mapped class is exactly the Hibernate one.
Then either you merge both, and a strong coupling between your API and your database appears,
or you must implement a translation, and all the code you saved thanks to the ORM is lost there...
On the other side, FRM is a trade-off between "Object Relational Mapping" (ORM) and native SQL queries (with JDBC).
The best way to explain the differences between FRM and ORM is to adopt a DDD approach.
Object Relational Mapping empowers the use of "rich domain objects", which are Java classes whose state is mutable during the database transaction.
Functional Relational Mapping relies on "poor domain objects", which are immutable (so much so that you have to build a new copy each time you want to alter the content).
It lifts the constraints put on the ORM session and relies most of the time on a DSL over SQL (so portability is less of a concern).
But on the other hand, you have to look into the transaction details and the concurrency issues yourself. For example, a typical QueryDSL query looks like this:
// person is the generated QPerson query type (static import assumed)
List<Person> persons = queryFactory.selectFrom(person)
    .where(
        person.firstName.eq("John"),
        person.lastName.eq("Doe"))
    .fetch();
It also depends on the learning curve.
Ebean ORM has a pretty low learning curve (simple API, simple query language) if you are happy enough with JPA annotations for mapping (@Entity, @Table, @OneToMany, etc.); see the sketch below.
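For comparison, a sketch of an Ebean query in that style (package names vary across Ebean versions; Customer is an assumed @Entity):

import io.ebean.Ebean;
import java.util.List;

public class CustomerFinder {

    public List<Customer> findJohns() {
        // Fluent query API: no session to manage, no query language to learn.
        return Ebean.find(Customer.class)
            .where().eq("firstName", "John")
            .findList();
    }
}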