The Ebean ORM is the go-to ORM for the Play! Java Framework.
I am choosing between writing my own object-relational mappers (and other object-relational behavioural patterns) and using an ORM. One criterion that strikes me as really important is whether Ebean supports concurrent access to mappers.
Although Play! uses asynchronous threading, concurrent requests that use the same objects can still run into a reader/writer problem.
Hence the question: does the Ebean ORM support multithreading (the reader/writer problem)?
Ebeans supports concurrent access to mappers
Yes, EbeanServer is safe for concurrent use by multiple threads. The EbeanServer instance is built and contains all the metadata about the bean properties etc. (so the mapping information).
EbeanServer internally holds some mutable data such as the L2 cache, performance metrics for query execution, query execution plans etc., but these are written to be thread safe.
In general, Query objects and query results (object graphs) are not thread safe and are intended for single-threaded use. You can create read-only object graphs that can't be mutated, and are hence safe for multi-threaded use, via query.setReadOnly(true).
EbeanServer also has support for background fetching via findFutureRowCount(), findFutureList() etc which internally makes a copy of the query and takes care of the details. findFutureRowCount() is used internally as part of PagedList to get total row count.
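To make the "read-only graphs are safe to share" point concrete, here is a minimal sketch that uses only the JDK. It does not call Ebean at all: Customer, freeze and readFromThreads are hypothetical names, and the unmodifiable list merely plays the role that a result of query.setReadOnly(true) plays in the answer above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-JDK illustration only; none of these types belong to Ebean.
final class ReadOnlyGraphSketch {

    // Immutable bean: final fields and no setters, so readers cannot change it.
    static final class Customer {
        final long id;
        final String name;

        Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Copies the rows and wraps them so that any add/remove/set throws.
    static List<Customer> freeze(List<Customer> rows) {
        return Collections.unmodifiableList(new ArrayList<>(rows));
    }

    // Lets several threads read the same frozen list and returns
    // how many rows they saw in total.
    static int readFromThreads(List<Customer> shared, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Integer> reader = shared::size;
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                pending.add(pool.submit(reader));
            }
            int total = 0;
            for (Future<Integer> f : pending) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because nothing reachable from the frozen list can change, the reader threads need no locks. A mutable result graph offers no such guarantee, which is why the answer limits it to single-threaded use.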
Related
We are currently in a project with a high demand on performance when it comes to reads from the database.
We are currently using JPA (EclipseLink implementation), simply because it provides convenient database access and column mapping.
For our queries we are using highly specific SQL queries. We are also using one database (SAP HANA, in-memory), so a language abstraction is not required. The database access is pretty fast, our current bottleneck really is the application server, especially the persistence layer.
The result sets often do not contain entities either, because entities are made up of the context. For us, there is no point in using an @Id field like the following, because we don't have fields that are unique (only combinations, but defining an IdClass is too much overhead).
@Entity
public class Item {
    @Id
    public String myField; // the field type was missing in the original snippet; String is assumed
    // other fields...
}
This seems to be enforced by JPA if I want to run a typed native query. Is that assumption true? Currently we haven't found a way around the ID mapping.
Are these findings valid?
If not, how can we make our use of JPA more performant (there is significant latency compared to plain JDBC), without defining an @Id for result types (it is useless in our case)?
If yes, is there another Java library that provides just a minimal layer on top of JDBC, without too much latency, and that is more convenient to use than plain JDBC (with column mapping and all that good stuff)?
Thanks!
Use case: We would like to stream historic GPS sensor data from the database. Besides just transforming this to JSON, we also do some transformations/validations. That's why we actually need to build objects. So what we are basically looking for is a convenient way of mapping the fields of select statements to attributes. I hope that makes sense.
There are many articles and blogs about improving EclipseLink/JPA performance that you might look into, such as EclipseLink Performance, JPA Performance Tuning, and Optimizing the EclipseLink Application.
In the end, though, it all depends very much on your specific use case and any future use cases you may have. JPA is designed to make reading and writing on top of JDBC easier and more maintainable, and it adds many performance benefits such as caching. If all you are using it for is to read raw data, though, the extra layer might be overhead that isn't adding any value. There isn't much point in having JPA build entities from the result sets, maintain the cache and watch for changes, only for your application to ignore it all and grab the raw data.
I do not understand why you would have an Item table with a single myField. How is it used by the application and how does it relate to other tables and potential entities?
Such a construct is not the normal use case for relational databases and ORMs, but there are still ways around it in JPA. The data could be used in element collections by other entities, or even just not mapped, and native SQL queries used which are passed straight through the JDBC layer. EclipseLink itself has many mapping types and options above and beyond JPA that might be used depending on your use cases.
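For the "thin layer over JDBC with column mapping" that the question asks about, a hand-rolled row mapper is one possible shape. The sketch below uses only java.sql; RowMapper, GpsPoint, mapAll and the column names lat, lon and ts are all assumptions made for illustration, not part of JPA, EclipseLink or the asker's schema.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the names are illustrative, not a library API.
final class JdbcMappingSketch {

    // Turns the current row of a ResultSet into one object of type T.
    interface RowMapper<T> {
        T map(ResultSet rs) throws SQLException;
    }

    // Plain value object built from a SELECT; it needs no @Id and no entity identity.
    static final class GpsPoint {
        final double lat;
        final double lon;
        final long timestamp;

        GpsPoint(double lat, double lon, long timestamp) {
            this.lat = lat;
            this.lon = lon;
            this.timestamp = timestamp;
        }
    }

    // Column names are assumed; adapt them to the actual SELECT list.
    static final RowMapper<GpsPoint> GPS_MAPPER =
            rs -> new GpsPoint(rs.getDouble("lat"), rs.getDouble("lon"), rs.getLong("ts"));

    // Walks the whole result set and maps every row.
    static <T> List<T> mapAll(ResultSet rs, RowMapper<T> mapper) {
        try {
            List<T> out = new ArrayList<>();
            while (rs.next()) {
                out.add(mapper.map(rs));
            }
            return out;
        } catch (SQLException e) {
            throw new IllegalStateException("row mapping failed", e);
        }
    }
}
```

The caller would obtain the ResultSet from an ordinary PreparedStatement.executeQuery() and pass it to mapAll. Since nothing is cached or change-tracked, the per-row cost is essentially the constructor call.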
I have an application that uses Hibernate. There are two main threads: the first one collects/modifies data and the second one saves data into the database. In certain cases the program may try to modify and save the same entity at the same time.
Do I have to make all entities thread-safe (use only synchronized collections, atomic objects instead of primitives...) or does Hibernate take care of it automatically?
Hibernate instantiates objects per session, so classic synchronization is not needed (and would not be helpful).
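The most common way to handle concurrent data access and modification is locking, either optimistic (a version check at write time) or pessimistic (a database lock held while the row is in use).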
Is it ok to have a REST Webservice (Spring and Jersey) that uses a DAO with a ConcurrentHashMap to store the data, or should I avoid it and use some kind of in-memory DB?
It's a sample application, so I don't mind losing the data every time the application stops.
ConcurrentHashMap is fine if you pretty much just need to create, read, update and delete entities. I'm actually using ConcurrentHashMap in an application that runs in Jetty and emulates some system that our application integrates.
But, as Soitorios Delimanolis and omickron mentioned, things would get hairy if you need to rely on the atomicity of transactions involving multiple database operations.
To safeguard myself from that situation, I defined interfaces for my DAOs and wrote a ConcurrentHashMap-backed implementation. If the time comes when that is no longer sufficient, I'll be able to swap that implementation for one based on HSQLDB or SQLite.
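The interface-plus-map approach described above could look roughly like this. It is a sketch, not the answerer's actual code: Note, NoteDao and InMemoryNoteDao are hypothetical names, and the four operations are just the plain CRUD set mentioned in the answer.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical immutable entity used only for this sketch.
final class Note {
    final long id;
    final String text;

    Note(long id, String text) {
        this.id = id;
        this.text = text;
    }
}

// The REST layer depends on this interface only, so the storage can be swapped later.
interface NoteDao {
    Note create(String text);
    Optional<Note> findById(long id);
    boolean update(long id, String text);
    boolean delete(long id);
}

// In-memory implementation; all data is lost when the application stops.
final class InMemoryNoteDao implements NoteDao {
    private final ConcurrentMap<Long, Note> store = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    @Override
    public Note create(String text) {
        Note note = new Note(nextId.incrementAndGet(), text);
        store.put(note.id, note);
        return note;
    }

    @Override
    public Optional<Note> findById(long id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public boolean update(long id, String text) {
        // computeIfPresent replaces the value atomically for this one key.
        return store.computeIfPresent(id, (key, old) -> new Note(key, text)) != null;
    }

    @Override
    public boolean delete(long id) {
        return store.remove(id) != null;
    }
}
```

A later HSQLDB- or SQLite-backed class would implement the same NoteDao, so the web service code would not have to change.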
You can use ConcurrentHashMap, but you will run into difficulties when:
doing two or more actions in the same "transaction": you have to synchronize such actions with other threads yourself, because ConcurrentHashMap only guarantees atomicity for a single operation;
searching not by the Map key but by some other field of the value object.
ConcurrentHashMap is meant for other purposes.
So I'd advise using an in-memory DB.
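Both difficulties can be seen in a small sketch. InventorySketch and its methods are hypothetical; the point is only that a two-key update needs a lock of its own, and that a lookup by anything other than the key degrades to a full scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store: item name -> quantity.
final class InventorySketch {
    private final ConcurrentHashMap<String, Integer> stock = new ConcurrentHashMap<>();
    // Once a multi-key invariant exists, every writer has to share this lock.
    private final Object writeLock = new Object();

    void add(String item, int qty) {
        synchronized (writeLock) {
            stock.merge(item, qty, Integer::sum);
        }
    }

    // Touches two keys. The map cannot make this atomic by itself,
    // so the check and both writes happen under the lock.
    boolean move(String from, String to, int qty) {
        synchronized (writeLock) {
            int available = stock.getOrDefault(from, 0);
            if (available < qty) {
                return false;
            }
            stock.put(from, available - qty);
            stock.merge(to, qty, Integer::sum);
            return true;
        }
    }

    // Lock-free read of one key; it may observe a move that is half done.
    int quantityOf(String item) {
        return stock.getOrDefault(item, 0);
    }

    // Searching by value has no index to use: every entry is visited.
    List<String> itemsWithAtLeast(int min) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Integer> e : stock.entrySet()) {
            if (e.getValue() >= min) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}
```

An in-memory database gives both things for free (transactions and secondary lookups), which is the trade-off the answer is pointing at.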
I tend to use Hibernate in combination with the Spring framework and its declarative transaction demarcation capabilities (e.g., @Transactional).
As we all know, Hibernate tries to be as non-invasive and as transparent as possible; however, this proves a bit more challenging when employing lazy-loaded relationships.
I see a number of design alternatives with different levels of transparency.
1. Make relationships not lazy-loaded (e.g., fetch = FetchType.EAGER)
   This violates the entire idea of lazy loading.
2. Initialize collections using Hibernate.initialize(proxyObj);
   This implies relatively high coupling to the DAO.
   Although we can define an interface with initialize, other implementations are not guaranteed to provide any equivalent.
3. Add transaction behaviour to the persistent Model objects themselves (using either a dynamic proxy or @Transactional)
   I've not tried the dynamic proxy approach, and I never seemed to get @Transactional working on the persistent objects themselves, probably because Hibernate is operating on a proxy to begin with.
   Loss of control over when transactions are actually taking place.
4. Provide both a lazy and a non-lazy API, e.g., loadData() and loadDataWithDeps()
   Forces the application to know when to employ which routine; again tight coupling.
   Method overflow: loadDataWithA(), ...., loadDataWithX()
5. Force lookup for dependencies, e.g., by only providing byId() operations
   Requires a lot of non-object-oriented routines, e.g., findZzzById(zid) and then getYyyIds(zid) instead of z.getY()
   It can be useful to fetch each object in a collection one by one if there's a large processing overhead between the transactions.
6. Make part of the application @Transactional instead of only the DAO
   Possible considerations of nested transactions.
   Requires routines adapted for transaction management (e.g., sufficiently small).
   Small programmatic impact, although it might result in large transactions.
7. Provide the DAO with dynamic fetch profiles, e.g., loadData(id, fetchProfile);
   Applications must know which profile to use when.
8. AoP type of transactions, e.g., intercept operations and perform transactions when necessary
   Requires byte-code manipulation or proxy usage.
   Loss of control over when transactions are performed.
   Black magic, as always :)
Did I miss any option?
Which is your preferred approach when trying to minimize the impact of lazy-loaded relationships in your application design?
(Oh, and sorry for WoT)
As we all know, Hibernate tries to be as non-invasive and as transparent as possible
I would say the initial assumption is wrong. Transparent persistence is a myth, since the application should always take care of the entity lifecycle and of the size of the object graph being loaded.
Note that Hibernate can't read thoughts, therefore if you know that you need a particular set of dependencies for a particular operation, you need to express your intentions to Hibernate somehow.
From this point of view, solutions that express these intentions explicitly (namely, 2, 4 and 7) look reasonable and don't suffer from the lack of transparency.
I am not sure which problem (caused by laziness) you're hinting at, but for me the biggest pain is avoiding the loss of session context in my own application caches. A typical case:
object foo is loaded and put into a map;
another thread takes this object from the map and calls foo.getBar() (something that was never called before and is lazy evaluated);
boom!
So, to address this we have a number of rules:
wrap sessions as transparently as possible (e.g. OpenSessionInViewFilter for webapps);
have common API for threads/thread pools where db session bind/unbind is done somewhere high in the hierarchy (wrapped in try/finally) so subclasses don't have to think about it;
when passing objects between threads, pass IDs instead of objects themselves. Receiving thread can load object if it needs to;
when caching objects, never cache objects but their ids. Have an abstract method in your DAO or manager class to load the object from 2nd level Hibernate cache when you know the ID. The cost of retrieving objects from 2nd level Hibernate cache is still far cheaper than going to DB.
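The last rule (cache ids, not objects) can be sketched without any Hibernate types. In this hypothetical IdCacheSketch, the loadById function stands in for a DAO or manager method that resolves the id inside the calling thread's own session, typically hitting the 2nd level cache.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical application-level cache that stores only ids, never entities.
final class IdCacheSketch<K, T> {
    private final ConcurrentMap<String, K> idsByName = new ConcurrentHashMap<>();
    // Stand-in for a DAO call that runs in the current thread's session.
    private final Function<K, T> loadById;

    IdCacheSketch(Function<K, T> loadById) {
        this.loadById = loadById;
    }

    void remember(String name, K id) {
        idsByName.put(name, id);
    }

    // Each lookup loads a fresh object, so no detached proxy is ever handed
    // from one thread (and session) to another.
    Optional<T> lookup(String name) {
        K id = idsByName.get(name);
        if (id == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(loadById.apply(id));
    }
}
```

Since the map never holds an entity, a later foo.getBar() always runs on an object that belongs to the session of the thread calling it.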
This, as you can see, is indeed nowhere close to non-invasive and transparent. But the cost is still bearable compared with the price I'd have to pay for eager loading. The problem with the latter is that it sometimes leads to a butterfly effect when loading a single referenced object, let alone a collection of entities. Memory consumption, CPU usage and latency, to name just a few, are also far worse, so I guess I can live with it.
A very common pattern is to use OpenEntityManagerInViewFilter if you're building a web application.
If you're building a service, I would open the TX on the public method of the service rather than on the DAOs, as very often a method needs to get or update several entities.
This will solve any "Lazy Load exception". If you need something more advanced for performance tuning, I think fetch profiles are the way to go.
My question is about ORM and JDBC technologies: on what criteria would you decide to go for an ORM technology rather than JDBC, and the other way round?
Thanks.
JDBC
With JDBC, the developer has to write code to map an object model's data representation to a relational data model and its corresponding database schema.
With JDBC, the conversion between Java objects and database tables, in both directions, has to be hand-written by the developer.
JDBC supports only native Structured Query Language (SQL). The developer has to work out an efficient way to access the database, i.e. pick the most effective of the several queries that could perform the same task.
An application that uses JDBC to handle persistent data (database tables) contains a large amount of database-specific code. The code that maps table data to application objects and back really maps table fields to object properties. When a table or the database changes, the object structure and the table-to-object/object-to-table mapping code have to change as well.
With JDBC, it is the developer's responsibility to handle the JDBC result set and convert it to Java objects in code before the persistent data can be used in the application. So with JDBC, the mapping between Java objects and database tables is done manually.
With JDBC, caching has to be hand-coded.
JDBC does not check that every user is working with up-to-date data. This check has to be added by the developer.
HIBERNATE.
Hibernate is a flexible and powerful ORM solution that maps Java classes to database tables. Hibernate takes care of this mapping itself using XML files, so the developer does not need to write code for it.
Hibernate provides transparent persistence, and the developer does not need to write explicit code to map database table tuples to application objects when interacting with the RDBMS.
Hibernate provides a powerful query language, Hibernate Query Language (HQL, independent of the type of database), which is expressed in a familiar SQL-like syntax and includes full support for polymorphic queries. Hibernate also supports native SQL statements, and it selects an effective way to perform a database manipulation task for the application.
Hibernate provides this mapping itself. The actual mapping between tables and application objects is done in XML files. If the database or any table changes, only the XML file properties need to change.
Hibernate reduces the lines of code by maintaining the object-table mapping itself, and it returns results to the application in the form of Java objects. It relieves the programmer of handling persistent data manually, hence reducing development time and maintenance cost.
With Hibernate's transparent persistence, a cache is kept in the application's work space. Relational tuples are moved to this cache as the result of a query. This improves performance if the client application reads the same data many times per write. Automatic transparent persistence lets the developer concentrate on business logic rather than on this plumbing code.
Hibernate lets the developer define a version field in the application. Thanks to this field, Hibernate updates the version column of the database table every time a tuple is written back from a Java object. So if two users retrieve the same tuple and modify it, and one user saves the modified tuple to the database, Hibernate automatically updates the version for this tuple. When the other user then tries to save their tuple, Hibernate rejects the save because this user does not have up-to-date data.
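The version check described in the last point can be imitated in memory to show the mechanics. This is a sketch of the idea only, not Hibernate's implementation: VersionedStoreSketch and Row are hypothetical, and the comparison inside update plays the role of the version condition that an ORM would put on the UPDATE statement.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory table whose rows carry a version number.
final class VersionedStoreSketch {

    static final class Row {
        final String value;
        final long version;

        Row(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentHashMap<Long, Row> rows = new ConcurrentHashMap<>();

    void insert(long id, String value) {
        rows.put(id, new Row(value, 0L));
    }

    Row read(long id) {
        return rows.get(id);
    }

    // Writes only if the caller's version is still current, then bumps the version.
    // Returns false for a stale writer, who has to re-read and retry.
    boolean update(long id, long expectedVersion, String newValue) {
        boolean[] applied = {false};
        rows.computeIfPresent(id, (key, current) -> {
            if (current.version != expectedVersion) {
                return current; // someone else saved first; keep their row
            }
            applied[0] = true;
            return new Row(newValue, current.version + 1);
        });
        return applied[0];
    }
}
```

If two users both read version 0, the first update succeeds and moves the row to version 1, and the second update with expected version 0 is refused. With plain JDBC the same check has to be written by hand.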
Complexity.
ORM: if your application is domain driven and the relationships among objects are complex, or you need the object model to define what the app does.
JDBC/SQL: if your application is simple enough to just present data directly from the database, or the relationships between the data are simple enough.
The book "Patterns of Enterprise Application Architecture" by Martin Fowler explains the differences between these two types much better:
See: Domain Model and Transaction Script
I think you forgot to look at "Functional Relational Mapping"
I would sum up by saying:
If you want to focus on the data-structures, use an ORM like JPA/Hibernate
If you want to focus on the processing, take a look at FRM libraries: QueryDSL or jOOQ
If you need to tune your SQL requests to specific databases, use JDBC and native SQL requests
The strength of the various "Relational Mapping" technologies is portability: you ensure your application will run on most ACID databases.
Otherwise, you will have to cope with the differences between SQL dialects when you write the SQL requests by hand.
Of course you can restrict yourself to the SQL-92 standard (and then do some functional programming), or you can reuse some concepts of functional programming with ORM frameworks.
The ORM strengths are built on a session object, which can act as a bottleneck:
it manages the lifecycle of the objects as long as the underlying database transaction is running.
it maintains a one-to-one mapping between your Java objects and your database rows (and uses an internal cache to avoid duplicate objects).
it automatically detects association updates and the orphan objects to delete.
it handles concurrency issues with optimistic or pessimistic locking.
Nevertheless, its strengths are also its weaknesses:
The session must be able to compare objects, so you need to implement the equals/hashCode methods.
    But object equality must be rooted in "Business Keys" and not in the database id (new transient objects have no database ID!).
    However, some reified concepts have no business equality (an operation, for instance).
    A common workaround relies on GUIDs, which tend to upset database administrators.
The session must spy on relationship changes, but its mapping rules push you towards collections that are unsuitable for the business algorithms.
    Sometimes you would like to use a HashMap, but the ORM will require the key to be another "Rich Domain Object" instead of a light one...
    Then you have to implement object equality on the rich domain object acting as a key...
    But you can't, because this object has no counterpart in the business world.
    So you fall back to a simple list that you have to iterate over (and performance issues result from that).
The ORM APIs are sometimes unsuitable for real-world use.
    For instance, real-world web applications try to enforce session isolation by adding some "WHERE" clauses when you fetch data...
    Then "Session.get(id)" doesn't suffice and you need to turn to a more complex DSL (HQL, Criteria API) or go back to native SQL.
The database objects conflict with objects dedicated to other frameworks (like OXM frameworks = Object/XML Mapping).
    For instance, say your REST services use the Jackson library to serialize a business object.
    But this Jackson object maps exactly onto a Hibernate one.
    Then either you merge both, and a strong coupling between your API and your database appears,
    or you must implement a translation, and all the code you saved with the ORM is lost there...
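Going back to the first weakness in the list, equality rooted in a business key rather than in the database id could be sketched as follows. ProductSketch and its sku field are hypothetical; the sketch assumes the business key is unique and never changes.

```java
import java.util.Objects;

// Hypothetical entity whose identity is its business key, not its database id.
final class ProductSketch {
    private Long id;          // database id: stays null until the row is inserted
    private final String sku; // business key, assumed unique and immutable

    ProductSketch(String sku) {
        this.sku = Objects.requireNonNull(sku, "sku");
    }

    void setId(Long id) {
        this.id = id;
    }

    Long getId() {
        return id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ProductSketch)) {
            return false;
        }
        return sku.equals(((ProductSketch) other).sku);
    }

    // Based on the same key as equals, so the hash does not change
    // when the id is assigned after the object was put into a set.
    @Override
    public int hashCode() {
        return sku.hashCode();
    }
}
```

An object added to a HashSet while still transient is still found there after it has been saved and received its id. That is exactly what breaks when equals/hashCode use the id instead.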
On the other hand, FRM is a trade-off between "Object Relational Mapping" (ORM) and native SQL queries (with JDBC).
The best way to explain the differences between FRM and ORM is to adopt a DDD approach.
Object Relational Mapping encourages the use of "Rich Domain Objects", which are Java classes whose state is mutable during the database transaction.
Functional Relational Mapping relies on "Poor Domain Objects", which are immutable (so much so that you have to create a new copy each time you want to alter the content).
It releases the constraints put on the ORM session and relies most of the time on a DSL over SQL (so portability is not an issue).
But on the other hand, you have to look after the transaction details and the concurrency issues yourself.
List<Person> persons = queryFactory.selectFrom(person)
    .where(
        person.firstName.eq("John"),
        person.lastName.eq("Doe"))
    .fetch();
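The immutable "Poor Domain Objects" mentioned above are easy to picture in plain Java. PersonValue is a hypothetical class, unrelated to the Person type in the query; it shows the copy-on-change style only.

```java
// Hypothetical immutable value object in the FRM style.
final class PersonValue {
    final String firstName;
    final String lastName;

    PersonValue(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // "Altering" the object means building a new one; the original is untouched.
    PersonValue withLastName(String newLastName) {
        return new PersonValue(firstName, newLastName);
    }
}
```

No session has to watch such an object for changes, because it cannot change. Writing the new state back is an explicit statement issued by the application.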
It also depends on the learning curve.
Ebean ORM has a pretty low learning curve (simple API, simple query language) if you are happy enough with JPA annotations for mapping (@Entity, @Table, @OneToMany etc).