In our application we mainly use Spring @Transactional annotations together with JpaTemplate and Hibernate for database interaction. One part of the project, however, communicates with a different database, and hence we cannot use @Transactional there, as a different transactionManager is required.
Here, for database updates, we explicitly create transactions in code using PlatformTransactionManager, and either commit or roll them back after the call to JpaTemplate.execute(JpaCallback).
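For context, the update path looks roughly like this minimal sketch (Spring 3-era API; transactionManager and jpaTemplate are assumed to be injected beans, and the names are illustrative):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceException;

import org.springframework.orm.jpa.JpaCallback;
import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.DefaultTransactionDefinition;

TransactionStatus status =
        transactionManager.getTransaction(new DefaultTransactionDefinition());
try {
    jpaTemplate.execute(new JpaCallback<Void>() {
        public Void doInJpa(EntityManager em) throws PersistenceException {
            // ... perform the updates ...
            return null;
        }
    });
    transactionManager.commit(status);
} catch (RuntimeException e) {
    transactionManager.rollback(status);
    throw e;
}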
I have noticed a few places in this area of our code where we are not doing this for reads, however, and simply call JpaTemplate.execute(JpaCallback) without wrapping it in a transaction. I am wondering what the dangers of this are. I appreciate that a transaction must still be created under the covers, as no database query can run without one, but since nothing in our code attempts to commit, could something be holding on to resources?
I have a class annotated with the JPA @Entity annotation, so objects of this class are persisted in the database and managed using Hibernate ORM. In my class constructor, a connection to an MQTT broker is created, so each object establishes a TCP connection during initialization.
When an object's data is fetched from the database, this constructor cannot be used by the ORM, as the ORM uses the default no-argument constructor, so I put the code that establishes the connection in a @PostLoad annotated method.
The problem is that every time the web application page is refreshed, the ORM is asked to get the object, and the @PostLoad method is executed, so the TCP connection is established again... but I want the connection to be established only the first time the object is fetched from the database, not every time the page is refreshed.
So the solution would be an ORM with an in-memory object cache. That way, the first time the object is loaded from the database the @PostLoad method is called, but the next time the ORM is asked to retrieve the object, it is returned from the cache.
I don't know if this is possible with Hibernate. I have been playing with cache options and the @Cacheable annotation, but it seems that the @PostLoad method is called every time I use the findById method of the Repository class, no matter which cache options I set. So I guess the Hibernate cache is caching table rows, not objects in memory.
You could use an entity manager with an extended persistence context which spans multiple transactions, though I don't know whether or how Spring Data supports this. That way, the entity would not be reloaded from the database but would remain part of this extended persistence context. Note, though, that this comes with other issues.
Usually, such expensive operations are simply not done in entities. You could move the logic out of the class, or do some kind of connection pooling/caching to avoid reconnects. I don't know why you need a dedicated connection per object, but connecting to message brokers is usually done differently: such connections are typically pooled by some context object of the library, or Spring may offer an integration with pooling options. In Java/Jakarta EE, this is usually done through resource adapters. I bet there is a JMS implementation for MQTT that you could use, probably also with Spring. AFAIK, in Spring Data JPA, you usually fire a domain/application event and react to it somewhere else; in that listener you could publish a message to a topic/queue through JMS or the native MQTT library.
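As a rough sketch of that event-based approach (all names here are illustrative; the event could be published from an application service or a Spring Data @DomainEvents method):

import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;

// Illustrative event carrying just the data the listener needs
class DeviceLoadedEvent {

    private final String deviceId;

    DeviceLoadedEvent(String deviceId) {
        this.deviceId = deviceId;
    }

    String getDeviceId() {
        return deviceId;
    }
}

@Component
class MqttConnectionManager {

    // Reacts outside the entity; a pooled or cached connection can be
    // reused here instead of opening a new TCP connection per entity load.
    @EventListener
    public void onDeviceLoaded(DeviceLoadedEvent event) {
        // look up or lazily create the broker connection for event.getDeviceId()
    }
}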
I need to audit changes to some entities in our application and am thinking of using JaVers. I like the support for interrogating the audit data provided by JaVers. Hibernate Envers looks good, but it stores data in the same DB.
Here are my requirements:
async logging - for minimal performance impact
store audit data in a different db - performance reasons as well
As far as I can see, JaVers is not designed for the above, but it seems possible to adapt it to achieve this. Here's how:
JaVers actually allows data to be stored in a different DB: you can provide a connection to any DB, really. It's not how it's intended to be used, but it works. Code below (note the connectionProvider, which can supply a connection to any DB):
import java.sql.Connection;
import java.sql.DriverManager;

import org.javers.repository.sql.ConnectionProvider;
import org.javers.repository.sql.DialectName;
import org.javers.repository.sql.JaversSqlRepository;
import org.javers.repository.sql.SqlRepositoryBuilder;

final Connection dbConnection =
        DriverManager.getConnection("jdbc:mysql://localhost:3306/javers", "root", "root");

ConnectionProvider connectionProvider = new ConnectionProvider() {
    @Override
    public Connection getConnection() {
        // suitable only for testing!
        return dbConnection;
    }
};

JaversSqlRepository sqlRepository = SqlRepositoryBuilder
        .sqlRepository()
        .withConnectionProvider(connectionProvider)
        .withDialect(DialectName.MYSQL)
        .build();
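To actually get a JaVers instance backed by this repository, you'd then build it along these lines:

import org.javers.core.Javers;
import org.javers.core.JaversBuilder;

Javers javers = JaversBuilder.javers()
        .registerJaversRepository(sqlRepository)
        .build();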
The async part can be achieved by moving the execution of the JaVers commit into a thread/executor. The challenge there is that if the execution takes too long, the object might change before it's logged. There are two solutions I can think of here:
we could create a snapshot of the object (e.g. serialize it to JSON or the like) and pass that to a thread to log it (see the sketch after this list).
we could provide our own implementation of the JaVers Repository which computes the differences in the current thread and then passes the Snapshot objects to another thread for persisting. That way we'd only read from the DB in the application thread, and do the writing (which is generally more costly performance-wise) in the auditing thread.
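A minimal sketch of the first idea, assuming a hypothetical deepCopy helper (e.g. serialize/deserialize) that freezes the object's state on the calling thread:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.javers.core.Javers;

final ExecutorService auditExecutor = Executors.newSingleThreadExecutor();

public void auditAsync(final Javers javers, final String author, final Object entity) {
    // freeze the state on the caller's thread so later mutations can't leak in
    final Object frozen = deepCopy(entity); // deepCopy is hypothetical
    auditExecutor.submit(new Runnable() {
        @Override
        public void run() {
            // the commit's DB reads/writes happen off the request thread
            javers.commit(author, frozen);
        }
    });
}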
QUESTIONS:
Am I missing anything here? Could this work?
Does JaVers have support for creating a snapshot of the object which can then be moved to another thread? It does this internally somewhere, so maybe it's something we could use.
JUST FYI: not relevant to the question, but here are some other challenges I can think of and how I'm planning to solve them:
Audits won't run in the same transaction as the change, since rolling back an audit along with a failed transaction would be complex. So we need to audit only objects that were successfully committed. I intend to do that by using a Hibernate Interceptor, listening to afterTransactionCompletion and committing only objects updated by that transaction (see the sketch after this list).
In the case of lazily loaded objects, I can see how the lazy-loaded properties might not be accessible once the transaction is finished (as the session might be closed too). I don't know how to fix this, but it might not be an issue, as I think we load most properties eagerly.
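A rough sketch of that interceptor idea, using the Hibernate 3/4-style API (the hand-off to the auditing thread is left abstract):

import org.hibernate.EmptyInterceptor;
import org.hibernate.Transaction;

public class AuditCommitInterceptor extends EmptyInterceptor {

    @Override
    public void afterTransactionCompletion(Transaction tx) {
        if (tx.wasCommitted()) {
            // hand the entities collected during this transaction
            // over to the auditing thread here
        }
    }
}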
Interesting question.
First, a clarification: all JaVers core modules are designed to decouple audit data from application data. As you mentioned, the user provides a ConnectionProvider to be used by JaVers, and it can supply a connection to any database you want.
What is not designed for use with multiple DBs are the Spring integration modules for SQL, i.e. javers-spring-jpa and javers-spring-boot-starter-sql. They just cover the most common scenario: the same DB for the application and for JaVers.
You are right about the lack of an async commit. Fortunately, it can be implemented solely in JaversCore, without changing the Repositories.
The API could be:
CompletableFuture<Commit> javers.commitAsync(..., Executor);
First, JaVers would take a snapshot of the user's objects; that's fast, so it can be done in the current thread.
Then, DB reads (loading latest snapshots) and DB writes (inserting new snapshots) can be done asynchronously (submitted to the given Executor).
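If that API materializes, usage could look something like this sketch (updatedEntity stands in for whatever domain object is being committed):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

import org.javers.core.commit.Commit;

Executor auditExecutor = Executors.newFixedThreadPool(2);

CompletableFuture<Commit> future =
        javers.commitAsync("author", updatedEntity, auditExecutor);

// the commit completes on the executor, off the request thread
future.thenAccept(commit -> System.out.println("audited: " + commit.getId()));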
As you mentioned, this requires a new approach to DB transactions. We plan to implement the Commit Withdrawal feature, so the app would be able to withdraw a JaVers commit after a main-DB rollback. See https://github.com/javers/javers/issues/588
I have a web application that uses Spring's NamedParameterJdbcTemplate, and all the calls to the database are select statements.
In this case, should I use @Transactional in my service class that calls the DAO class, which in turn fires select statements at the DB?
According to the Transaction strategies article, listing 10 suggests not using @Transactional for reads. Will I introduce overhead by using @Transactional? At the same time, I don't want to miss out on the AOP advice that I could bring in via @Transactional in the future.
Yes, you should always access the database from inside a transaction. Not doing so will in fact create a transaction for every select statement.
Transactions aren't just useful for atomicity of updates. They also provide isolation guarantees. For example (depending on the isolation level), reading the same row twice in a single transaction can be guaranteed to return the same data, making sure you don't have inconsistencies in the data you read. Doing it across multiple transactions provides no such guarantee.
I think the best way is to use @Transactional with propagation set to SUPPORTS rather than REQUIRED for such a service.
That way, if your service is called outside a transaction, it will not start one; and if it is called from another service that is transactional and has already started a transaction, it will participate in that transaction.
For example, suppose one service needs to call two other services: in the first, some data is inserted or updated, and the second is just a select that returns that data. If these two services don't participate in a single transaction, the second service will not return the data, because the transaction started in the calling service has not been committed yet.
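A sketch of that setup (service, entity, and DAO names are illustrative):

import java.util.List;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ReportService {

    private final ReportDao reportDao; // hypothetical DAO

    public ReportService(ReportDao reportDao) {
        this.reportDao = reportDao;
    }

    // Joins an existing transaction if one is active,
    // otherwise runs without starting a new one.
    @Transactional(propagation = Propagation.SUPPORTS)
    public List<Report> findReports() {
        return reportDao.findAll();
    }
}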
We have a somewhat huge application which started a decade ago and is still under active development. So some parts are still in J2EE 1.4 architecture, others using Java EE 5/6.
While testing some new code, I realized that I had data inconsistencies between information coming in through the old and new code parts, where the old one uses the Hibernate Session directly and the new one an injected EntityManager. This led to the problem that one part couldn't see new data from the other part and thus also created a database record, resulting in a primary key constraint violation.
It is planned to migrate the old code completely to get rid of J2EE, but in the meantime - what can I do to coordinate database access between the two parts? And shouldn't at some point within the application server both ways come together in the Hibernate layer, regardless if accessed via JPA or directly?
You can mix both the Hibernate Session and the Entity Manager in the same application without any problem. The EntityManagerImpl simply delegates calls to a private SessionImpl instance.
What you describe is a transaction configuration anomaly. Every database transaction runs in isolation (unless you use READ_UNCOMMITTED, which I guess is not the case), but once you commit it, the changes are available from any other transaction or connection. So once a transaction is committed, you should see all changes in any other Hibernate Session, JDBC connection, or even your database UI manager tool.
You said that there was a primary key conflict. This can't happen if you use the Hibernate identity or sequence generators. With the legacy hi/lo generator, though, you can have problems when an external connection tries to insert records into a table whose identifiers Hibernate assigns using hi/lo.
This problem can also occur with a master/master replication anomaly. If you have multiple nodes and replication is not strictly consistent, you can end up with primary key constraint violations.
Update
Solution 1:
When coordinating the new and the old code trying to insert the same entity, you could have select-then-insert logic running in a SERIALIZABLE transaction. The SERIALIZABLE transaction acquires the appropriate locks on your behalf, so you can keep the default READ_COMMITTED isolation level while marking only the problematic service methods as SERIALIZABLE.
So both the old code and the new code run this logic: a select to check whether a row already satisfies the constraint, followed by an insert only if nothing is found. The SERIALIZABLE isolation level prevents phantom reads, so it should prevent the constraint violations.
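In a Spring-managed service, for example, that could look like this sketch (the entity and DAO names are illustrative):

import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

@Transactional(isolation = Isolation.SERIALIZABLE)
public LookupValue findOrCreate(String code) {
    LookupValue existing = lookupDao.findByCode(code); // select first
    if (existing != null) {
        return existing;
    }
    return lookupDao.save(new LookupValue(code));      // insert only if missing
}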
Solution 2:
If you are open to delegating this task to JDBC, you might also investigate the MERGE SQL statement, if your current database supports it. Basically, this is an upsert operation issuing an update or an insert behind the scenes. This command is much more attractive since you can run it even under READ_COMMITTED. The only drawbacks are that you can't use Hibernate for it, and only some databases support it.
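For illustration, an Oracle-style MERGE issued through plain JDBC might look like this (table and column names are made up, and the exact syntax varies by database):

import java.sql.Connection;
import java.sql.PreparedStatement;

String sql = "MERGE INTO lookup_value t "
           + "USING (SELECT ? AS code FROM dual) s "
           + "ON (t.code = s.code) "
           + "WHEN NOT MATCHED THEN INSERT (code) VALUES (s.code)";

PreparedStatement ps = connection.prepareStatement(sql);
try {
    ps.setString(1, code);
    ps.executeUpdate(); // inserts only when the row is missing, in one atomic statement
} finally {
    ps.close();
}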
If you instantiate a SessionFactory for the old code and an EntityManagerFactory for the new code separately, that can lead to different values in the first-level cache. If, during a single HTTP request, you change a value in the old code but do not commit immediately, the value will be changed in the session cache but will not be available to the new code until it is committed. Independently of any transaction or database locking that would protect persistent values, this mix of two different Hibernate sessions can produce weird results for in-memory values.
I assume that the injected EntityManager still uses Hibernate. IMHO the most robust solution is to get the EntityManagerFactory for the persistence unit and cast it to Hibernate's EntityManagerFactoryImpl. Then you can directly access the underlying SessionFactory:
SessionFactory sessionFactory =
        ((EntityManagerFactoryImpl) entityManagerFactory).getSessionFactory();
You can then safely use this SessionFactory in your old code, because it is now unique in your application and shared between the old and new code.
You still have to deal with session creation/closing and transaction management. I suppose that is already implemented in the old code. Without knowing more, I think you should port it to JPA, because I am pretty sure that if an EntityManager exists, sessionFactory.getCurrentSession() will return its underlying Session, but I cannot say the same for the opposite direction.
I've run into a similar problem when I had a list of enumerated lookup values, where two pieces of code would check for the existence of a given value in the list, and if it didn't exist the code would create a new entry in the database. When both of them came across the same non-existent value, they'd both try to create a new one and one would have its transaction rolled back (throwing away a bunch of other work we'd done in the transaction).
Our solution was to create those lookup values in a separate transaction that committed immediately; if that transaction succeeded, then we knew we could use that object, and if it failed, then we knew we simply needed to perform a get to retrieve the one saved by another process. Once we had a lookup object that we knew was safe to use in our session, we could happily do the rest of the DB modifications without risking the transaction being rolled back.
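In Java EE terms, that immediately-committing creation could be an EJB method along these lines (entity and class names are illustrative):

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class LookupValueCreator {

    @PersistenceContext
    private EntityManager em;

    // Runs in its own transaction, which commits as soon as the method returns;
    // on a constraint violation, the caller re-reads the value saved by the other process.
    @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
    public LookupValue create(String code) {
        LookupValue value = new LookupValue(code);
        em.persist(value);
        return value;
    }
}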
It's hard to know from your description whether your data model would lend itself to a similar approach, where you'd at least commit the initial version of the entity right away, and then once you're sure you're working with a persistent object you could do the rest of the DB modifications that you knew you needed to do. But if you can find a way to make that work, it would avoid the need to share the Session between the different pieces of code (and would work even if the old and new code were running in separate JVMs).
I'm running a test within a subclass of AbstractTransactionalTestNGSpringContextTests, where I execute tests over a partial Spring context. Each test runs within a transaction, which is rolled back at the end to leave the database unchanged.
One test writes to the database through Hibernate, while another reads from the same database using the JdbcTemplate, with both sharing the same DataSource.
I'm finding that I can't see the hibernate updates when querying through the JdbcTemplate. This does make some sense, as each is presumably getting its own connection from the connection pool and so is operating within its own transaction.
I've seen indications that it's possible to get the two to share a connection and transaction, but am not clear on the best way to set this up, especially with the involvement of the connection factory. All these components are declared as Spring beans. Can anyone give me any pointers?
Edit:
Well I've gone to the trouble of actually reading some documentation and the HibernateTransactionManager class states that this is definitely possible: "This transaction manager is appropriate for applications that use a single Hibernate SessionFactory for transactional data access, but it also supports direct DataSource access within a transaction (i.e. plain JDBC code working with the same DataSource)...".
The only requirement appears to be setting the dataSource property, which isn't otherwise required. Having done that, though, I'm still not seeing my changes shared before the transaction has been committed. I will update if I get it working.
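For reference, the wiring I'm describing looks roughly like this in Java config (older Spring versions use the orm.hibernate3 package instead):

import javax.sql.DataSource;

import org.hibernate.SessionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.orm.hibernate5.HibernateTransactionManager;

@Bean
public HibernateTransactionManager transactionManager(
        SessionFactory sessionFactory, DataSource dataSource) {
    HibernateTransactionManager txManager = new HibernateTransactionManager(sessionFactory);
    // lets plain JDBC code (e.g. JdbcTemplate) join the same transaction and connection
    txManager.setDataSource(dataSource);
    return txManager;
}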
The Spring bean that writes has to flush its changes in order for the bean that reads to see them.
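Concretely, inside the test transaction that might look like this (table and variable names are illustrative):

// flush pending Hibernate changes so the JDBC read on the same
// connection/transaction can see them before any commit
sessionFactory.getCurrentSession().flush();

int count = jdbcTemplate.queryForObject(
        "select count(*) from my_entity where id = ?", Integer.class, entityId);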