Using Hibernate and JDBC together from different threads

I want to use Spring-Hibernate and JDBC together in my application.
Hibernate should do all the updating and writing from one thread and other threads should just be able to read from the database without too much synchronization effort.
Will the JDBC threads see correct results if they read from the database shortly after persist() or merge() has been called, or could Hibernate still be holding unflushed updates, so that the other threads return stale database entries?

"Wrong" depends on the isolation level you set for your connection pool.
I think it can work if Hibernate and Spring share the same connection pool and you set the isolation level to SERIALIZABLE for all connections.
Long-running transactions will be the problem. If all your write operations are fast, you won't block. If you don't flush and commit updates quickly, the read operations will either have to block and wait or allow "dirty reads".

That depends. You're basically describing a race condition - if you want to make sure that your read-thread only reads after the write-thread has persisted, you will have to look into thread synchronization methodology.
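One way to picture that synchronization is a latch that the writer releases only after its commit. The sketch below is an assumption-laden illustration, not Hibernate code: the `ConcurrentHashMap` stands in for the database, the `put` stands in for persist() plus flush and commit, and all names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class ReadAfterCommitSketch {
    // Returns what the reader sees once the writer has signalled its commit.
    static String readAfterCommit() {
        ConcurrentHashMap<String, String> fakeDb = new ConcurrentHashMap<>(); // stand-in for the database
        CountDownLatch committed = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            fakeDb.put("row-1", "committed"); // stand-in for persist() + flush + commit
            committed.countDown();            // signal only after the commit
        });
        writer.start();

        try {
            // The reader blocks here instead of racing the writer.
            if (!committed.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writer did not commit in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return fakeDb.get("row-1");
    }

    public static void main(String[] args) {
        System.out.println(readAfterCommit());
    }
}
```

In a real application the signal would have to be sent after the transaction has actually committed, not merely after persist() returns.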

Related

Is an attached Record thread-safe?

Is an attached jOOQ Record (UpdatableRecord) thread-safe, i.e. can I attach (fetch) a Record in one thread, and store it later in another thread without negative effects? Should I detach it in the original thread and attach it back in the new thread?
I know about the jOOQ manual page about thread-safety of the DSLContext. I'm using the Spring Boot Autoconfiguration of jOOQ, so that should all be thread-safe (with Spring's DataSourceTransactionManager and Hikari pooling).
But the following questions remain:
How does an attached Record behave when a transaction in the original thread is opened, and store() is called in another thread either before or after the original transaction has been committed? Does jOOQ open a new connection every time for each operation?
Would the attached Record be keeping a connection open across threads, which might then lead to resource leaks?

A jOOQ record is not thread safe. It is a simple mutable container backed by an ordinary Object[]. As such, all the usual issues may arise when sharing mutable state across threads.
But your question isn't really about the thread safety of the record.
"How does an attached Record behave when a transaction in the original thread is opened, and store() is called in another thread either before or after the original transaction has been committed? Does jOOQ open a new connection every time for each operation?"
This has nothing to do with Record, but how you configure jOOQ's ConnectionProvider. jOOQ doesn't hold a connection or even open one. You do that, explicitly, or implicitly, by passing jOOQ a connection via a ConnectionProvider (probably via some Spring configured DataSource). jOOQ will, for each database interaction, acquire() a connection, and release() it again after the interaction. The Record doesn't know how this connection is obtained. It just runs jOOQ queries that acquire and release connections.
In fact, jOOQ doesn't even really care about your transactions (unless you're using jOOQ's transaction API, but you aren't).
"Would the attached Record be keeping a connection open across threads, which might then lead to resource leaks?"
No, a Record is "attached" to a Configuration, not a connection. That Configuration contains a ConnectionProvider, which does whatever you configured it to do.

How to properly implement Optimistic Locking at the application layer?

I am a little confused as to why Optimistic Locking is actually safe. If I am checking the version at the time of retrieval with the version at the time of update, it seems like I can still have two requests enter the update block if the OS issues an interrupt and swaps the processes before the commit actually occurs. For example:
latestVersion = vehicle.getVersion();
if (vehicle.getVersion() == latestVersion) {
    // update record in database
} else {
    // don't update record
}
In this example, I am trying to manually use Optimistic Locking in a Java application without using JPA / Hibernate. However, it seems like two requests can enter the if block at the same time. Can you please help me understand how to do this properly? For context, I am also using Java Design Patterns website as an example.

Well... that's the optimistic part. The optimism is that it is safe. If you have to be certain it's safe, then that's not optimistic.
The example you show definitely is susceptible to a race condition. Not only because of thread scheduling, but also due to transaction isolation level.
A simple read in MySQL, in the default transaction isolation level of REPEATABLE READ, will read the data that was committed at the time your transaction started.
Whereas updating data will act on the data that is committed at the time of the update. If some other concurrent session has updated the row in the database in the meantime, and committed it, then your update will "see" the latest committed row, not the row viewed by your get method.
The way to avoid the race condition is to not be optimistic. Instead, force exclusive access to the record. Doveryai, no proveryai ("trust, but verify").
If you only have one app instance, you might use a critical section for this.
If you have multiple app instances, critical sections cannot coordinate other instances, so you need to coordinate in the database. You can do this by using pessimistic locking. Either read the record using a locking read query, or else you can use MySQL's user-defined locks.
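For comparison, if you do want to stay optimistic, the usual way to close the gap between the check and the write is to make the version check part of the write itself. In SQL that is typically an `UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?` followed by a look at the affected row count. The sketch below only imitates that idea in memory: `AtomicReference.compareAndSet` stands in for the conditional update, and the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for a single row with a version column.
final class VersionedVehicle {
    static final class Row {
        final long version;
        final String owner;
        Row(long version, String owner) {
            this.version = version;
            this.owner = owner;
        }
    }

    private final AtomicReference<Row> row = new AtomicReference<>(new Row(0, "nobody"));

    Row read() {
        return row.get();
    }

    // Plays the role of the conditional UPDATE: the check and the write happen
    // as one atomic step. Returns true if "one row was affected", false if
    // another writer changed the row after we read it.
    boolean updateIfUnchanged(Row seen, String newOwner) {
        return row.compareAndSet(seen, new Row(seen.version + 1, newOwner));
    }

    public static void main(String[] args) {
        VersionedVehicle vehicle = new VersionedVehicle();
        Row seenByA = vehicle.read();
        Row seenByB = vehicle.read();
        System.out.println(vehicle.updateIfUnchanged(seenByA, "alice")); // first writer wins
        System.out.println(vehicle.updateIfUnchanged(seenByB, "bob"));   // stale read is rejected
    }
}
```

The caller that gets `false` has to re-read and retry or report a conflict; nothing is overwritten silently.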

Single transaction across multiple threads solution

As I understand it, all transactions are Thread-bound (i.e. with the context stored in ThreadLocal). For example if:
I start a transaction in a transactional parent method
Make database insert #1 in an asynchronous call
Make database insert #2 in another asynchronous call
Then that will yield two different transactions (one for each insert) even though they shared the same "transactional" parent.
For example, let's say I perform two inserts (and using a very simple sample, i.e. not using an executor or completable future for brevity, etc.):
@Transactional
public void addInTransactionWithAnnotation() {
    addNewRow();
    addNewRow();
}
This will perform both inserts, as desired, as part of the same transaction.
However, if I wanted to parallelize those inserts for performance:
@Transactional
public void addInTransactionWithAnnotation() {
    new Thread(this::addNewRow).start();
    new Thread(this::addNewRow).start();
}
Then each one of those spawned threads will not participate in the transaction at all because transactions are Thread-bound.
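The effect can be reproduced without any framework. The sketch below assumes, as stated above, that the transaction context lives in a ThreadLocal; `CURRENT_TX` and the class name are hypothetical stand-ins, not Spring API.

```java
final class ThreadBoundContextSketch {
    // Hypothetical stand-in for a framework's thread-bound transaction context.
    private static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    // Returns { value seen by the parent thread, value seen by the child thread }.
    static String[] observe() {
        String[] seen = new String[2];
        CURRENT_TX.set("tx-1"); // "begin transaction" on the calling thread
        try {
            seen[0] = CURRENT_TX.get();
            Thread child = new Thread(() -> seen[1] = CURRENT_TX.get());
            child.start();
            child.join(); // join makes the child's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            CURRENT_TX.remove();
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println("parent sees: " + seen[0]);
        System.out.println("child sees: " + seen[1]);
    }
}
```

The child thread gets its own, empty slot, which is why work started there does not join the parent's transaction.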
Key Question: Is there a way to safely propagate the transaction to the child threads?
The only solutions I've thought of to solve this problem:
Use JTA or some XA manager, which by definition should be able to do this. However, I ideally don't want to use XA for my solution because of its overhead.
Pipe all of the transactional work I want performed (in the above example, the addNewRow() function) to a single thread, and do all of the prior work in the multithreaded fashion.
Figuring out some way to leverage InheritableThreadLocal on the Transaction status and propagate it to the child threads. I'm not sure how to do this.
Are there any other possible solutions, even if they feel a bit like a workaround (like my solutions above)?

The JTA API has several methods that operate implicitly on the current Thread's Transaction, but it doesn't prevent you moving or copying a Transaction between Threads, or performing certain operations on a Transaction that's not bound to the current (or any other) Thread. This causes no end of headaches, but it's not the worst part...
For raw JDBC, you don't have a JTA Transaction at all. You have a JDBC Connection, which has its own ideas about transaction context. In which case, the transaction is Connection bound, not thread bound. Pass the Connection around and the tx goes with it. But Connections aren't necessarily threadsafe and are probably a performance bottleneck anyhow, so sharing one between multiple concurrent threads doesn't really help you. You likely need multiple Connections that think they are in the same Transaction, which means you need XA, since that's how the db identifies such cases. At which point you're back to JTA, but now with a JCA in the picture to handle the Connection management properly. In short, you've reinvented the JavaEE application server.
For frameworks that layer on JDBC e.g. ORMs like Hibernate, you have an additional complication: their abstractions are not necessarily threadsafe. So you can't have a Session that is bound to multiple Threads concurrently. But you can have multiple concurrent Sessions that each participate in the same XA transaction.
As usual it boils down to Amdahl's law. If the speedup you get from using multiple Connections per tx to allow for multiple concurrent Threads to share the db I/O work is large relative to what you get from batching, then the overhead of XA is worthwhile. If the speedup is in local computation and the db I/O is a minor concern, then a single Thread that handles the JDBC Connection and offloads non-IO computation work to a Thread pool is the way to go.

First, a clarification: if you want to speed up several inserts of the same kind, as your example suggests, you will probably get the best performance by issuing the inserts in the same thread and using some type of batch inserting. Depending on your DBMS there are several techniques available, look at:
Efficient way to do batch INSERTS with JDBC
What's the fastest way to do a bulk insert into Postgres?
As for your actual question, I would personally try to pipe all the work to a worker thread. It is the simplest option as you don't need to mess with either ThreadLocals or transaction enlistment/delistment. Furthermore, once you have your units of work in the same thread, if you are smart you might be able to apply the batching techniques above for better performance.
Lastly, piping work to worker threads does not mean that you must have a single worker thread, you could have a pool of workers and achieve some parallelism if it is really beneficial to your application. Think in terms of producers/consumers.
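A rough sketch of that producer/consumer shape, under the assumption that preparing a row is the parallel part and writing it is the transactional part. The "inserts" here only append to an in-memory list; the comments mark where a real transaction would begin and commit, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SingleWriterSketch {
    // Returns, per writer thread name, the rows that thread "inserted".
    static Map<String, List<String>> run(int rows) {
        ExecutorService workers = Executors.newFixedThreadPool(4);      // parallel preparation
        ExecutorService writer = Executors.newSingleThreadExecutor();   // the only thread that writes
        Map<String, List<String>> insertedByThread = new ConcurrentHashMap<>();
        try {
            List<Future<String>> prepared = new ArrayList<>();
            for (int i = 0; i < rows; i++) {
                final int n = i;
                prepared.add(workers.submit(() -> "row-" + n)); // stand-in for expensive computation
            }
            Future<?> tx = writer.submit(() -> {
                // A real application would begin the transaction here.
                for (Future<String> f : prepared) {
                    String row = f.get(); // wait for the prepared unit of work
                    insertedByThread
                        .computeIfAbsent(Thread.currentThread().getName(), k -> new ArrayList<>())
                        .add(row);        // stand-in for the insert (or addBatch)
                }
                // ... and commit here, on the same thread.
                return null;
            });
            tx.get();
            return insertedByThread;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
            writer.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

Because every insert runs on the writer thread, a thread-bound transaction started there covers all of them.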

Singleton or Connection pool for high perfs?

Context
I have a RESTful API for a versus fighting game, using JAX-RS, tomcat8 and Neo4j embedded.
I've realized that a lot of queries will have to run in a short time. I'm using the embedded mode for faster queries, but I still want to go as fast as possible.
Problem
In fact, the problem is a bit different but not that much.
Currently, I'm using a Singleton with a getDabatase() method that returns the current GraphDatabaseService instance to begin a transaction. Once it's done, the transaction is closed, and that's all.
I don't know whether the best solution for optimal performance is a Singleton pattern or a pool (creating XX database connection instances and reusing them when a database operation is finished).
I can't really test it myself, because I don't have enough connections to tell which one is faster (and better overall).
Also, if I create a pool of GraphDatabaseService instances, will they all be able to access the same data without getting blocked by the lock?

Create only one GraphDatabaseService instance and use it everywhere. There is no need to create a pool of instances. GraphDatabaseService is completely thread-safe, so you don't need to worry about concurrency (note: transactions are thread-bound, so you can't run multiple transactions in the same thread).
All operations in Neo4j should be executed in a transaction. On commit, the transaction is written to the transaction log and then persisted into the database. General rules are:
Always close transaction as early as possible (use try-with-resource)
Close all resources as early as possible (ResourceIterator returned by findNodes() and execute())
Here you can find information about locking strategy.
To be sure that you have best performance, you should:
Check database settings (memory mapping)
Check OS settings (file system)
Check JVM settings (GC, heap size)
Data model
Here you can find some articles about Neo4j configuration & optimizations. All of them have useful information.

Use a pool - definitely.
Creating a database connection is generally very expensive. Using a pool will ensure that connections are kept for a reasonable amount of time and re-used whenever possible.

Will making a method synchronized ensure that it is thread safe?

I have a method in which some database insert operations happen using Hibernate, and I want them to be thread safe. The method receives some data as parameters, and it is possible that two calls are sometimes made with the same data at the same time.
I can't lock those tables because of the performance degradation. Will making the method synchronized solve the issue?

Synchronizing a method will ensure that it can only be accessed by one thread at a time. If this method is your only means of writing to the database, then yes, this will stop two threads from writing at the same time. However, you still have to deal with the fact that you have multiple insert operations with the same data.
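As a rough illustration of that last point, the sketch below serializes callers with synchronized and additionally remembers which keys it has already written. It is an in-memory stand-in with hypothetical names, not Hibernate code, and it only helps inside a single JVM; a unique constraint in the database is the more robust guard.

```java
import java.util.HashSet;
import java.util.Set;

final class DedupingWriter {
    private final Set<String> seenKeys = new HashSet<>();
    private int inserts;

    // synchronized serializes the callers; the Set is what rejects the duplicate.
    synchronized boolean insertOnce(String key) {
        if (!seenKeys.add(key)) {
            return false; // same data arrived twice, skip the second insert
        }
        inserts++;        // stand-in for the actual insert
        return true;
    }

    synchronized int insertCount() {
        return inserts;
    }

    public static void main(String[] args) {
        DedupingWriter writer = new DedupingWriter();
        System.out.println(writer.insertOnce("order-42"));
        System.out.println(writer.insertOnce("order-42"));
        System.out.println(writer.insertCount());
    }
}
```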

You should let Hibernate handle the concurrency, that's what it is meant to do. Don't assume Hibernate will lock anything: it supports optimistic transactions for exactly this purpose. Quote from the above link:
The only approach that is consistent with high concurrency and high scalability, is optimistic concurrency control with versioning. Version checking uses version numbers, or timestamps, to detect conflicting updates and to prevent lost updates. Hibernate provides three possible approaches to writing application code that uses optimistic concurrency.

Database Concurrency is handled by transactions. Transactions have the Atomic Consistent Isolated Durable (ACID) properties. They provide isolation between programs accessing a database concurrently. In the Hibernate DAO template of spring framework there are single line methods for CRUD operations on the database. When used individually these don't need to be synchronized by method. Spring provides declarative (XML), programmatic and annotation meta-data driven transaction management if you need to declare "your method" as transactional with specific propagation settings, rollbackFor settings, isolation settings. So in "your method" you can do multiple save,update,deletes etc and the ORM will ensure that it is executed with the transaction settings you have given in the meta-data.
Another issue is that the thread has to hold the lock on all the objects that take part in the transaction. Otherwise the transaction might fail or the ORM will persist stale data. In another situation it can result in a deadlock because of lock ordering. I think this is what really answers your question.
Both objects a and b have an instance variable of the type Lock. A boolean flag can be used to indicate the success of the transaction. The client code can retry the same transaction if it fails.
if (a.lock.tryLock()) {
    try {
        if (b.lock.tryLock()) {
            try {
                // persist or update object a and b
            } finally {
                b.lock.unlock();
            }
        }
    } finally {
        a.lock.unlock();
    }
}
The problem with using synchronized methods is that it locks up the entire Service or DAO class making other service methods unavailable to other threads. By using individual locks on objects we can gain the advantage of fine grained concurrency.
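A self-contained variant of the snippet above, with the boolean success flag made explicit so that the caller can retry. The `Entity` class and the assignment that stands in for persisting are assumptions for illustration only.

```java
import java.util.concurrent.locks.ReentrantLock;

final class LockedPairUpdate {
    // Hypothetical entity carrying its own lock, as described above.
    static final class Entity {
        final ReentrantLock lock = new ReentrantLock();
        int value;
    }

    // Returns true if both locks were acquired and the update ran,
    // false if either lock was busy (the caller may retry).
    static boolean tryUpdateBoth(Entity a, Entity b, int newValue) {
        if (!a.lock.tryLock()) {
            return false;
        }
        try {
            if (!b.lock.tryLock()) {
                return false;
            }
            try {
                a.value = newValue; // stand-in for persisting or updating a
                b.value = newValue; // stand-in for persisting or updating b
                return true;
            } finally {
                b.lock.unlock();
            }
        } finally {
            a.lock.unlock();
        }
    }

    public static void main(String[] args) {
        Entity a = new Entity();
        Entity b = new Entity();
        System.out.println(tryUpdateBoth(a, b, 7));
    }
}
```

Because tryLock() never blocks, two threads that take the locks in opposite order give up instead of deadlocking.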

No. The method probably calls other methods and objects that may not be thread safe. synchronized only ensures that one thread at a time can hold the monitor of the method's object, so it makes the method thread safe only with respect to that object.
If you are sure that all other threads reach the shared functionality only through this method, then making it synchronized may be sufficient.

Choosing the best strategy depends on the architecture. Sometimes a trick like method synchronization looks like the easier way to get things working, but it is a bad approach.
There is no doubt that you should use transactions, and if you face performance issues with that strategy, you should optimize your database queries or database structure.
Please remember that a synchronized section should be as small and as atomic as possible.
