Lock Ordering in C3p0 - java

I am trying to log the creation and destruction of database connections in our application using c3p0's ConnectionCustomizer. In it, I have some code that looks like this:
log(C3P0Registry.getPooledDataSources())
I'm running into deadlocks. I'm discovering that c3p0 has at least a couple of objects in its library that use synchronized methods and don't seem to specify their intended lock ordering. When I log the connections, I'm holding a lock on C3P0Registry and eventually on PoolBackedDataSource (simply building a list of the datasources calls hashCode(), which acquires the lock).
Shutting down the connection provider (calling C3P0ConnectionProvider.close()) acquires the locks in the opposite order. But while the child datasources are being shut down, my logging is triggered. The result is a deadlock.
It seems like both calls I am making into the c3p0 library are valid, expected calls:
C3P0ConnectionProvider.close()
C3P0Registry.getPooledDataSources()
It also seems like (unless explicitly stated in the documentation) it should be the library's responsibility to manage its own locking strategy. (I don't say this to blame anyone... just to confirm my understanding of best practices.)
How should I deal with this issue? Since c3p0 uses synchronized methods rather than a more modern mechanism, I can't really test the locks.
From my DataSource closing code, I could first grab the C3P0Registry lock before closing the DataSource. But I would be guessing at the correct lock order, which I'm not sure I feel comfortable with.
I don't think I could reverse the lock order for the logging call. I need the C3P0Registry to get the list of DataSources, so I couldn't lock the DataSources without first locking C3P0Registry to get references to them.
Another solution, of course is to provide another, higher level lock above everything c3p0. In the case of a connection pool, that seems to defeat the point.
For now, I'm rolling back my logging. Thanks for any help.

I don't know how to fix the locking issue, but I think you should take a step back here and think about the original problem.
"I am trying to log the creation and destruction of database connections in our application ..."
I would recommend the following.
Create a class and make it implement javax.sql.DataSource.
Create a field of the same type and delegate all methods to it.
In the getConnection() method, return your own Connection class wrapping around java.sql.Connection, and so on.
Then wrap this class around your original data source.
In your classes you can now simply create a logger and log all actions you want to see in your log.
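A minimal sketch of that wrapper, assuming java.util.logging (the class name LoggingDataSource is just illustrative; only getConnection() does anything interesting, the rest is pure delegation):

import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;
import javax.sql.DataSource;

public class LoggingDataSource implements DataSource {
    private static final Logger LOG = Logger.getLogger(LoggingDataSource.class.getName());
    private final DataSource delegate;

    public LoggingDataSource(DataSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public Connection getConnection() throws SQLException {
        Connection connection = delegate.getConnection();
        LOG.info("Connection obtained: " + connection);
        // Wrap the returned Connection as well if you also want to log close().
        return connection;
    }

    @Override
    public Connection getConnection(String username, String password) throws SQLException {
        Connection connection = delegate.getConnection(username, password);
        LOG.info("Connection obtained: " + connection);
        return connection;
    }

    // Pure delegation for the rest of the interface:
    @Override public PrintWriter getLogWriter() throws SQLException { return delegate.getLogWriter(); }
    @Override public void setLogWriter(PrintWriter out) throws SQLException { delegate.setLogWriter(out); }
    @Override public void setLoginTimeout(int seconds) throws SQLException { delegate.setLoginTimeout(seconds); }
    @Override public int getLoginTimeout() throws SQLException { return delegate.getLoginTimeout(); }
    @Override public <T> T unwrap(Class<T> iface) throws SQLException { return delegate.unwrap(iface); }
    @Override public boolean isWrapperFor(Class<?> iface) throws SQLException { return delegate.isWrapperFor(iface); }
    @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException { return delegate.getParentLogger(); }
}

Wrap it around the original pool, e.g. new LoggingDataSource(pooledDataSource), and the logging code never has to take any lock inside c3p0.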

Related

Is an attached Record thread-safe?

Is an attached jOOQ Record (UpdatableRecord) thread-safe, i.e. can I attach (fetch) a Record in one thread, and store it later in another thread without negative effects? Should I detach it in the original thread and attach it back in the new thread?
I know about the jOOQ manual page about thread-safety of the DSLContext. I'm using the Spring Boot Autoconfiguration of jOOQ, so that should all be thread-safe (with Spring's DataSourceTransactionManager and Hikari pooling).
But the following questions remain:
How does an attached Record behave when a transaction in the original thread is opened, and store() is called in another thread either before or after the original transaction has been committed? Does jOOQ open a new connection every time for each operation?
Would the attached Record be keeping a connection open across threads, which might then lead to resource leaks?
A jOOQ record is not thread safe. It is a simple mutable container backed by an ordinary Object[]. As such, all the usual issues may arise when sharing mutable state across threads.
But your question isn't really about the thread safety of the record.
How does an attached Record behave when a transaction in the original thread is opened, and store() is called in another thread either before or after the original transaction has been committed? Does jOOQ open a new connection every time for each operation?
This has nothing to do with Record, but how you configure jOOQ's ConnectionProvider. jOOQ doesn't hold a connection or even open one. You do that, explicitly or implicitly, by passing jOOQ a connection via a ConnectionProvider (probably via some Spring-configured DataSource). jOOQ will, for each database interaction, acquire() a connection, and release() it again after the interaction. The Record doesn't know how this connection is obtained. It just runs jOOQ queries that acquire and release connections.
In fact, jOOQ doesn't even really care about your transactions (unless you're using jOOQ's transaction API, but you aren't).
Would the attached Record be keeping a connection open across threads, which might then lead to resource leaks?
No, a Record is "attached" to a Configuration, not a connection. That Configuration contains a ConnectionProvider, which does whatever you configured it to do.
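If you want to see that cycle for yourself, a hypothetical logging ConnectionProvider (not part of jOOQ, just a sketch wrapping whatever DataSource you already configure) makes the acquire()/release() calls visible:

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;
import org.jooq.ConnectionProvider;
import org.jooq.exception.DataAccessException;

public class LoggingConnectionProvider implements ConnectionProvider {
    private final DataSource dataSource;

    public LoggingConnectionProvider(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Connection acquire() throws DataAccessException {
        try {
            Connection connection = dataSource.getConnection();
            System.out.println("acquired " + connection);   // called once per jOOQ interaction
            return connection;
        } catch (SQLException e) {
            throw new DataAccessException("Could not acquire connection", e);
        }
    }

    @Override
    public void release(Connection connection) throws DataAccessException {
        try {
            System.out.println("released " + connection);    // called right after the interaction
            connection.close();                              // hands the connection back to the pool
        } catch (SQLException e) {
            throw new DataAccessException("Could not release connection", e);
        }
    }
}

Every store(), refresh(), etc. on an attached Record goes through this pair of calls, which is why the Record itself never keeps a connection open across threads.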

Singleton or Connection pool for high perfs?

Context
I have a RESTful API for a versus fighting game, using JAX-RS, tomcat8 and Neo4j embedded.
Today I figured that a lot of queries will be done in a limited time, I'm using embedded for faster queries but I still want to go as fast as possible.
Problem
In fact, the problem is a bit different but not that much.
Actually, I'm using a Singleton with a getDatabase() method returning the current GraphDatabaseService instance to begin a transaction; once it's done, the transaction is closed... and that's all.
I don't know if the best solution for optimal performance is a Singleton pattern or a pool (like creating XX instances of database connections, and reusing them when the database operation is finished).
I can't test it myself actually, because I don't have enough connections to even know which one is the fastest (and the best overall).
Also, I wonder: if I create a pool of GraphDatabaseService instances, will they all be able to access the same data without getting blocked by the lock?
Create only one GraphDatabaseService instance and use it everywhere. There is no need to create an instance pool for it. GraphDatabaseService is completely thread-safe, so you don't need to worry about concurrency (note: transactions are thread-bound, so you can't run multiple transactions in the same thread).
All operations in Neo4j should be executed in a transaction. On commit, the transaction is written to the transaction log and then persisted into the database. General rules are:
Always close the transaction as early as possible (use try-with-resources; see the sketch below)
Close all other resources as early as possible (e.g. the ResourceIterator returned by findNodes() and execute())
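As a minimal sketch of both rules, assuming the Neo4j 3.x embedded API (imports from org.neo4j.graphdb; graphDb is your single GraphDatabaseService instance, and the Person label is just an example):

try (Transaction tx = graphDb.beginTx();
     ResourceIterator<Node> people = graphDb.findNodes(Label.label("Person"))) {
    while (people.hasNext()) {
        Node person = people.next();
        // ... read or update the node ...
    }
    tx.success();   // mark for commit; the commit happens when the transaction is closed
}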
Here you can find information about locking strategy.
To be sure that you have the best performance, you should:
Check database settings (memory mapping)
Check OS settings (file system)
Check JVM settings (GC, heap size)
Check your data model
Here you can find some articles about Neo4j configuration & optimizations. All of them have useful information.
Use a pool - definitely.
Creating a database connection is generally very expensive. Using a pool will ensure that connections are kept for a reasonable amount of time and re-used whenever possible.

Java logging across multiple threads

We have a system that uses threading so that it can handle different bits of functionality in parallel. We would like to find a way to tie all log entries for a particular "transaction" together. Normally, one might use 'threadName' to gather these together, but clearly that fails once a transaction spans multiple threads.
Short of passing a 'transaction key' down through every method call, I can't see a way to tie these together. And passing a key into every single method is just ugly.
Also, we're kind of tied to Java logging, as our system is built on a modified version of it. So, I would be interested in other platforms for examples of what we might try, but switching platforms is highly unlikely.
Does anyone have any suggestions?
Thanks,
Peter
EDIT: Unfortunately, I don't have control over the creation of the threads as that's all handled by a workflow package. Otherwise, the idea of caching the ID once for each thread (on ThreadLocal maybe?) then setting that on the new threads as they are created is a good idea. I may try that anyway.
You could consider creating a globally-accessible Map that maps a Thread's name to its current transaction ID. Upon beginning a new task, generate a GUID for that transaction and have the Thread register itself in the Map. Do the same for any Threads it spawns to perform the same task. Then, when you need to log something, you can simply lookup the transaction ID from the global Map, based on the current Thread's name. (A bit kludgy, but should work)
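A rough sketch of that idea, using a ConcurrentHashMap (the class and method names here are made up for illustration):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class TransactionRegistry {
    private static final ConcurrentMap<String, String> TX_BY_THREAD = new ConcurrentHashMap<>();

    public static void register(String transactionId) {
        TX_BY_THREAD.put(Thread.currentThread().getName(), transactionId);
    }

    public static String current() {
        return TX_BY_THREAD.get(Thread.currentThread().getName());
    }

    public static void unregister() {
        TX_BY_THREAD.remove(Thread.currentThread().getName());
    }
}

On starting a task you would call TransactionRegistry.register(UUID.randomUUID().toString()), have any spawned worker threads register the same ID, and have the logging code look it up via TransactionRegistry.current(). Remember to unregister, especially if threads come from a pool.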
This is a perfect example for AspectJ crosscuts. If you know the methods that are being called you can put interceptors on them and bind dynamically.
This article will give you several options: http://www.ibm.com/developerworks/java/library/j-logging/
However, you mentioned that your transaction spans more than one thread. Take a look at how log4j copes with binding additional information to the current thread with the MDC and NDC classes. It uses ThreadLocal as you were advised before, but the interesting thing is how log4j injects data into log messages.
// In the code:
MDC.put("RemoteAddress", req.getRemoteAddr());
// In the layout pattern of the configuration file, add the following:
%X{RemoteAddress}
Details:
http://onjava.com/pub/a/onjava/2002/08/07/log4j.html?page=3
http://wiki.apache.org/logging-log4j/NDCvsMDC
How about naming your threads to include the transaction ID? Quick and Dirty, admittedly, but it should work (until you need the thread name for something else or you start reusing threads in a thread pool).
If you are logging, then you must have some kind of logger object. You should have a separate instance in each thread.
Add a method to it called setID(String id).
When it is initialized in your thread, set a unique ID using that method.
Prepend the set ID to each log entry.
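A rough sketch of that wrapper (the TaggedLogger class is hypothetical, not part of java.util.logging):

import java.util.logging.Logger;

public class TaggedLogger {
    private final Logger delegate;
    private volatile String id = "";

    public TaggedLogger(Logger delegate) {
        this.delegate = delegate;
    }

    public void setID(String id) {
        this.id = id;
    }

    public void info(String message) {
        delegate.info("[" + id + "] " + message);   // prepend the ID to every entry
    }
}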
A couple people have suggested answers that have the newly spawned thread somehow knowing what the transaction ID is. Unless I'm missing something, in order to get this ID into the newly spawned thread, I would have to pass it all the way down the line into the method that spawns the thread, which I'd rather not do.
I don't think you need to pass it down, but rather the code responsible for handing work to these threads needs to have the transactionID to pass. Wouldn't the work-assigner have this already?

java methods and race condition in a jsp/servlets application

Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering if it is possible that a race condition will occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
What I have noticed is that many methods in my application may lead to a race condition or dirty read/write. For example, I have a poll system, and for each voting operation a certain method will change a single cell value for that poll, like the following:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/servlets app solve these issues by itself, or do I have to solve all that myself?
Thanks.
It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
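For instance, a hedged sketch of the counter case (the servlet and field names are invented for illustration):

import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class VoteServlet extends HttpServlet {
    // One servlet instance serves many request threads, so the counter must be atomic.
    private final AtomicInteger votes = new AtomicInteger();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        int total = votes.incrementAndGet();   // atomic read-modify-write, no synchronized needed
        resp.getWriter().print(total);
    }
}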
The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's a lot more going on when you 'write to a database', and you need to decide how you're going to handle it. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.
The Servlet API will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea, because you are basically forcing your threads to be processed one at a time, and that ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide thread-local database connections and the ability to specify transactions. You should definitely check out one of those.
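For the Spring flavour, a hedged sketch (the bean, table and method names are made up):

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class VoteService {
    private final JdbcTemplate jdbcTemplate;

    public VoteService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Transactional   // Spring binds one connection/transaction to the current thread
    public void vote(long pollId, String choice) {
        jdbcTemplate.update(
            "UPDATE poll SET votes = votes + 1 WHERE poll_id = ? AND choice = ?",
            pollId, choice);
    }
}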
Case 1: your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, so it's not enough for your app to deal only with contention within itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The Servlet API and JDBC connection pooling give you some helpful guarantees, so you can write your servlet code without using Java synchronisation, provided your variables are in method scope. In concept you have:
Start transaction (perhaps implicit, perhaps on entry to an EJB)
Get connection to DB (gets you a connection from the pool, associated with your transaction)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again, maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: a static method, which presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) implies a single JVM and you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data then using a DB is probably better.
Or, how about a completely different approach: the servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other process pick up the votes from the queue and add them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual processing, which in similar scenarios can be quite complex.
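A rough sketch of that idea with plain JMS (the queue name and message format are invented; note these are javax.jms types, not java.sql):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

public void recordVote(ConnectionFactory factory, long pollId, String choice) throws JMSException {
    Connection connection = factory.createConnection();
    try {
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Queue queue = session.createQueue("poll.votes");
        MessageProducer producer = session.createProducer(queue);
        producer.send(session.createTextMessage(pollId + ":" + choice));
    } finally {
        connection.close();   // closing the connection also closes its sessions and producers
    }
}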
I think that the best solution for your problem is to use something like the synchronized keyword and wait/notify!

How many JDBC connections in Java?

I have a Java program consisting of about 15 methods, and these methods get invoked very frequently during the execution of the program. At the moment, I am creating a new connection in every method and invoking statements on it (the database is set up on another machine on the network).
What I would like to know is: should I create only one connection in the main method and pass it as an argument to all the methods that require a connection object, instead of creating and closing connections very frequently in every method? It would significantly reduce the number of connection objects in the program.
I suspect I am not using the resources very efficiently with the current design, and there is a lot of scope for improvement, considering that this program might grow a lot in the future.
Yes, you should consider re-using connections rather than creating a new one each time. The usual procedure is:
make some guess as to how many simultaneous connections your database can sensibly handle (e.g. start with 2 or 3 per CPU on the database machine until you find out that this is too few or too many -- it'll tend to depend on how disk-bound your queries are)
create a pool of this many connections: essentially a class that you can ask for "the next free connection" at the beginning of each method and then "pass back" to the pool at the end of each method
your getFreeConnection() method needs to return a free connection if one is available, else either (1) create a new one, up to the maximum number of connections you've decided to permit, or (2) if the maximum are already created, wait for one to become free
I'd recommend the Semaphore class to manage the connections; I actually have a short article on my web site on managing a resource pool with a Semaphore, with an example I think you could adapt to your purpose (a bare-bones sketch follows the practical considerations below)
A couple of practical considerations:
For optimum performance, you need to be careful not to "hog" a connection while you're not actually using it to run a query. If you take a connection from the pool once and then pass it to various methods, you need to make sure you're not accidentally doing this.
Don't forget to return your connections to the pool! (try/finally is your friend here...)
On many systems, you can't keep connections open 'forever': the O/S will close them after some maximum time. So in your 'return a connection to the pool' method, you'll need to think about 'retiring' connections that have been around for a long time (build in some mechanism for remembering, e.g. by having a wrapper object around an actual JDBC Connection object that you can use to store metrics such as this)
You may want to consider using prepared statements.
Over time, you'll probably need to tweak the connection pool size
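As promised, a bare-bones sketch of such a pool along the lines of the Semaphore approach (fixed size, no connection retirement, error handling trimmed; class and method names are illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

public class SimpleConnectionPool {
    private final BlockingQueue<Connection> free = new LinkedBlockingQueue<>();
    private final Semaphore available;

    public SimpleConnectionPool(String url, String user, String password, int size) throws SQLException {
        this.available = new Semaphore(size, true);
        for (int i = 0; i < size; i++) {
            free.add(DriverManager.getConnection(url, user, password));
        }
    }

    public Connection getFreeConnection() throws InterruptedException {
        available.acquire();          // blocks while all connections are in use
        return free.poll();
    }

    public void returnConnection(Connection connection) {
        free.add(connection);         // always call this from a finally block
        available.release();
    }
}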
You can either pass in the connection or, better yet, use something like Jakarta Database Connection Pooling.
http://commons.apache.org/dbcp/
You should use a connection pool for that.
That way you could ask for a connection, and release it back to the pool when you are finished with it.
If another thread wants a connection and that one is in use, a new one could be created. If no other thread is using a connection, the same one could be re-used.
This way you can leave your app more or less the way it is (and not pass the connection all around) and still use the resources properly.
Unfortunately, first-class connection pools are not very easy to use in standalone applications (they are the default in application servers). Probably a microcontainer (such as Spring) or a good framework (such as Hibernate) could let you use one.
They are not too hard to code from scratch, though.
:)
This Google search will help you find out more about how to use one. Skim through the results.
Many JDBC drivers do connection pooling for you, so there is little advantage doing additional pooling in this case. I suggest you check the documentation for your JDBC driver.
Another approach to connection pools is to:
Have one connection for all database access, with synchronised access. This doesn't allow concurrency but is very simple.
Store the connections in a ThreadLocal variable (override initialValue()). This works well if there is a small, fixed number of threads.
Otherwise, I would suggest using a connection pool.
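A small sketch of the ThreadLocal option (imports from java.sql; the JDBC URL and credentials are placeholders):

private static final ThreadLocal<Connection> CONNECTION = new ThreadLocal<Connection>() {
    @Override
    protected Connection initialValue() {
        try {
            // Each thread lazily creates, and then keeps, its own connection.
            return DriverManager.getConnection("jdbc:yourdb://host/db", "user", "password");
        } catch (SQLException e) {
            throw new IllegalStateException("Could not open connection", e);
        }
    }
};

// anywhere on that thread:
Connection connection = CONNECTION.get();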
If your application is single-threaded, or does all its database operations from a single thread, it's ok to use a single connection. Assuming you don't need multiple connections for any other reason, this would be by far the simplest implementation.
Depending on your driver, it may also be feasible to share a connection between threads - this would be ok too, if you trust your driver not to lie about its thread-safety. See your driver documentation for more info.
Typically the objects below "Connection" cannot safely be used from multiple threads, so it's generally not advisable to share ResultSet, Statement objects etc between threads - by far the best policy is to use them in the same thread which created them; this is normally easy because those objects are not generally kept for too long.
