Hibernate faster EntityManagerFactory creation

In my desktop application new databases get opened quite often. I use Hibernate/JPA as an ORM.
The problem is that creating the EntityManagerFactory is quite slow, taking about 5-6 seconds on a fast machine. I know that the EntityManagerFactory is supposed to be heavyweight, but this is just too slow for a desktop application where the user expects the new database to be opened quickly.
Can I turn off some EntityManagerFactory features to get an instance
faster? Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?
Can I somehow create the EntityManagerFactory object before
knowing the database url? I would be happy to turn off all
validation for this to be possible.
By doing so, can I pool EntityManagerFactorys for later use?
Any other idea how to create the EntityManagerFactory faster?
Update with more Information and JProfiler profiling
The desktop application can open saved files. Our application's document file format consists of one SQLite database and some binary data in a ZIP file. When opening a document, the ZIP gets extracted and the database is opened with Hibernate. The databases all have the same schema, but obviously different data.
It seems that the first time I open a file it takes significantly longer than the following times.
I profiled the first and second run with JProfiler and compared the results.
1st Run:
create EMF: 4385ms
build EMF: 3090ms
EJB3Configuration configure: 900ms
EJB3Configuration <clinit>: 380ms
2nd Run:
create EMF: 1275ms
build EMF: 970ms
EJB3Configuration configure: 305ms
EJB3Configuration <clinit>: not visible, probably 0ms
In the Call tree comparison you can see that some methods are significantly faster (DatabaseManager. as starting point):
create EMF: -3120ms
Hibernate create EMF: -3110ms
EJB3Configuration configure: -595ms
EJB3Configuration <clinit>: -380ms
build EMF: -2120ms
buildSessionFactory: -1945ms
secondPassCompile: -425ms
buildSettings: -346ms
SessionFactoryImpl.<init>: -1040ms
The Hot spot comparison now has the interesting results:
ClassLoader.loadClass: -1686ms
XMLSchemaFactory.newSchema: -184ms
ClassFile.<init>: -109ms
I am not sure if it is the loading of Hibernate classes or my Entity classes.
A first improvement would be to create an EMF as soon as the application starts, just to initialize all necessary classes (I already ship an empty db file as a prototype with my application). @sharakan thank you for your answer, maybe a DeferredConnectionProvider would already be a solution for this problem.
I will try the DeferredConnectionProvider next! But we might be able to speed it up even further. Do you have any more suggestions?

You should be able to do this by implementing your own ConnectionProvider as a decorator around a real ConnectionProvider.
The key observation here is that the ConnectionProvider isn't used until an EntityManager is created (see the comment in supportsAggressiveRelease() for a caveat to that). So you can create a DeferredConnectionProvider class and use it to construct the EntityManagerFactory, then wait for user input and do the deferred initialization before actually creating any EntityManager instances. I've written this as a wrapper around ConnectionProviderImpl, but you should be able to use any other implementation of ConnectionProvider as the base.
public class DeferredConnectionProvider implements ConnectionProvider {
    private Properties configuredProps;
    private ConnectionProviderImpl realConnectionProvider;
    @Override
    public void configure(Properties props) throws HibernateException {
        // Only remember the properties; the real provider is built later.
        configuredProps = props;
    }
    public void finalConfiguration(String jdbcUrl, String userName, String password) {
        configuredProps.setProperty(Environment.URL, jdbcUrl);
        configuredProps.setProperty(Environment.USER, userName);
        configuredProps.setProperty(Environment.PASS, password);
        realConnectionProvider = new ConnectionProviderImpl();
        realConnectionProvider.configure(configuredProps);
    }
    private void assertConfigured() {
        if (realConnectionProvider == null) {
            throw new IllegalStateException("Not configured yet!");
        }
    }
    @Override
    public Connection getConnection() throws SQLException {
        assertConfigured();
        return realConnectionProvider.getConnection();
    }
    @Override
    public void closeConnection(Connection conn) throws SQLException {
        assertConfigured();
        realConnectionProvider.closeConnection(conn);
    }
    @Override
    public void close() throws HibernateException {
        assertConfigured();
        realConnectionProvider.close();
    }
    @Override
    public boolean supportsAggressiveRelease() {
        // This gets called during EntityManagerFactory construction, but it's
        // just a flag so you should be able to either do this, or return
        // true/false depending on the actual provider.
        return new ConnectionProviderImpl().supportsAggressiveRelease();
    }
}
A rough example of how to use it:
// Get an EntityManagerFactory with the following property set:
// properties.put(Environment.CONNECTION_PROVIDER, DeferredConnectionProvider.class.getName());
HibernateEntityManagerFactory factory = (HibernateEntityManagerFactory) entityManagerFactory;
// ...do user input of connection info...
SessionFactoryImpl sessionFactory = (SessionFactoryImpl) factory.getSessionFactory();
DeferredConnectionProvider connectionProvider = (DeferredConnectionProvider) sessionFactory.getSettings()
.getConnectionProvider();
connectionProvider.finalConfiguration(jdbcUrl, userName, password);
You could put the initial set up of the EntityManagerFactory on a separate thread or something, so that the user never has to wait for it. Then the only thing they'll wait for, after specifying the connection info, is the setting up of the connection pool, which should be fairly quick compared to parsing the object model.

Can I turn off some EntityManagerFactory features to get an instance faster?
Don't believe so. EMFs don't really have too many features, other than initializing a JDBC connection/pool.
Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?
Rather than creating the EMF lazily, when the user will notice the performance hit, I suggest you should head in the opposite direction - create the EMF proactively before the user actually needs it. Create it once, up-front, possibly in a separate thread during application initialisation (or at least as soon as you know about your database). Reuse it throughout the existence of your application/database.
Can I somehow create the EntityManagerFactory object before knowing the database url?
No - it creates a JDBC connection.
I think a better question is: why does your application dynamically discover database connection URLs? Are you saying your databases are created or made available on the fly and there's no way to anticipate the connection parameters in advance? That really is to be avoided.
By doing so, can I pool EntityManagerFactorys for later use?
No, you can't pool EMFs. It's the connections that you can pool.
Any other idea how to create the EntityManagerFactory faster?
I agree - 6 seconds is too slow for initialisation of EMFs.
I suspect it's more to do with your selected database technology than JPA/JDBC/JVM. My guess is that maybe your database is initialising itself as you connect. Are you using Access? What DB are you using?
Are you connecting to a database remotely located? Over a WAN? Is network speed/latency good?
Are the client PCs limited in performance?
EDIT: Added after comments
Implementing your own ConnectionProvider as a decorator around a real ConnectionProvider will not speed up the user's experience at all. The database instance still needs to be initialised, the EMF & EM created and the JDBC connection still needs to be subsequently established.
Options:
1. Share a common preloaded DB instance: seems not possible for your business scenario (although JSE technology supports this and also supports client-server design).
2. Change to a DB with a faster startup: Derby (a.k.a. Java DB, bundled with JDK 6 through 8) has a startup time of about 1.5 seconds (cold) and 0.7 seconds (warm, data pre-loaded).
3. In many (most?) scenarios, the fastest solution would be to load data directly into in-memory Java objects using JAXB with StAX. Subsequently, use in-memory cached data (particularly using smart structures like maps, hashing and ArrayLists). Just as JPA can map POJO classes to database tables and columns, so JAXB can map POJO classes to an XML schema and work with XML document instances. If you have very complex queries using SQL set-based logic with multiple joins and heavy use of DB indexes, this would be less desirable.
Option (2) would probably give the best improvement for limited effort.
Additionally:
- try to unzip the data files during deployment rather than during app usage.
- initialize the EMF in a startup thread that runs in parallel to the UI startup - try to start the DB initializing as one of the very first steps of the app (that means connecting to the actual instance using JDBC).
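The "create it up-front on a separate thread" advice can be sketched roughly as follows. This is only an illustration: `BackgroundWarmUp`, `WarmUpDemo` and `slowInit()` are made-up names, and a plain `String` stands in for the EntityManagerFactory so the sketch runs without Hibernate on the classpath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/**
 * Starts an expensive initialization in the background and hands out the
 * result later. T would be EntityManagerFactory in the scenario above; it is
 * left generic here (an assumption) so the sketch has no Hibernate dependency.
 */
final class BackgroundWarmUp<T> {
    private final CompletableFuture<T> future;

    BackgroundWarmUp(Supplier<T> expensiveInit) {
        // supplyAsync runs the supplier on a background thread
        // (by default a worker of the common ForkJoinPool).
        this.future = CompletableFuture.supplyAsync(expensiveInit);
    }

    /** Blocks only if the background initialization has not finished yet. */
    T get() {
        return future.join();
    }
}

final class WarmUpDemo {
    /** Hypothetical stand-in for Persistence.createEntityManagerFactory(...). */
    static String slowInit() {
        try {
            Thread.sleep(200); // simulate class loading and mapping work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "factory-ready";
    }

    public static void main(String[] args) {
        BackgroundWarmUp<String> warmUp = new BackgroundWarmUp<>(WarmUpDemo::slowInit);
        // ... show the UI, let the user pick a file ...
        System.out.println(warmUp.get());
    }
}
```

If the user opens a document before the warm-up has finished, `get()` simply waits for the remainder, so the worst case is no slower than creating the factory on demand.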

Related

JUnit test for a method that contains SQL queries

I have an old Java project (no frameworks/build tools used) that has a class full of SQL methods and corresponding Bean-classes. The SQL methods mostly use SELECT, INSERT and UPDATE queries like this:
public static void sqlUpdateAge(Connection dbConnection, int age, int id) {
PreparedStatement s = null;
ResultSet r = null;
String sql = "UPDATE person SET age = ? WHERE id = ?";
try {
s = dbConnection.prepareStatement(sql);
s.setInt(1, age);
s.setInt(2, id);
s.addBatch();
s.executeBatch();
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
if (r != null)
r.close();
if (s != null)
s.close();
} catch (SQLException e) {
e.printStackTrace();
}
}
}
What is the best practice in unit testing when it comes to SQL queries?
The easiest way I can think of, would be to use my development database; just call the sqlUpdateAge() in the test class, query the database for a result set and assertTrue that the set age is in the result set. However, this would fill up the development database with unnecessary data, and I would like to avoid that.
Is the solution to create a so-called in-memory database or somehow rollback the changes I made?
If I need an in-memory database:
Where and how would I create it? Straight to the test class, or perhaps to a config file?
How do I pass it to the sqlUpdateAge() method?
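For the "somehow rollback the changes I made" option, one possible shape is a small helper that runs the test body in a transaction and always rolls it back. This is a sketch, not an established recipe: `RollbackSupport`, `SqlWork` and the recording fake connection are hypothetical names, and the fake exists only so the example runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class RollbackSupport {
    /** A test body that may throw SQLException. */
    interface SqlWork {
        void run(Connection connection) throws SQLException;
    }

    /**
     * Runs the work inside a transaction that is always rolled back, then
     * restores the previous auto-commit mode.
     */
    static void runAndRollback(Connection connection, SqlWork work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try {
            work.run(connection);
        } finally {
            connection.rollback();
            connection.setAutoCommit(previousAutoCommit);
        }
    }

    /** Hypothetical fake that records method names; stands in for a real connection. */
    static Connection recordingConnection(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
                RollbackSupport.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    calls.add(method.getName());
                    // getAutoCommit is the only call used here that returns a value.
                    return method.getName().equals("getAutoCommit") ? Boolean.TRUE : null;
                });
    }

    /** Returns the JDBC calls made while running an empty piece of work. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        try {
            runAndRollback(recordingConnection(calls), connection -> { });
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }
}
```

In a real test the work would call sqlUpdateAge() with the same connection, query the row back and assert on it, all before the rollback. This only helps as long as the code under test never commits on its own.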

I would suggest starting by automating the build, either by introducing a build tool such as Maven or Gradle or, if that is not possible, by scripting the build. In any case, your goal should be to get to a point where it's easy for you to trigger a build together with tests whenever code changes.
If you are not able to produce a consistent build on every change with the guarantee that all unit tests have been run, then there's really no value in writing unit tests in the first place. That is because otherwise, your tests are going to fail eventually due to code modifications and you wouldn't notice unless all your tests are automatically run.
Once you have that, you might have some hints to how you would like to run unit or integration tests.
As you can't benefit from the testing support that many application frameworks provide, you're basically on your own when it comes to configuring a database testing setup. In that case, I don't think that an in-memory database is really the best option, because:
It's a different database technology from the one you normally use, and as the code indicates, you are not using an ORM that will take care of different SQL dialects for you. As a result, you might find yourself unable to test your code accurately because of SQL dialect differences.
You will need to do all the setup of the in-memory DB yourself, which is of course possible, but it is still a piece of code that you need to maintain and that can also fail.
The two alternatives I can think of are:
Use Docker to start your actual database technology every time you run the tests. (That's also something you have to script yourself, but it will most likely be a very simple and short command.)
Have a test database running in your test environment. Every time before you run the tests, ensure the database is reset to its original state (the easiest way to do this is to drop the existing schema and restore the original schema). In this case, you will need to ensure that you don't run multiple builds in parallel against the same test database.
These suggestions apply only if you have experience with the shell and/or have support from someone in ops. If not, setting up H2 might be easier and more straightforward.

Things would have been easy with a Spring Boot project. In your case, you have many strategies:
Configure an H2 database. You can initialize your database by creating a schema and inserting data in a setUp method with the @BeforeEach annotation.
You can use a dedicated framework like DbUnit.
You will have to initialize your dbConnection also in a setUp method in your unit test.

Shared Transaction between different OracleDB Connections

After spending several days investigating this issue, I decided to post this question because what is happening makes no apparent sense.
The Case
My computer is configured with a local Oracle Express database.
I have a Java project with several JUnit tests that extend a parent class (I know that it is not a "best practice") which opens an OJDBC connection (using a static Hikari connection pool of 10 connections) in the @Before method and rolls it back in the @After method.
public class BaseLocalRollbackableConnectorTest {
    private static Logger logger = LoggerFactory.getLogger(BaseLocalRollbackableConnectorTest.class);
    protected Connection connection;
    @Before
    public void setup() throws SQLException{
        logger.debug("Getting connection and setting autocommit to FALSE");
        connection = StaticConnectionPool.getPooledConnection();
    }
    @After
    public void teardown() throws SQLException{
        logger.debug("Rollback connection");
        connection.rollback();
        logger.debug("Close connection");
        connection.close();
    }
}
StaticConnectionPool:
public class StaticConnectionPool {
private static HikariDataSource ds;
private static final Logger log = LoggerFactory.getLogger(StaticConnectionPool.class);
public static Connection getPooledConnection() throws SQLException {
if (ds == null) {
log.debug("Initializing ConnectionPool");
HikariConfig config = new HikariConfig();
config.setMaximumPoolSize(10);
config.setDataSourceClassName("oracle.jdbc.pool.OracleDataSource");
config.addDataSourceProperty("url", "jdbc:oracle:thin:@localhost:1521:XE");
config.addDataSourceProperty("user", "MyUser");
config.addDataSourceProperty("password", "MyPsw");
config.setAutoCommit(false);
ds = new HikariDataSource(config);
}
return ds.getConnection();
}
}
This project has hundreds of tests (not run in parallel) that use this connection (on localhost) to execute queries (insert/update and select) using Sql2o, but the transaction and the closing of the connection are managed only externally (by the test class above).
The database is completely empty so that the tests are ACID.
So the expected behaviour is to insert something into the DB, make the assertions and then roll back. This way the second test will not find any data added by the previous test, which keeps the tests isolated.
The Problem
When running all tests together (sequentially), they work properly 90% of the time. The other 10% of the time one or two tests fail at random because there is dirty data in the database (a duplicated unique value, for example) left by previous tests. Looking at the logs, the rollbacks of the previous tests were done properly. In fact, if I check the database, it is empty.
If I execute these tests on a server with higher performance but the same JDK and the same Oracle DB XE, the failure rate increases to 50%.
This is very strange and I have no explanation, because the connections differ between tests and rollback is called each time. The JDBC isolation level is READ COMMITTED, so even if we used the same connection, this should not create any problem.
So my question is:
Why does it happen? Do you have any idea? Is the JDBC rollback synchronous, as I believe, or could there be cases where execution continues even though the rollback has not fully completed?
These are my main DB params:
processes 100
sessions 172
transactions 189

I ran into the same problem 2-3 years ago (and spent a lot of time getting it straight). The problem is that @Before and @After are not always really sequential. You could check this by starting the process in debug mode and placing some breakpoints in the annotated methods.
Edit: I was not clear enough, as Tonio pointed out. The order of @Before and @After is guaranteed in terms of running before and after the test. The problem in my case was that sometimes the @Before and the @After were interleaved.
Expected:
@Before -> test1() -> @After -> @Before -> test2() -> @After
But sometimes I experienced the following order:
@Before -> test1() -> @Before -> @After -> test2() -> @After
I am not sure whether it is a bug or not. At the time I dug deep into it and it seemed like some kind of (processor?) scheduling-related magic.
The solution to that problem was in our case to run the tests on a single thread and call manually the init and cleanup processes... Something like this:
public class BaseLocalRollbackableConnectorTest {
    private static Logger logger = LoggerFactory.getLogger(BaseLocalRollbackableConnectorTest.class);
    protected Connection connection;
    public void setup() throws SQLException{
        logger.debug("Getting connection and setting autocommit to FALSE");
        connection = StaticConnectionPool.getPooledConnection();
    }
    public void teardown() throws SQLException{
        logger.debug("Rollback connection");
        connection.rollback();
        logger.debug("Close connection");
        connection.close();
    }
    @Test
    public void test() throws Exception{
        setup();
        try{
            //test
        }finally{
            // finally also covers failed assertions, which are Errors and
            // would slip past a catch(Exception) block
            teardown();
        }
    }
}
I have not tested it, but a much more elegant solution could be to synchronize the @Before and @After methods on the same object. Please update me if you have the chance to give it a try. :)
I hope it will solve your problem too.

If your problem just needs to be "solved" (e.g. not "best practice") regardless of performance to just make the tests complete in order, try to set:
config.setMaximumPoolSize(1);
You might need to set a higher timeout, since the tests in the queue will wait for their turn and might time out. I usually don't suggest solutions like this, but your setup is suboptimal and will lead to race conditions and data loss. However, good luck with the tests.

Try configuring auditing on all statements in Oracle, then look for sessions that are alive at the same time. I think the problem is in the tests. JDBC rollback is synchronous. Commit can be configured as COMMIT NOWAIT, but I don't think you do that specifically in your tests.
Also pay attention to parallel DML. On one table in the same transaction, you can't do parallel DML plus any other DML without a commit, because you get ORA-12838.
Do you have autonomous transactions? Business logic in tests can manually roll them back, and during tests an autonomous transaction is like another session and it doesn't see any commits from the parent session.

Not sure if this will fix it, but you could try:
public class BaseLocalRollbackableConnectorTest {
    private static Logger logger = LoggerFactory.getLogger(BaseLocalRollbackableConnectorTest.class);
    protected Connection connection;
    private Savepoint savepoint;
    @Before
    public void setup() throws SQLException{
        logger.debug("Getting connection and setting autocommit to FALSE");
        connection = StaticConnectionPool.getPooledConnection();
        savepoint = connection.setSavepoint();
    }
    @After
    public void teardown() throws SQLException{
        logger.debug("Rollback connection");
        connection.rollback(savepoint);
        logger.debug("Close connection");
        connection.close();
        while (!connection.isClosed()) {
            try { Thread.sleep(500); } catch (InterruptedException ie) {}
        }
    }
}
Really there are two 'fixes' there - loop after the close to be sure the connection IS closed before returning to the pool. Second, create a savepoint before the test and restore it afterwards.

As all the other answers have pointed out, it's hard to say what goes wrong from the information provided. Furthermore, even if you manage to find the current issue by auditing, it doesn't mean that your tests are free from data errors.
But here's an alternative: because you already have a blank database schema, you can export it to a SQL file. Then before each test:
Drop the schema
Re-create the schema again
Feed the sample data (if needed)
This would save a lot of debugging time and make sure the database is in a pristine state every time you run the tests. All of this can be done in a script.
Note: Oracle Enterprise has the flashback function to support this kind of operation. Also, if you can manage to use Hibernate and the like, there are other in-memory databases (like HSQLDB) that you can use both to increase testing speed and to keep your data set coherent.
EDIT: It seems implausible, but just in case: connection.rollback() only takes effect if you don't call commit() before it.

After your answers confirmed that I had not misunderstood how rollbacks and transactions behave in unit tests, I carefully checked all queries and all possible causes, and fortunately (yes, fortunately: I'm ashamed of it, but my mind is at ease) everything works as expected (transactions, @Before, @After, etc.).
Some queries read from complex views (configured deep inside the DAO layer) to identify a single row.
One such view is based on the MAX of a TIMESTAMP in order to identify the latest occurrence of a particular event (in real life these events arrive months apart).
To prepare the database for the unit tests, each test inserts these events sequentially.
In some cases, when these inserts within the same transaction are particularly fast, several events for the same object are added in the same millisecond (the TIMESTAMP is set manually using a Joda DateTime), and the MAX of the date matches two or more rows.
This explains why the failure happened more often on faster computers/servers than on slower ones.
This view is used in several tests, and depending on the test the error is different and random (NULL value inserted as primary key, duplicate primary key, etc.).
For example, the bug is evident in the following INSERT ... SELECT query:
INSERT INTO TABLE1 (ID,COL1,COL2,COL3)
SELECT :myId, T.VAL1, T.VAL2, T.VAL3
FROM MyView v
JOIN Table2 t on t.ID = v.ID
WHERE ........
The parameter myId is added afterwards as an Sql2o parameter.
MyView is:
SELECT ID, MAX(MDATE) FROM TABLEV WHERE.... GROUP BY ...
When the view returns two or more rows because they share the same max date, the insert fails because the ID is fixed (generated by a sequence at the beginning but bound as a parameter later). This causes the PK constraint violation.
This is only one case, but this random behaviour drove me (and my colleagues) crazy...
Adding a sleep of 1 millisecond between those event inserts fixes it. We are now working on a different solution, even though this case (a user interacting twice in the same millisecond) cannot happen in the production system.
The important thing is that, as usual, no magic happens!
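The millisecond tie can be reproduced in plain Java. The sketch below is an illustration only: the `Event` class, its `sequence` field and both lookup methods are hypothetical, and the sequence tie-breaker is one possible alternative to the sleep, not the fix that was actually adopted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LatestEventDemo {
    /** Hypothetical event row: a millisecond timestamp plus an insertion sequence. */
    static final class Event {
        final long timestampMillis;
        final long sequence;

        Event(long timestampMillis, long sequence) {
            this.timestampMillis = timestampMillis;
            this.sequence = sequence;
        }
    }

    /** Roughly mirrors "rows whose timestamp equals MAX(timestamp)": ties all match. */
    static List<Event> latestByTimestampOnly(List<Event> events) {
        long max = Long.MIN_VALUE;
        for (Event e : events) {
            max = Math.max(max, e.timestampMillis);
        }
        List<Event> result = new ArrayList<>();
        for (Event e : events) {
            if (e.timestampMillis == max) {
                result.add(e);
            }
        }
        return result;
    }

    /** Breaks timestamp ties with the sequence, so exactly one row is the latest. */
    static Event latestWithTieBreaker(List<Event> events) {
        return events.stream()
                .max(Comparator.comparingLong((Event e) -> e.timestampMillis)
                        .thenComparingLong(e -> e.sequence))
                .orElseThrow(IllegalArgumentException::new);
    }

    /** Three inserts, the last two landing in the same millisecond. */
    static List<Event> sample() {
        List<Event> events = new ArrayList<>();
        events.add(new Event(1000L, 1));
        events.add(new Event(1005L, 2));
        events.add(new Event(1005L, 3));
        return events;
    }
}
```

With the sample data the timestamp-only lookup returns two rows, which is the situation that made the INSERT ... SELECT produce duplicate keys, while the tie-broken lookup returns the single event with sequence 3.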
Now you can insult me :)

One thing you can do is increase the maximum pool size and roll back the operation in the same place where you committed it, instead of doing it in the @After method.
Hope it will work.

How to disable JBPM persistence?

I'm trying to implement a few tests with JBPM 6. I'm currently working on a simple hello world bpmn2 file, which is loaded correctly.
My understanding of the documentation is that persistence should be disabled by default: "By default, if you do not configure the process engine otherwise, process instances are not made persistent."
However, when I try to implement it, and without doing anything special to enable persistence, I hit persistence related problems every time I try to do anything.
javax.persistence.PersistenceException: No Persistence provider for EntityManager named org.jbpm.persistence.jpa
at javax.persistence.Persistence.createEntityManagerFactory(Persistence.java:69)
at javax.persistence.Persistence.createEntityManagerFactory(Persistence.java:47)
at org.jbpm.runtime.manager.impl.jpa.EntityManagerFactoryManager.getOrCreate(EntityManagerFactoryManager.java:33)
at org.jbpm.runtime.manager.impl.DefaultRuntimeEnvironment.init(DefaultRuntimeEnvironment.java:73)
at org.jbpm.runtime.manager.impl.RuntimeEnvironmentBuilder.get(RuntimeEnvironmentBuilder.java:400)
at org.jbpm.runtime.manager.impl.RuntimeEnvironmentBuilder.get(RuntimeEnvironmentBuilder.java:74)
I create my runtime environment the following way:
RuntimeEnvironment environment = RuntimeEnvironmentBuilder.Factory.get()
.newDefaultInMemoryBuilder()
.persistence(false)
.addAsset(ResourceFactory.newClassPathResource("examples/helloworld.bpmn2.xml"), ResourceType.BPMN2)
.addAsset(ResourceFactory.newClassPathResource("examples/newBPMNProcess.bpmn"), ResourceType.BPMN2)
.get();
As my understanding is that persistence should be disabled by default, I don't see what I'm doing wrong. It could be linked to something included in some of my dependencies, but I haven't found anything on that either.
Has anybody faced the same issue already, or does anyone have any advice?
Thanks

A RuntimeManager is a combination of a process engine and a human task service. The human task service needs persistence (to start the human tasks etc.), that's why it's still asking for a datasource, even if you configure the engine to not use persistence.
If you want to use an engine without our human task service, you don't need persistence at all, but I wouldn't use a RuntimeManager in that case, simply create a ksession from the kbase directly:
http://docs.jboss.org/jbpm/v6.1/userguide/jBPMCoreEngine.html#d0e1805

The InMemoryBuilder which you use in your code is supposed to (as per API documentation) not be persistent, but it is actually adding a persistence manager to the environment, just with an InMemoryMapper instead of a JPAMapper because of the way the init() method in DefaultRuntimeEnvironment is implemented:
public void init() {
if (emf == null && getEnvironmentTemplate().get(EnvironmentName.CMD_SCOPED_ENTITY_MANAGER) == null) {
emf = EntityManagerFactoryManager.get().getOrCreate("org.jbpm.persistence.jpa");
}
addToEnvironment(EnvironmentName.ENTITY_MANAGER_FACTORY, emf);
if (this.mapper == null) {
if (this.usePersistence) {
this.mapper = new JPAMapper(emf);
} else {
this.mapper = new InMemoryMapper();
}
}
}
As you can see above, this still tries to getOrCreate() a persistence unit (I have seen a better implementation which also checks for the value of persistence attribute somewhere, but the issue here is, DefaultRuntimeEnvironment doesn't do that).
What you need to start with to get away without persistence is a newEmptyBuilder():
RuntimeEnvironment env = RuntimeEnvironmentBuilder.Factory.get()
.newEmptyBuilder()
.knowledgeBase(KieServices.Factory.get().getKieClasspathContainer().getKieBase("my-knowledge-base"))
// ONLY REQUIRED FOR PER-REQUEST AND PER-INSTANCE STRATEGY
//.addEnvironmentEntry("IS_JTA_TRANSACTION", false)
.persistence(false)
.get();
Do mind though that this will only work for Singleton runtime managers - PerProcessInstance and PerRequest expect to be able to suspend a running transaction if necessary, which is only possible if you have an entity manager to be able to persist state.
For testing with those two strategies also use addEnvironmentEntry() above.

Best practice of managing database connections in a web app

I am developing a MongoDB app with Java, but I think this question relates to datastore connections for web apps in general.
I like to structure all web apps with four top-level packages which are called (which I think will be self explanatory):
Controller
Model
Dao
Util
Ideally I would like to have a class in the Dao package that handles all the connections details.
So far I have created a class that looks like this:
public class Dao {
public static Mongo mongo;
public static DB database;
public static DB getDB() throws UnknownHostException, MongoException{
mongo = new Mongo("localhost");
database = mongo.getDB("mydb");
return database;
}
public static void closeMongo(){
mongo.close();
}
}
I use it in my code with something like this
public static void someMethod(String someData){
    try {
        DB db = Dao.getDB();
        DBCollection rColl = db.getCollection("mycollection");
        // perform some database operations
        Dao.closeMongo();
    } catch (UnknownHostException e) {
        e.printStackTrace();
    } catch (MongoException e) {
        e.printStackTrace();
    }
}
This seems to work fine, but I'd be curious to know what people think is the "best" way to handle this issue, if there is such a thing.

The rule of thumb when connecting to a relational database server is to use a pool. For example, if you connect to an Oracle database, using a pool gives you performance benefits both in connection setup time and in SQL parsing time (if you are using bind variables). Other relational databases may vary, but in my opinion a pool is a good pattern for other reasons too (e.g. you may want to limit the maximum number of connections for your DB user). You are using MongoDB, so the first thing to check is how MongoDB handles connections, how expensive creating a connection is, etc. I suggest using or building a class that implements pool logic, because it gives you flexibility you may need in the future. Looking at your code, it seems that your API
DB db=Dao.getDB();
should be paired with:
Dao.closeDB(DB db);
That way you can either really close the connection or reuse it without affecting the Dao code. With these two methods you can switch the way you manage connections without recoding the Dao objects.
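A minimal sketch of that acquire/release pairing, assuming a fixed-size pool: `SimplePool`, `acquire()`, `release()` and `idleCount()` are illustrative names rather than part of any driver, and a real pool would also need validation, timeouts and shutdown handling.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Fixed-size pool: callers pair every acquire() with a release(). */
final class SimplePool<T> {
    private final BlockingQueue<T> idle;

    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Hands out an idle resource, or fails fast if all are in use. */
    T acquire() {
        T resource = idle.poll();
        if (resource == null) {
            throw new IllegalStateException("Pool exhausted");
        }
        return resource;
    }

    /** Returns the resource to the pool instead of really closing it. */
    void release(T resource) {
        idle.offer(resource);
    }

    /** Number of resources currently available. */
    int idleCount() {
        return idle.size();
    }
}
```

Because callers only ever see `acquire()` and `release()`, the pool could later be swapped for a driver-provided one without touching the calling code, which is the flexibility the paragraph above argues for.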

I would suggest you can write a java class to establish the connection with the database.
The arguments to the method should be the database name, password, host port and other necessary credentials.
You can always call the parametrized constructor everywhere where there is a need to establish database connectivity. This can be a model.

I got a 'nice' solution from this article. http://www.lennartkoopmann.net/post/722935345
Edit: Since that link is dead, here's one from waybackmachine.org:
http://web.archive.org/web/20120810083748/http://www.lennartkoopmann.net/post/722935345
Main Idea
What I found interesting was the use of a static synchronised method that returns an instance of the static class and its variables. Most professional devs probably find this obvious. I found this to be a useful pattern for managing the db connections.
Pooling
Mongo does automatic connection pooling so the key is to use just one connection to the datastore and let it handle its own pooling.
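The pattern described above, one shared client behind a static synchronized accessor, might look like the following. This is a sketch of the general idea and not the code from the linked article; `DatastoreHolder` and the nested `Client` class are hypothetical stand-ins for the driver's client object.

```java
final class DatastoreHolder {
    /** Hypothetical stand-in for the driver's client object (e.g. Mongo). */
    static final class Client {
        final String host;

        Client(String host) {
            this.host = host;
        }
    }

    private static Client client;

    private DatastoreHolder() {
        // no instances; access goes through getClient()
    }

    /** Creates the client on first use; synchronized so only one is ever created. */
    static synchronized Client getClient() {
        if (client == null) {
            client = new Client("localhost");
        }
        return client;
    }
}
```

Every caller gets the same instance, so the driver's own connection pooling is shared across the whole application instead of being recreated on each request.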

I think it is better to call a method inside the DAO to get data from the database as well. Suppose your database changes: if you run database queries directly, you then have to edit many classes. So it is better to keep the database-calling methods inside the DAO class itself and call those methods to get data.

Database cleanup after Junit tests

I have to test some Thrift services using Junit. When I run my tests as a Thrift client, the services modify the server database. I am unable to find a good solution which can clean up the database after each test is run.
Cleanup is important especially because the IDs need to be unique, and they are currently read from an XML file. At the moment I have to manually change the IDs after running tests so that the next set of tests can run without causing a primary key violation in the database. If I can clean up the database after each test run, the problem is completely resolved; otherwise I will have to think about other solutions, like generating random IDs and using them wherever IDs are required.
Edit: I would like to emphasize that I am testing a service, which is writing to database, I don't have direct access to the database. But since, the service is ours, I can modify the service to provide any cleanup method if required.

If you are using Spring, all you need is the @DirtiesContext annotation on your test class.
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("/test-context.xml")
@DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD)
public class MyServiceTest {
    ....
}

Unless you are testing specific database actions (verifying that you can query or update the database, for example), your JUnit tests shouldn't be writing to a real database. Instead you should mock the database classes. This way you don't actually have to connect to and modify the database, and therefore no cleanup is needed.
You can mock your classes in a couple of different ways. You can use a library such as JMock, which will do all the execution and validation work for you. My personal favorite way to do this is with dependency injection. This way I can create mock classes that implement my repository interfaces (you are using interfaces for your data access layer, right? ;-)) and I implement only the needed methods with known actions/return values.
//Imports needed by the snippets below.
import java.util.ArrayList;
import java.util.List;

//Example repository interface.
public interface StudentRepository
{
    public List<Student> getAllStudents();
}
//Example mock database class.
public class MockStudentRepository implements StudentRepository
{
    //This method creates fake but known data.
    public List<Student> getAllStudents()
    {
        List<Student> studentList = new ArrayList<Student>();
        studentList.add(new Student(...));
        studentList.add(new Student(...));
        studentList.add(new Student(...));
        return studentList;
    }
}
//Example method to test.
public int computeAverageAge(StudentRepository aRepository)
{
    List<Student> students = aRepository.getAllStudents();
    int totalAge = 0;
    for(Student student : students)
    {
        totalAge += student.getAge();
    }
    //Integer division: the average is truncated to a whole number.
    return totalAge/students.size();
}
//Example test method.
public void testComputeAverageAge()
{
    int expectedAverage = 25; //What the expected answer of your result set is
    int actualAverage = computeAverageAge(new MockStudentRepository());
    assertEquals(expectedAverage, actualAverage);
}

How about using something like DbUnit?

Spring's unit testing framework has extensive capabilities for dealing with JDBC. The general approach is that each unit test runs in a transaction, and the transaction is rolled back (outside of your test code) once the test is complete.
This has the advantage of letting you use your database and its schema without making any lasting changes to the data. Of course, if you actually perform a commit inside your test, then all bets are off!
For more reading, look at Spring's documentation on integration testing with JDBC.
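To make the rollback idea concrete, here is a rough sketch in plain JDBC rather than Spring. It is not how Spring implements it; SqlWork and RollbackRunner are hypothetical names made up for this example. The test body runs with auto-commit switched off, and the transaction is rolled back in a finally block whether the test passes or fails.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical callback type: the body of one test, given an open connection.
interface SqlWork {
    void run(Connection connection) throws SQLException;
}

final class RollbackRunner {
    private RollbackRunner() {
    }

    // Runs the work inside a transaction and always rolls it back afterwards.
    static void runAndRollback(Connection connection, SqlWork work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // start an explicit transaction
        try {
            work.run(connection);
        } finally {
            connection.rollback(); // discard everything the work wrote
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

This only helps when the test and the code under test share the same connection. If the work commits on its own, or the writes happen in another process, there is nothing left to roll back.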

When writing JUnit tests, you can override two specific methods: setUp() and tearDown(). setUp() runs before each test method, so you can prepare everything that's necessary to test your code there and don't have to set things up in each specific test case. tearDown() is called after each test method finishes.
If possible, you could set it up so that you open your database in the setUp() method and then clear everything left over from the test and close it in the tearDown() method. This is how we have done all testing when we have a database.
Here's an example:
@Override
protected void setUp() throws Exception {
    super.setUp();
    db = new WolfToursDbAdapter(mContext);
    db.open();
    //Set up other required state and data
}
@Override
protected void tearDown() throws Exception {
    super.tearDown();
    db.dropTables();
    db.close();
    db = null;
}
//Methods to run all the tests

Assuming you have access to the database, another option is to create a backup of the database just before the tests and restore from that backup after the tests. This can be automated.
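As a rough sketch of automating this, assume the database lives in a single file (as with SQLite or another file-based database) and that no connection is open while the file is copied. DatabaseFileBackup is a hypothetical helper name; for a server database you would use the vendor's dump and restore tools instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: snapshots a single-file database and restores it later.
final class DatabaseFileBackup {
    private final Path databaseFile;
    private final Path backupFile;

    DatabaseFileBackup(Path databaseFile, Path backupFile) {
        this.databaseFile = databaseFile;
        this.backupFile = backupFile;
    }

    // Call before the tests: copy the current database file to the backup location.
    void backup() throws IOException {
        Files.copy(databaseFile, backupFile, StandardCopyOption.REPLACE_EXISTING);
    }

    // Call after the tests: overwrite the database file with the backup.
    void restore() throws IOException {
        Files.copy(backupFile, databaseFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

You could call backup() once before the test run and restore() once after it, for example from a build script or from class-level setup and cleanup hooks.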

If you are using Spring with JUnit 4.x, your tests don't have to leave anything in the database.
Look at the AbstractTransactionalJUnit4SpringContextTests class, which runs each test method in a transaction that is rolled back by default.
Also check out the Spring documentation for JUnit support.

It's a bit draconian, but I usually aim to wipe out the database (or just the tables I'm interested in) before every test method execution. This doesn't tend to work as I move into more integration-type tests of course.
In cases where I have no control over the database, say I want to verify that the correct number of rows were created after a given call, the test counts the number of rows before and after the tested call and makes sure the difference is correct. In other words, take the existing data into account, then see how the tested code changed things, without assuming anything about the existing data. It can be a bit of work to set up, but it lets me test against a more "live" system.
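A small sketch of that before/after check follows. RowDelta is a hypothetical helper name, and the row count is passed in as a callback because how you count rows (a SELECT COUNT(*) query, a service call) depends on your setup.

```java
import java.util.function.IntSupplier;

// Hypothetical helper: measures how an action changes a row count.
final class RowDelta {
    private RowDelta() {
    }

    // Returns how many rows the action added (negative if it removed rows).
    static int measure(IntSupplier rowCount, Runnable action) {
        int before = rowCount.getAsInt();
        action.run();
        return rowCount.getAsInt() - before;
    }
}
```

A test would then assert on the returned difference, for example that a single call created exactly one row, regardless of how many rows were already there.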
In your case, are the specific IDs important? Could you generate the IDs on the fly, perhaps randomly, verify they're not already in use, then proceed?
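Something along these lines might do, as a sketch only. FreshIds is a hypothetical name, and the inUse check is passed in as a callback because in your case it would have to ask the service or the database whether an ID is already taken.

```java
import java.util.Random;
import java.util.function.IntPredicate;

// Hypothetical helper: picks random IDs that are not already taken.
final class FreshIds {
    private FreshIds() {
    }

    // Draws random non-negative IDs until one is found that is not in use.
    static int next(Random random, IntPredicate inUse) {
        int candidate;
        do {
            candidate = random.nextInt(Integer.MAX_VALUE);
        } while (inUse.test(candidate));
        return candidate;
    }
}
```

If the IDs can be strings, java.util.UUID.randomUUID() avoids the lookup for all practical purposes.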

I agree with Brainimus if you're trying to test against data you have pulled from a database. If you're looking to test modifications made to the database, another solution would be to mock the database itself. There are multiple implementations of in-memory databases that you can use to create a temporary database (for instance during JUnit's setUp()) and then remove the entire database from memory (during tearDown()). As long as you're not using any vendor-specific SQL, this is a good way to test modifying a database without touching your real production one.
Some good Java databases that offer in-memory support are Apache Derby, Java DB (which is really just Oracle's flavor of Apache Derby), HyperSQL (better known as HSQLDB) and H2 Database Engine. I have personally used HSQLDB to create in-memory mock databases for testing and it worked great, but I'm sure the others would offer similar results.
