I am developing a MongoDB app in Java, but I think this question relates to datastore connections for web apps in general.
I like to structure all web apps with four top-level packages, whose names I think are self-explanatory:
Controller
Model
Dao
Util
Ideally I would like to have a class in the Dao package that handles all the connection details.
So far I have created a class that looks like this:
public class Dao {
public static Mongo mongo;
public static DB database;
public static DB getDB() throws UnknownHostException, MongoException{
mongo = new Mongo("localhost");
database = mongo.getDB("mydb");
return database;
}
public static void closeMongo(){
mongo.close();
}
}
I use it in my code with something like this:
public static void someMethod(String someData){
try {
DB db = Dao.getDB();
DBCollection rColl = db.getCollection("mycollection");
// perform some database operations
Dao.closeMongo();
} catch (UnknownHostException e) {
e.printStackTrace();
} catch (MongoException e) {
e.printStackTrace();
}
}
This seems to work fine, but I'd be curious to know what people think is the "best" way to handle this issue, if there is such a thing.
The rule of thumb when connecting to a relational database server is to use a connection pool. For example, when you connect to an Oracle database, a pool gives you performance benefits both in connection setup time and in SQL parsing time (if you are using bind variables). Other relational databases may vary, but in my opinion a pool is a good pattern for other reasons too (e.g. you may want to limit the maximum number of connections for your DB user). You are using MongoDB, so the first thing to check is how MongoDB handles connections, how expensive creating a connection is, etc. I suggest using or building a class that implements pooling logic, because it gives you the flexibility you may need in the future. Looking at your code, it seems that your API call
DB db=Dao.getDB();
should be paired with:
Dao.closeDB(DB db);
That way you have the chance either to really close the connection or to reuse it, without affecting the code that uses the Dao. With these two methods you can switch the way you manage connections without recoding the Dao objects.
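A minimal sketch of that acquire/release pairing, independent of the MongoDB driver. The names `SimplePool`, `acquire()`, `release()` and `createdCount()` are hypothetical, and the pooled type is generic so the idea stands on its own. It fails fast when the pool is exhausted instead of blocking, which a real pool would probably not do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal pool: callers pair acquire() with release(),
// so the pooling policy can change without touching the calling code.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxSize;
    private int created = 0;

    SimplePool(Supplier<T> factory, int maxSize) {
        this.factory = factory;
        this.maxSize = maxSize;
    }

    // Hand out an idle resource if one exists; otherwise create a new one,
    // up to maxSize. Fails fast instead of blocking, to keep the sketch short.
    synchronized T acquire() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (created >= maxSize) {
            throw new IllegalStateException("pool exhausted");
        }
        created++;
        return factory.get();
    }

    // Return the resource for reuse instead of really closing it.
    synchronized void release(T resource) {
        idle.push(resource);
    }

    // Number of resources the pool has created so far.
    synchronized int createdCount() {
        return created;
    }
}
```

Because callers only ever see `acquire()` and `release()`, swapping this for "open a fresh connection every time" or for a library pool would not change the calling code.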
I would suggest writing a Java class to establish the connection with the database.
The arguments to its constructor should be the database name, password, host, port and other necessary credentials.
You can then call the parameterized constructor wherever you need to establish database connectivity. This class can live in the model.
I got a 'nice' solution from this article. http://www.lennartkoopmann.net/post/722935345
Edit: Since that link is dead, here's one from the Wayback Machine:
http://web.archive.org/web/20120810083748/http://www.lennartkoopmann.net/post/722935345
Main Idea
What I found interesting was the use of a static synchronised method that returns an instance of the static class and its variables. Most professional devs probably find this obvious. I found this to be a useful pattern for managing the db connections.
Pooling
Mongo does automatic connection pooling so the key is to use just one connection to the datastore and let it handle its own pooling.
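A minimal sketch of the "static synchronized method returning one shared instance" idea described above. `Client` is a hypothetical stand-in for the driver's client class (so the sketch compiles without the MongoDB driver), and `ClientHolder` is an assumed name, not something taken from the linked article.

```java
// Stand-in for the driver's client class (e.g. Mongo); hypothetical,
// used here only so the sketch compiles without the MongoDB driver.
final class Client {
}

// Holds one shared client for the whole application.
final class ClientHolder {
    private static Client instance;

    private ClientHolder() {
        // no instances; access goes through getInstance()
    }

    // synchronized so that two threads racing on first use
    // cannot each create their own client.
    static synchronized Client getInstance() {
        if (instance == null) {
            instance = new Client();
        }
        return instance;
    }
}
```

Every caller then asks `ClientHolder.getInstance()` for the client and never closes it per request; the client's own pooling is left to do its job.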
I think it is also better to call a method inside the DAO to get data from the database. Suppose your database changes: if many classes run DB queries directly, you have to edit all of them. If you keep the DB-calling methods inside the DAO class and call those methods to get data, only the DAO has to change.
I'm creating a simple DBHelper for my PostgreSQL DB using a JDBC driver.
I'm wondering what the best practices are.
For example, should methods like initConnection(), closeConnection() or any others be static? Like:
void foo() {
DBHelper.initConnection();
// do some logic, maybe:
// Data someData = DBHelper.getSomeData();
DBHelper.closeConnection();
}
Or is it better to create a DBHelper object and call the methods on that object? Like:
void foo2() {
DBHelper dbhelper = new DBHelper();
dbhelper.initConnection();
// do some logic, maybe:
// Data someData = dbhelper.getSomeData();
dbhelper.closeConnection();
}
Does it matter at all?
Do I always need to check whether the connection is open before I try to retrieve some data? What if it is closed? And should I always close it in a finally block?
EDIT:
In answer to @Kayaman's comment:
So should my foo method look like this?
void foo3() {
Connection conn = DBHelper.getConnection();
// do some logic, maybe:
// Statement statement = conn.createStatement();
// some stmt work
conn.close(); // do I need to check whether stmt is closed first?
}
That would make my DBHelper class useful only for getting a connection. Would there be no logic inside (like GetInterestingRecords() or GetRecordsWithId(30))?
Have you thought about defining the connection properties in the server config file (if it is a web app) and have the session opened for the whole application lifecycle?
Before implementing DBHelper, you should check whether an existing Java library already satisfies your needs. If you take a look at this, some libraries are listed that seem to fit your problem.
If you decide to go on with your own custom implementation, I suggest making DBHelper a normal class with no static methods for managing the connections; the main reason is that with static methods you cannot manage multiple DB connections (i.e. connections to different databases) at the same time. If you are using Java 7 or later in your own library, you could also implement the AutoCloseable interface in order to better manage the resource your library is managing.
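A minimal sketch of such a non-static, AutoCloseable helper. The class name `DbHelper` and its `connection()` accessor are hypothetical; the sketch assumes the caller already has a `javax.sql.DataSource` (from a pool or from the server configuration).

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical non-static helper: one instance per database, so several
// databases can be used side by side.
final class DbHelper implements AutoCloseable {
    private final Connection connection;

    DbHelper(DataSource dataSource) throws SQLException {
        this.connection = dataSource.getConnection();
    }

    Connection connection() {
        return connection;
    }

    // Called automatically at the end of a try-with-resources block,
    // including when the body throws.
    @Override
    public void close() throws SQLException {
        connection.close();
    }
}

// Usage (dataSource is assumed to exist):
//   try (DbHelper helper = new DbHelper(dataSource)) {
//       // work with helper.connection()
//   }
```

With try-with-resources there is no need for an explicit finally block, and no need to check whether the connection is open before closing it.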
I have read about DAO from here and I find it really interesting, but a few things are still unclear to me.
I would like to use the Interface to implement for two different data sources - one is a Socket connection, the other a Database connection.
For this I do the following:
public class databasePartDAOImplementation extends Database implements PartDAO {
//implementation
}
and the Database class has a constructor and some methods for managing this connection. How can I set up these classes so that I can instantiate one Database connection and then use multiple DAOs that all share this one connection?
I'm thinking of creating an instance of Database() and casting it to each of the DAOs when needed, but I am not sure of any downsides to this.
The Database class I use looks like this
public class Database
{
protected Connection connection;
public Database() throws ClassNotFoundException
{
Class.forName("org.sqlite.JDBC");
connection = null;
try
{
connection = DriverManager.getConnection("jdbc:sqlite:database.s3db");
}
catch(SQLException e)
{
System.err.println(e.getMessage());
}
}
public Connection getConnection() {
return connection;
}
}
tl;dr Would this be a good way of accessing the DB?
Database db = new Database();
databasePartDAOImplementation dao = (databasePartDAOImplementation) db;
dao.getAllRecords();
You are partially correct that the DAO should manage the connection with its data source to obtain and store data. But, as you have asked, how can we have all DAOs share the same connection? Let me shed some light by considering the following strategies, and then things will fall into place:
Automatically generating DAO code: Normally, there exists a relationship between your business objects, their DAOs and their underlying DB tables. Automatic code generation can take place using that relationship. In complex cases, you may use third party tools for code generation. May not be what you are looking for, but I just wanted to put it out there.
DAO Factory:
In a scenario where you are not dealing with different data sources (apparently this is not your scenario), you would create a single DAO factory class and use the Factory Method pattern to create the different DAOs used by your application. The DAO factory would be the class to manage the connection to the data source in this case.
In the scenario where you are dealing with different data sources (this is most likely your scenario), you would create an abstract DAO factory (this is the Abstract Factory pattern). Then, again, using the Factory Method pattern, your abstract DAO factory creates the concrete DAO factories for each of your specific data sources. Each concrete DAO factory would be responsible for two main tasks:
Creating a DAO for each kind of data access
Implementing a static method (e.g. createConnection) that takes care of creating a connection with its specific data source. You should consider connection pooling implementation and usage for that matter.
All the DAOs that are created by a concrete DAO factory would then be able to call that static method (createConnection) to obtain a connection to the specific data source. This connection would essentially be the same connection across all DAOs of the concrete DAO factory.
Ideas presented in this answer are based on the detailed article Core J2EE Patterns - Data Access Object.
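A minimal sketch of one concrete DAO factory that shares a single connection across its DAOs. Note that the cast in the question would fail at runtime with a ClassCastException, because an object created with new Database() is not an instance of the subclass. This sketch hands the connection to each DAO through its constructor instead of the static createConnection method described above; all type names here (`PartDao`, `OrderDao`, `JdbcPartDao`, `JdbcOrderDao`, `JdbcDaoFactory`) are hypothetical.

```java
import java.sql.Connection;
import java.util.Objects;

// Hypothetical DAO interfaces; data-access methods such as
// getAllRecords() would be declared here.
interface PartDao {
}

interface OrderDao {
}

// Each DAO receives the connection; it does not create or own it.
final class JdbcPartDao implements PartDao {
    final Connection connection;

    JdbcPartDao(Connection connection) {
        this.connection = Objects.requireNonNull(connection);
    }
}

final class JdbcOrderDao implements OrderDao {
    final Connection connection;

    JdbcOrderDao(Connection connection) {
        this.connection = Objects.requireNonNull(connection);
    }
}

// The concrete factory for one data source: it owns the single connection
// and gives the same one to every DAO it creates.
final class JdbcDaoFactory {
    private final Connection connection;

    JdbcDaoFactory(Connection connection) {
        this.connection = Objects.requireNonNull(connection);
    }

    PartDao partDao() {
        return new JdbcPartDao(connection);
    }

    OrderDao orderDao() {
        return new JdbcOrderDao(connection);
    }
}
```

A second factory with the same methods could build socket-backed DAOs, which is where the abstract factory would come in.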
In my desktop application new databases get opened quite often. I use Hibernate/JPA as an ORM.
The problem is, creating the EntityManagerFactory is quite slow, taking about 5-6 Seconds on a fast machine. I know that the EntityManagerFactory is supposed to be heavyweight but this is just too slow for a desktop application where the user expects the new database to be opened quickly.
Can I turn off some EntityManagerFactory features to get an instance
faster? Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?
Can I somehow create the EntityManagerFactory object before
knowing the database url? I would be happy to turn off all
validation for this to be possible.
By doing so, can I pool EntityManagerFactorys for later use?
Any other idea how to create the EntityManagerFactory faster?
Update with more Information and JProfiler profiling
The desktop application can open saved files. Our application document file format consists of one SQLite database and some binary data in a ZIP file. When opening a document, the ZIP gets extracted and the DB is opened with Hibernate. The databases all have the same schema, but different data obviously.
It seems that the first time I open a file it takes significantly longer than the following times.
I profiled the first and second run with JProfiler and compared the results.
1st Run:
create EMF: 4385ms
build EMF: 3090ms
EJB3Configuration configure: 900ms
EJB3Configuration <clinit>: 380ms
2nd Run:
create EMF: 1275ms
build EMF: 970ms
EJB3Configuration configure: 305ms
EJB3Configuration <clinit>: not visible, probably 0ms
In the Call tree comparison you can see that some methods are significantly faster (DatabaseManager. as starting point):
create EMF: -3120ms
Hibernate create EMF: -3110ms
EJB3Configuration configure: -595ms
EJB3Configuration <clinit>: -380ms
build EMF: -2120ms
buildSessionFactory: -1945ms
secondPassCompile: -425ms
buildSettings: -346ms
SessionFactoryImpl.<init>: -1040ms
The Hot spot comparison now has the interesting results:
ClassLoader.loadClass: -1686ms
XMLSchemaFactory.newSchema: -184ms
ClassFile.<init>: -109ms
I am not sure if it is the loading of Hibernate classes or my Entity classes.
A first improvement would be to create an EMF as soon as the application starts just to initialize all necessary classes (I have an empty db file as a prototype already shipped with my Application). @sharakan thank you for your answer, maybe a DeferredConnectionProvider would already be a solution for this problem.
I will try the DeferredConnectionProvider next! But we might be able to speed it up even further. Do you have any more suggestions?
You should be able to do this by implementing your own ConnectionProvider as a decorator around a real ConnectionProvider.
The key observation here is that the ConnectionProvider isn't used until an EntityManager is created (see the comment in supportsAggressiveRelease() for a caveat to that). So you can create a DeferredConnectionProvider class and use it to construct the EntityManagerFactory, but then wait for user input and do the deferred initialization before actually creating any EntityManager instances. I've written this as a wrapper around ConnectionProviderImpl, but you should be able to use any other implementation of ConnectionProvider as the base.
public class DeferredConnectionProvider implements ConnectionProvider {
private Properties configuredProps;
private ConnectionProviderImpl realConnectionProvider;
@Override
public void configure(Properties props) throws HibernateException {
configuredProps = props;
}
public void finalConfiguration(String jdbcUrl, String userName, String password) {
configuredProps.setProperty(Environment.URL, jdbcUrl);
configuredProps.setProperty(Environment.USER, userName);
configuredProps.setProperty(Environment.PASS, password);
realConnectionProvider = new ConnectionProviderImpl();
realConnectionProvider.configure(configuredProps);
}
private void assertConfigured() {
if (realConnectionProvider == null) {
throw new IllegalStateException("Not configured yet!");
}
}
@Override
public Connection getConnection() throws SQLException {
assertConfigured();
return realConnectionProvider.getConnection();
}
@Override
public void closeConnection(Connection conn) throws SQLException {
assertConfigured();
realConnectionProvider.closeConnection(conn);
}
@Override
public void close() throws HibernateException {
assertConfigured();
realConnectionProvider.close();
}
@Override
public boolean supportsAggressiveRelease() {
// This gets called during EntityManagerFactory construction, but it's
// just a flag so you should be able to either do this, or return
// true/false depending on the actual provider.
return new ConnectionProviderImpl().supportsAggressiveRelease();
}
}
A rough example of how to use it:
// Get an EntityManagerFactory with the following property set:
// properties.put(Environment.CONNECTION_PROVIDER, DeferredConnectionProvider.class.getName());
HibernateEntityManagerFactory factory = (HibernateEntityManagerFactory) entityManagerFactory;
// ...do user input of connection info...
SessionFactoryImpl sessionFactory = (SessionFactoryImpl) factory.getSessionFactory();
DeferredConnectionProvider connectionProvider = (DeferredConnectionProvider) sessionFactory.getSettings()
.getConnectionProvider();
connectionProvider.finalConfiguration(jdbcUrl, userName, password);
You could put the initial set up of the EntityManagerFactory on a separate thread or something, so that the user never has to wait for it. Then the only thing they'll wait for, after specifying the connection info, is the setting up of the connection pool, which should be fairly quick compared to parsing the object model.
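A minimal sketch of moving the slow setup onto a background thread, as suggested above. `EagerResource` is a hypothetical helper name, and it is generic so the sketch runs without Hibernate on the classpath; the expensive build is whatever the caller passes in.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical helper: starts an expensive build on a background thread
// as soon as it is constructed and hands back the result when needed.
final class EagerResource<T> {
    private final CompletableFuture<T> future;

    EagerResource(Supplier<T> expensiveBuild) {
        // supplyAsync runs the supplier on the common ForkJoinPool by default.
        this.future = CompletableFuture.supplyAsync(expensiveBuild);
    }

    // Blocks only if the build has not finished yet. A failed build is
    // rethrown here wrapped in a CompletionException.
    T get() {
        return future.join();
    }
}
```

At application start one might write `new EagerResource<>(() -> Persistence.createEntityManagerFactory("my-unit"))`, where "my-unit" is a placeholder persistence unit name, and call `get()` only once the user has picked a file.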
Can I turn off some EntityManagerFactory features to get an instance faster?
Don't believe so. EMFs don't really have too many features, other than initializing a JDBC connection/pool.
Or is it possible to create some of the EntityManagerFactory lazily to speed up creation?
Rather than creating the EMF lazily, when the user will notice the performance hit, I suggest you should head in the opposite direction - create the EMF proactively before the user actually needs it. Create it once, up-front, possibly in a separate thread during application initialisation (or at least as soon as you know about your database). Reuse it throughout the existence of your application/database.
Can I somehow create the EntityManagerFactory object before knowing the database url?
No - it creates a JDBC connection.
I think a better question is: why does your application dynamically discover database connection URLs? Are you saying your databases are created/made available on the fly and there's no way to anticipate the connection parameters in advance? That really is to be avoided.
By doing so, can I pool EntityManagerFactorys for later use?
No, you can't pool EMFs. It's the connections that you can pool.
Any other idea how to create the EntityManagerFactory faster?
I agree - 6 seconds is too slow for initialisation of EMFs.
I suspect it's more to do with your selected database technology than JPA/JDBC/JVM. My guess is that maybe your database is initialising itself as you connect. Are you using Access? What DB are you using?
Are you connecting to a database remotely located? Over a WAN? Is network speed/latency good?
Are the client PCs limited in performance?
EDIT: Added after comments
Implementing your own ConnectionProvider as a decorator around a real ConnectionProvider will not speed up the user's experience at all. The database instance still needs to be initialised, the EMF & EM created and the JDBC connection still needs to be subsequently established.
Options:
1. Share a common preloaded DB instance: seems not possible for your business scenario (although JSE technology supports this and also supports client-server design).
2. Change to a DB with a faster startup: Derby (a.k.a. Java DB) was bundled with JDK 6 through 8 and has a startup time of about 1.5 seconds (cold) and 0.7 seconds (warm - data pre-loaded).
3. In many (most?) scenarios, the fastest solution would be to load data directly into in-memory Java objects using JAXB with StAX. Subsequently, use in-memory cached data (particularly using smart structures like maps, hashing and ArrayLists). Just as JPA can map POJO classes to database tables & columns, so JAXB can map POJO classes to XML schema & work with XML doc instances. If you have very complex queries using SQL set-based logic with multiple joins and strong use of DB indexes, this would be less desirable.
(2) would probably give the best improvement for limited effort.
Additionally:
- try to unzip the data files during deployment rather than during app usage.
- initialize the EMF in a startup thread that runs in parallel to the UI startup.
- try to start the DB initializing as one of the very first steps of the app (that means connecting to the actual instance using JDBC).
I am trying to use JDBI with Play 1.2.5 and I'm having a problem with running out of database connections. I am using the H2 in-memory database (in application.conf, db=mem).
I have created a class to obtain JDBI instances that uses Play's DB.datasource like so:
public class Database {
private static DataSource ds = DB.datasource;
private static DBI getDatabase() {
return new DBI(ds);
}
public static <T> T withDatabase(HandleCallback<T> hc) {
return getDatabase().withHandle(hc);
}
public static <T> T withTransaction(TransactionCallback<T> tc) {
return getDatabase().inTransaction(tc);
}
}
Every time I do a database call, a new DBI instance is created but it always wraps the same static DataSource object (play.db.DB.datasource)
What's happening is, after a while I am getting the following:
CallbackFailedException occured : org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
I am confused because the whole point of DBI.withHandle() and DBI.inTransaction() is to close the connection and free up resources when the callback method completes.
I also tried making getDatabase() return the same DBI instance every time, but the same problem occurred.
What am I doing wrong?
Duh. Turns out I was leaking connections in some old code that wasn't using withHandle(). As soon as I upgraded it, the problem stopped.
From the official documentation
Because Handle holds an open connection, care must be taken to ensure that each handle is closed when you are done with it. Failure to close Handles will eventually overwhelm your database with open connections, or drain your connection pool.
Turns out you are not guaranteeing the closing of the handle in your callback function whenever it is provided.
I have to test some Thrift services using JUnit. When I run my tests as a Thrift client, the services modify the server database. I am unable to find a good solution which can clean up the database after each test is run.
Cleanup is important especially because the IDs, which are currently read from an XML file, need to be unique. Right now I have to manually change the IDs after running tests, so that the next set of tests can run without throwing a primary key violation in the database. If I can clean up the database after each test run, then the problem is completely resolved; otherwise I will have to think about other solutions, like generating random IDs and using them wherever IDs are required.
Edit: I would like to emphasize that I am testing a service, which is writing to database, I don't have direct access to the database. But since, the service is ours, I can modify the service to provide any cleanup method if required.
If you are using Spring, all you need is the @DirtiesContext annotation on your test class.
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("/test-context.xml")
@DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD)
public class MyServiceTest {
....
}
Unless you are testing specific database actions (verifying you can query or update the database, for example), your JUnit tests shouldn't be writing to a real database. Instead you should mock the database classes. This way you don't actually have to connect to and modify the database, and therefore no cleanup is needed.
You can mock your classes a couple of different ways. You can use a library such as JMock which will do all the execution and validation work for you. My personal favorite way to do this is with Dependency Injection. This way I can create mock classes that implement my repository interfaces (you are using interfaces for your data access layer right? ;-)) and I implement only the needed methods with known actions/return values.
//Example repository interface.
public interface StudentRepository
{
public List<Student> getAllStudents();
}
//Example mock database class.
public class MockStudentRepository implements StudentRepository
{
//This method creates fake but known data.
public List<Student> getAllStudents()
{
List<Student> studentList = new ArrayList<Student>();
studentList.add(new Student(...));
studentList.add(new Student(...));
studentList.add(new Student(...));
return studentList;
}
}
//Example method to test.
public int computeAverageAge(StudentRepository aRepository)
{
List<Student> students = aRepository.getAllStudents();
int totalAge = 0;
for(Student student : students)
{
totalAge += student.getAge();
}
return totalAge/students.size();
}
//Example test method.
public void testComputeAverageAge()
{
int expectedAverage = 25; //What the expected answer of your result set is
int actualAverage = computeAverageAge(new MockStudentRepository());
assertEquals(expectedAverage, actualAverage);
}
How about using something like DBUnit?
Spring's unit testing framework has extensive capabilities for dealing with JDBC. The general approach is that each unit test runs in a transaction, and (outside of your test) the transaction is rolled back once the test is complete.
This has the advantage of being able to use your database and its schema, but without making any direct changes to the data. Of course, if you actually perform a commit inside your test, then all bets are off!
For more reading, look at Spring's documentation on integration testing with JDBC.
When writing JUnit tests, you can override two specific methods: setUp() and tearDown(). In setUp(), you can set up everything that's necessary to test your code, so you don't have to repeat it in each specific test case. tearDown() is called after each test method runs.
If possible, you could set it up so you can open your database in the setUp() method and then have it clear everything from the tests and close it in the tearDown() method. This is how we have done all testing when we have a database.
Here's an example:
@Override
protected void setUp() throws Exception {
super.setUp();
db = new WolfToursDbAdapter(mContext);
db.open();
//Set up other required state and data
}
@Override
protected void tearDown() throws Exception {
super.tearDown();
db.dropTables();
db.close();
db = null;
}
//Methods to run all the tests
Assuming you have access to the database: Another option is to create a backup of the database just before the tests and restore from that backup after the tests. This can be automated.
If you are using Spring + JUnit 4.x, then you don't need to insert anything in the DB.
Look at
AbstractTransactionalJUnit4SpringContextTests class.
Also check out the Spring documentation for JUnit support.
It's a bit draconian, but I usually aim to wipe out the database (or just the tables I'm interested in) before every test method execution. This doesn't tend to work as I move into more integration-type tests of course.
In cases where I have no control over the database, say I want to verify the correct number of rows were created after a given call, then the test will count the number of rows before and after the tested call, and make sure the difference is correct. In other words, take into account the existing data, then see how the tested code changed things, without assuming anything about the existing data. It can be a bit of work to set up, but it lets me test against a more "live" system.
In your case, are the specific IDs important? Could you generate the IDs on the fly, perhaps randomly, verify they're not already in use, then proceed?
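A minimal sketch of generating IDs on the fly, assuming the service accepts string IDs (that is an assumption; the question does not say what type the IDs are). `TestIds` and `newId` are hypothetical names. With random UUIDs a collision is extremely unlikely, so the "verify they're not already in use" step can usually be skipped, though that is a judgment call.

```java
import java.util.UUID;

// Hypothetical helper for tests that need a fresh primary key on every run.
final class TestIds {
    private TestIds() {
        // static helper, no instances
    }

    // Builds an ID like "student-<random UUID>". The prefix only makes
    // leftover test rows easy to spot and delete later.
    static String newId(String prefix) {
        return prefix + "-" + UUID.randomUUID();
    }
}
```

Each test would call `TestIds.newId("student")` instead of reading a fixed ID from the XML file, so repeated runs no longer hit primary key violations even when nothing is cleaned up.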
I agree with Brainimus if you're trying to test against data you have pulled from a database. If you're looking to test modifications made to the database, another solution would be to mock the database itself. There are multiple implementations of in-memory databases that you can use to create a temporary database (for instance during JUnit's setUp()) and then remove the entire database from memory (during tearDown()). As long as you're not using any vendor-specific SQL, this is a good way to test modifying a database without touching your real production one.
Some good Java databases that offer in memory support are Apache Derby, Java DB (but it is really Oracle's flavor of Apache Derby again), HyperSQL (better known as HSQLDB) and H2 Database Engine. I have personally used HSQLDB to create in-memory mock databases for testing and it worked great, but I'm sure the others would offer similar results.