Connecting to multiple databases efficiently - java

Referring to a similar question:
Pattern for connecting to different databases using JDBC
I am using a different connection string and driver for each database. This is what I am doing; I am not sure it is the most efficient way to do it:
Create a separate class for each database's connection, with a getConnection(String url, String userid, String password) method in it.
In the main class, get a connection object for DB1, DB2 and DB3 and open the connections.
Fetch data from DB1, write it to a flat file, then repeat for DB2 and DB3.
Close all three connections.
NOTE: I have read about using Spring/Hibernate/DataSources/connection pooling, but I don't know which would be the best option.

The way I understand it is that you want your application to run some (SELECT?) queries on different databases and dump the results. I presume this is a part of a larger application since otherwise you would probably get results quicker by simply writing a command-line script that automates the client tools for the specific databases.
Hibernate, data sources (in the sense of the Java DataSource object) and connection pooling won't solve your problem. I guess the same holds for Spring, but I don't know which part of Spring you're referring to. The reason is that they are all designed to abstract over a single connection (or a pool of connections) to a single database. Connection pooling simply keeps a pool of ready-to-use (TCP) connections to a given database in order to improve performance, for example by avoiding connection and authentication overhead. Hibernate does the same in the sense that it abstracts a connection to a single database (and can use connection pooling on top of that for performance reasons).
I would suggest to maybe take a different approach to thinking about your problem:
Since you want to run some queries on some data source and write the results to some destination, why don't you start your design this way: come up with an interface/class DataExtractionTask that requires a database connection, a set of queries to run and some output stream. Instead of using java.sql.Connection directly, you could choose a framework to make your life easier; there are heavyweights like Hibernate and lightweights like jdbi. Then write the code that establishes your database connection, decides which queries to run and which outputs to write to, and feeds all of that into your DataExtractionTask, which runs the processing logic (orchestrating the individual parts).
Once you have the basic stuff in place you can add other features on top of it, you could make it configurable, you could choose to run multiple DataExtractionTasks in parallel instead of sequentially, et cetera.
This way you can generalize the processing logic and then focus on getting everything (database connections, query definitions, etc.) ready for processing. I realize that this is very broad-picture but maybe it makes things a bit easier.
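To make the idea a bit more concrete, here is a minimal sketch of what such a DataExtractionTask could look like with plain JDBC. The class layout, the method names and the '|' delimiter are only assumptions for illustration, not a prescribed design.

```java
import java.io.IOException;
import java.io.Writer;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical task: runs a list of queries on one connection and dumps the rows. */
final class DataExtractionTask {
    private final Connection connection;
    private final List<String> queries;
    private final Writer out;

    DataExtractionTask(Connection connection, List<String> queries, Writer out) {
        this.connection = connection;
        this.queries = queries;
        this.out = out;
    }

    /** Runs every query in order and writes one delimited line per result row. */
    void run() throws SQLException, IOException {
        for (String query : queries) {
            try (Statement statement = connection.createStatement();
                 ResultSet rows = statement.executeQuery(query)) {
                int columnCount = rows.getMetaData().getColumnCount();
                while (rows.next()) {
                    List<String> fields = new ArrayList<>();
                    for (int i = 1; i <= columnCount; i++) {
                        fields.add(rows.getString(i)); // null for SQL NULL
                    }
                    out.write(joinFields(fields));
                    out.write(System.lineSeparator());
                }
            }
        }
        out.flush();
    }

    /** Joins the fields of one row with '|'; a SQL NULL becomes an empty field. */
    static String joinFields(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append('|');
            }
            String field = fields.get(i);
            if (field != null) {
                line.append(field);
            }
        }
        return line.toString();
    }
}
```

The task does not open or close the connection or the writer itself. The calling code creates one task per database, which is also what makes it easy to run several tasks in parallel later.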
Regarding efficiency: If you mean high performance (relative terms!), the best way would be what @Elliott Frisch wrote: keeping it all in a single database that you connect to using a single connection pool.

You don't need to use separate classes just for connecting, just build up a util class which holds all the JDBC URLs and obtain a connection from it.
Besides that, you should consider using JPA instead, which you can do in Java SE as well as in Java EE. With that, you can abstract from the low-level connection and define a named data source. See for example this Oracle tutorial.

Related

Considerations when calling mysql database in parallel

For the first time, I have to create a MySQL database that will be used by several applications in parallel. Up until this point my only experience with MySQL databases has been single programs (for example web servers) querying the database.
Now I am moving into a scenario where I will have several CXF Java servlet-type programs, as well as a background server, reading and editing the same schemas.
I am using the Connector/J JDBC driver to connect to the database in all instances.
My question is this: what do I need to do in order to make sure that the parallel access does not become a problem? I realize that I need to use transactions where appropriate, but where I am truly lost is in the management.
For example:
Do I need to close the connection every time a servlet is done with a job?
Do I need a unique user for each program accessing the database?
Do I have to do something with my Connector/J objects?
Do I have to declare my tables in a different way?
Did I miss anything, or is there something I failed to think about?
I have a pretty good idea about how to handle transactions and the SQL itself, but I am pretty lost when it comes to what I need to do when setting up my database.
You should maintain a pool of connections. Connections are really expensive to create; think on the order of several hundred milliseconds. So for high-volume apps it makes sense to cache and reuse them.
For your servlet it depends on what container you are using. Something like JBoss will provide pooling as part of the container. It can be defined through the datasource definition and accessed through JNDI. Other containers like Tomcat may rely on something like c3p0.
Most of these frameworks return custom implementations of JDBC connections whose close() method returns the connection to the pool instead of closing it. You should familiarize yourself with the details of your concrete implementation to make sure you are doing things in a way that is supported.
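In practice this means that each job borrows a connection, uses it and calls close() as soon as it is done, which hands the connection back to the pool. A minimal sketch, assuming the pool is exposed as a standard javax.sql.DataSource (the PooledWork and SqlWork names are made up for this example):

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper that borrows a connection for exactly one unit of work. */
final class PooledWork {
    private PooledWork() {}

    /** A piece of database work that needs a connection and may fail with SQLException. */
    interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /**
     * Borrows a connection, runs the work and always calls close().
     * With a pooling DataSource, close() typically returns the connection to the pool.
     */
    static <T> T withConnection(DataSource pool, SqlWork<T> work) throws SQLException {
        try (Connection connection = pool.getConnection()) {
            return work.apply(connection);
        }
    }
}
```

So the answer to "do I need to close the connection every time a servlet is done with a job?" is yes, as long as the connection came from a pool: closing is what makes it available to the next request.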
As for the concurrency considerations, you should familiarize yourself with concepts of optimistic/pessimistic locking and transaction isolation levels. These have trade offs where the correct answer can only be determined given the operational context of your application.
Regarding the user: most applications have one user that represents the application, often called the read/write user. This user should only have privileges to read and write records in the tables, indices, sequences, etc. that are associated with your application. All instances of the application will specify this user in their connection string.
If you familiarize yourself with the concepts above, you'll be about 95% of the way there.
One more thing. As pointed out in the comments, on the administration side your database engine is a huge consideration. You should familiarize yourself with the differences between engines and with their tuning/configuration options.

For moderately complex Java Desktop app; should JavaDB connection be static?

I am writing a moderately complex Java desktop app, including an embedded database. I do not see any reason why, after the app establishes a connection to the database, it should close the connection before the app is going to shut down.
Practically everything one does with the database requires a connection; transactions can be started and completed serially in the connection, the app is not doing anything fantastically complicated with the database.
Is there any reason why I should not create the connection and put a reference to it in a static variable in a class known and used by database-specific classes? It would save having the connection have to be passed around among all kinds of methods without ever changing value.
Is there a design-level consideration I'm missing somewhere?
I would suggest using a library such as c3p0 or dbcp which handles connection pooling for you. It gives you the flexibility to scale up your application later if necessary.
Anything static usually makes it harder to write proper test cases, since you never know whether the static resource has been altered or not.
Three months down the road, you're going to want to be able to connect to two databases at the same time - maybe you're doing some import / export work, or an upgrade job, or merging two customers together. And then you're going to want two of them. And now suddenly that static field everyone uses is a nightmare.
You could look into an IoC container like Guice or Spring to ensure that you can keep track of "singleton" objects without abusing static fields to enforce their "Singleton"ness.
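Even without a container you get most of the benefit by handing the connection source to each class through its constructor. A minimal sketch, assuming a javax.sql.DataSource is available (for example from c3p0 or dbcp); the CustomerDao name and its isReachable() method are hypothetical:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Objects;
import javax.sql.DataSource;

/** Hypothetical DAO that receives its DataSource instead of reading a static field. */
final class CustomerDao {
    private final DataSource dataSource;

    CustomerDao(DataSource dataSource) {
        this.dataSource = Objects.requireNonNull(dataSource, "dataSource");
    }

    /** Returns true when a connection can be obtained and validated within two seconds. */
    boolean isReachable() {
        try (Connection connection = dataSource.getConnection()) {
            return connection.isValid(2);
        } catch (SQLException e) {
            return false;
        }
    }
}
```

When the second database shows up, you simply create a second CustomerDao with a different DataSource, and in tests you can pass in a fake one.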
Avoid statics. Think about the concurrency and multithreading issues with this kind of variable. A good approach is to handle your connections with a database connection pool. Spring is your friend for getting a simple and clean configuration.
I do not see any reason why, after the app establishes a connection to the
database, it should close the connection before the app is going to shut down.
That seems completely fine to me. It's an embedded database; it is at the service
of your application. Create the connection when you start, use it as long as you
need, shut it down when your application closes down.

SQL server stub for java

I have a java application that is using MSSQL server through the JDBC driver. Is there some kind of stub that I can use for testing? For example I want to test how my application handle cases of connection errors, SQL server out of disk, and other exceptions. It's pretty hard and complex to simulate this with real SQL server.
Thanks
You could write unit tests against your DAOs or repositories returning mock Connection objects using a mock library such as https://mocquer.dev.java.net/.
You'd need a really clean and decoupled application architecture though in order to make this work correctly and provide you with actual test coverage.
You could (assuming the system is architected in a way that makes this easy) create your own versions of the DB access classes (I assume you are using the Statement/PreparedStatement interfaces), which would wrap the real DB calls and which you can modify to do exactly what you want.
I've done this - it takes a day or so of really boring work.
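If writing the wrappers by hand is too tedious, a JDK dynamic proxy can stand in for a Connection. The following is only a sketch of that idea, not a complete stub: the FailingConnections class and its alwaysFailing method are hypothetical names, and every call except close(), isClosed() and the Object methods throws the exception you pass in.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical factory for Connection stubs that simulate a broken database. */
final class FailingConnections {
    private FailingConnections() {}

    /** Returns a Connection whose JDBC methods throw the given exception. */
    static Connection alwaysFailing(final SQLException failure) {
        InvocationHandler handler = (proxy, method, arguments) -> {
            switch (method.getName()) {
                case "close":
                    return null; // closing a broken connection is allowed
                case "isClosed":
                    return Boolean.FALSE;
                case "toString":
                    return "FailingConnection(" + failure.getMessage() + ")";
                case "hashCode":
                    return System.identityHashCode(proxy);
                case "equals":
                    return proxy == arguments[0];
                default:
                    throw failure; // createStatement, prepareStatement, commit, ...
            }
        };
        return (Connection) Proxy.newProxyInstance(
                FailingConnections.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                handler);
    }
}
```

Hand such a connection to the code under test and assert that it retries, logs or reports the error as intended. Methods that do not declare SQLException (for example setClientInfo, which declares only a subclass) would surface the failure wrapped in an UndeclaredThrowableException, so a fuller stub would treat those separately.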
I don't think there's something like that.
You'd be better off setting up your own database and testing on your machine/lan.
All I know there is out there, is:
freeSQL
db4free
Both support MySQL, but neither supports MS-SQL. I think that has to do with licensing issues and limitations, so I'm afraid you won't find a similar service for an MS-SQL database.
Answering myself with an option I thought of, I'll be glad to hear your inputs on it.
After crawling around, I got to HyperSQLDB, a java-implemented database.
How feasible do you think it is to take the source code of HSQLDB and add another layer to it, so that I can control it and inject predefined behaviors into it?
For example, I'll make it run all queries slowly, I'll make it disconnect, etc.
Do you think this idea is worth pursuing? Is it doable in a reasonable amount of time?
If you use something other than MS-SQL, you may cause more testing problems due to incompatibilities and lack of functionality (e.g., transactions) than you solve. So I'm with Carl - use a shim.
If you were looking for unit-test coverage of ordinary behavior, I might think differently.
I haven't used them personally, but the stuff you're talking about sounds like a really good fit for a mocking framework, such as Mockito(docs) or PowerMock. They appear to provide good support for the kind of failure injection you're after. Can someone with experience with either of them (or similar) weigh in? See also How to stub/mock JDBC ResultSet to work both with Java 5 and 6?
Execute the procedure sp_who2; it lists all current connections and processes in your database. There is a column named spid for each connection. Just run KILL <spid> to terminate that connection. If the spid is less than 50, it is a system process, so don't kill it. This can help you replicate connection drops.
You can also run ALTER DATABASE dbname SET SINGLE_USER WITH ROLLBACK IMMEDIATE, which drops all connections to that database immediately.
SELECT @@MAX_CONNECTIONS AS Max_Connections gives you the maximum number of connections that can be made (you can set it to a low number to test connection unavailability).
To replicate a query timeout, set the query timeout to a very low number and execute a fairly large query.
To create a disk space error, simply reduce the size of the database file and do not allow it to grow, then insert data into the database (you'll get an exception):
ALTER DATABASE xxx MODIFY FILE (NAME = ..., MAXSIZE = ..., FILEGROWTH = ...)

How to implement the "Shared database, separate schema" multi-tenant strategy

I have to make a web application multi-tenant enabled using Shared database separate schema approach. The application is built using Java/J2EE and Oracle 10g.
I need to have one single appserver using a shared database with multiple schema, one schema per client.
What is the best implementation approach to achieve this?
What needs to be done at the middle tier (app-server) level?
Do I need to have multiple host headers each per client?
How can I connect to the correct schema dynamically based on the client who is accessing the application?
At a high level, here are some things to consider:
You probably want to hide the tenancy considerations from day-to-day development. Thus, you will probably want to tuck them away in your infrastructure as much as possible and keep them separate from your business logic. You don't want to be always checking which tenant's context you are in... you just want to be in that context.
If you are using a unit of work pattern, you will want to make sure that any unit of work (except one that is operating in a purely infrastructure context, not in a business context) executes in the context of exactly one tenant. If you are not using the unit of work pattern... maybe you should be. Not sure how else you are going to follow the advice in the point above (though maybe you will be able to figure out a way).
You probably want to put a tenant ID into the header of every messaging or HTTP request. Probably better to keep this out of the body on principle of keeping it away from business logic. You can scrape this off behind the scenes and make sure that behind the scenes it gets put on any outgoing messages/requests.
I am not familiar with Oracle, but in SQL Server and I believe in Postgres you can use impersonation as a way of switching tenants. That is to say, rather than parameterizing the schema in every SQL command and query, you can just have one SQL user (without an associated login) that has the schema for the associated tenant as its default schema, and then leave the schema out of your day-to-day SQL. You will have to intercept calls to the database and wrap them in an impersonation call. Like I say, I'm not exactly sure how this works out in Oracle, but that's the general idea for SQL Server.
Authentication and security are a big concern here. That is far beyond the scope of what I can discuss in this answer but make sure you get that right.
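For the question of connecting to the correct schema dynamically, one way to keep tenancy in the infrastructure is a thread-bound tenant ID that is applied whenever a connection is handed out. The sketch below is an assumption-heavy illustration: the TenantSchemas class, the TENANT_ prefix and the ID pattern are made up, and it relies on Oracle's ALTER SESSION SET CURRENT_SCHEMA statement rather than on the impersonation approach described for SQL Server.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical infrastructure helper that maps the current tenant to a schema. */
final class TenantSchemas {
    private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();
    // Schema names cannot be bound as parameters, so only plain identifiers are accepted.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z][A-Za-z0-9_]{0,19}");

    private TenantSchemas() {}

    /** Called by a request filter once the tenant ID has been read from the header. */
    static void enter(String tenantId) {
        CURRENT_TENANT.set(tenantId);
    }

    /** Called at the end of the request so the thread does not leak a tenant. */
    static void leave() {
        CURRENT_TENANT.remove();
    }

    /** Maps a tenant ID to its schema name, rejecting anything that is not a plain identifier. */
    static String schemaFor(String tenantId) {
        if (tenantId == null || !SAFE_ID.matcher(tenantId).matches()) {
            throw new IllegalArgumentException("invalid tenant id");
        }
        return "TENANT_" + tenantId.toUpperCase(Locale.ROOT);
    }

    /** Switches the session of a freshly borrowed connection to the current tenant's schema. */
    static void applyCurrentTenant(Connection connection) throws SQLException {
        String tenantId = CURRENT_TENANT.get();
        if (tenantId == null) {
            throw new IllegalStateException("no tenant bound to this thread");
        }
        try (Statement statement = connection.createStatement()) {
            statement.execute("ALTER SESSION SET CURRENT_SCHEMA = " + schemaFor(tenantId));
        }
    }
}
```

With pooled connections the schema has to be set every time a connection is borrowed, because the previous borrower may have belonged to a different tenant. Business code then runs unqualified SQL and never mentions the schema.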

Java with database storage

I have to write a program that can take bookings, store them and then access them at a later time. The application has to be written in Java. Because of this I have been looking into various ways to use a database with Java.
I have been looking into using JDBC with the MySQL driver, and also into Java DB. What would you recommend that I do to create this program? Does anyone have experience of writing a program that uses a DB in Java, and could you give me any tips?
Thanks!
I'm sure every Java developer has written an application that talks to a database using JDBC. Many people stop using raw JDBC fairly quickly either using something like Spring JDBC wrapper or a full blown ORM such as Hibernate. All these will make writing the database layer that little bit easier but I generally feel you should have a reasonable understanding of what is going on under the hood before diving into them.
So, the in-process or dedicated database question? It depends. How many people will be accessing the application at once? (An in-process database would be useless if you need multiple people accessing the same database.) How much control over the environment do you have? (It is a waste of time suggesting a dedicated database if you can't install it.) Is the application web based or a desktop application? (A web-based solution could use an in-process database and allow multiple users to access it.) How important is the data? (An in-process database is less likely to have the same level of backup solutions as a dedicated database.)
In your case, I'd use Java 6 and Java DB (aka Derby). It's not that installing and using MySQL is complicated (it's quite simple, actually), but why would you do that if you already have a capable database?
That said, to get started with Java DB, have a look at the Java DB Reference; there are plenty of technical articles there. Pay special attention to Working with the Java DB (Derby) Database - NetBeans IDE 6.5 Tutorial.
For the data access itself, you could use the Java Persistence API (JPA) as in Creating a Custom Java Desktop Database Application. But I wouldn't do that for homework. Instead, I'd start with the basics, i.e. with JDBC, and do everything by hand. You mentioned it and I think it's a good idea. Have a look at Tutorial: Java databasing with Derby, Java's own open source database; it might be very useful (I'm not saying the code shown there promotes all best practices, but it's pretty simple and will get you started).
Don't pollute your mind with advanced topics such as connection pooling, don't use frameworks like Hibernate, JPA or even Spring (these frameworks know how to do things, you don't and the point is not to learn using frameworks, at least not right now). Keep it simple and sexy but do it by hand.
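To give an idea of what "by hand" looks like, here is a minimal sketch of storing and reading bookings with plain JDBC. The booking table, its customer and slot columns and the BookingStore class are assumptions made up for this example, and try-with-resources requires Java 7 or later (on Java 6 you would close the statement in a finally block instead).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data-access class for a booking(customer, slot) table. */
final class BookingStore {
    private BookingStore() {}

    /** Inserts one booking and returns the number of rows written. */
    static int addBooking(Connection connection, String customer, String slot) throws SQLException {
        String sql = "INSERT INTO booking (customer, slot) VALUES (?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, customer); // parameters are bound, never concatenated
            statement.setString(2, slot);
            return statement.executeUpdate();
        }
    }

    /** Returns all slots booked by the given customer, ordered by slot. */
    static List<String> slotsFor(Connection connection, String customer) throws SQLException {
        String sql = "SELECT slot FROM booking WHERE customer = ? ORDER BY slot";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, customer);
            try (ResultSet rows = statement.executeQuery()) {
                List<String> slots = new ArrayList<>();
                while (rows.next()) {
                    slots.add(rows.getString("slot"));
                }
                return slots;
            }
        }
    }
}
```

The connection itself would come from DriverManager.getConnection with the JDBC URL of whichever database you pick; nothing in the two methods depends on whether that is Java DB or MySQL.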
Sun has two JDBC tutorials online that you might find useful, JDBC Introduction and JDBC Basics. These are part of a much larger series of Java Tutorials that have a lot of great information. They're usually the first place I look when I run into something I don't know how to do in Java.
One really simple approach is to use Spring's JDBC classes, which provide a lot of convenience methods over raw JDBC and automatically manage the releasing of resources. Spring also provides a rich exception hierarchy, which allows specific actions to be taken based on specific errors (e.g. retry on database deadlock).
Also, unlike many other persistence layers Spring does not hide the JDBC implementation under the covers (it acts as a thin layer), allowing you to use raw JDBC if necessary.
Initialisation
// First create a DataSource corresponding to a single connection.
// Your production application could potentially use a connection pool.
// true == suppress close of underlying connection when close() is called.
DataSource ds = new SingleConnectionDataSource("com.MyDriver", "Database URL", "user", "password", true);
// Now create a SimpleJdbcTemplate passing in the DataSource. This provides many
// useful utility methods.
SimpleJdbcTemplate tmpl = new SimpleJdbcTemplate(ds);
SQL Update (or Insert)
// Simple example of SQL update. The resources are automatically released if an exception
// occurs. SQLExceptions are translated into Spring's rich DataAccessException exception
// hierarchy, allowing different actions to be performed based on the specific type of
// error.
int nRows = tmpl.update("update Foo set Name = ? where Id = ?", "Hello", 5);
assert nRows == 1;
SQL Query
// Example of SQL query using ParameterizedRowMapper to translate each row of the
// ResultSet into a Person object.
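List<Person> people = tmpl.query("select * from Person", new ParameterizedRowMapper<Person>() {
    // mapRow must be public to implement the interface method.
    public Person mapRow(ResultSet rs, int rowNum) throws SQLException {
        return new Person(rs.getString("FirstName"), rs.getString("LastName"));
    }
});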
Check these examples: http://www.roseindia.net/jdbc/jdbc-mysql/. There are a lot of code samples about using JDBC with MySQL.
I can certainly recommend Java DB (the database that comes with Java 6). It was originally the Derby database from IBM and is very capable. It integrates well with a Java program (either running as a standalone process, or in the same VM as the client program). You won't have to worry about an additional non-Java install (which may be important, depending on how/where you're deploying).
If you want/need an ORM (object-relational mapping), Hibernate is the de facto Java standard these days (here's an introduction). However, if that's overkill for your project, at least check out Apache Commons DbUtils, which will reduce a lot of your boilerplate code.
