I have an application that uses SQL Server. I want to move to a NoSQL store, and I decided it should be a graph database since my data is highly connected. Neo4j is an option.
Ideally, I want to be able to switch databases without touching the application layer, say, by just modifying some XML configuration files.
I've taken a look at some public examples on the web and seen that Hibernate ORM and OGM don't configure applications the same way: the config file of each has its own name and, more importantly, its own structure. Looking at the code of each revealed that they also differ in the way they initialize the session, which doesn't sound good for what I have in mind.
My question is: is it possible, or feasible without great overhead, to switch between the two databases without touching the existing application code? I may add things, but not touch what already exists. It would be great to establish true polyglot persistence between SQL and NoSQL databases, for example using Hibernate.
I want to hear from you before digging deeper. Are any of the Hibernate folks here with us on SO?
The goal of Hibernate OGM is to offer a unified abstraction for various NoSQL data stores. The project is still young as we speak, so I am not sure you can adopt it right out of the box.
There is also the problem of transactions. If your application was designed to use SQL transactions, then things will change radically when you switch to a NoSQL solution.
Using an abstraction layer is good for portability, but it doesn't offer all the power of native querying. It's the same problem with JPQL, which only covers SQL-92, lacking support for window functions or CTEs.
Polyglot persistence is a great feature, but try using separate repositories, like the ones Spring Data offers. I find that much more flexible from an architectural point of view.
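To illustrate what I mean, here is a minimal sketch (the entity and repository names are made up, and it assumes Spring Data JPA and Spring Data Neo4j on the classpath; annotation and interface names differ a bit between versions). Each store gets its own repository, and the rest of the application only depends on those interfaces:

    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import org.neo4j.ogm.annotation.NodeEntity;
    import org.springframework.data.jpa.repository.JpaRepository;
    import org.springframework.data.neo4j.repository.Neo4jRepository;

    // Relational entity, stored in SQL Server via JPA
    @Entity
    class Invoice {
        @Id @GeneratedValue Long id;
        String number;
    }

    // Graph entity, stored in Neo4j via Spring Data Neo4j
    @NodeEntity
    class Person {
        @org.neo4j.ogm.annotation.Id @org.neo4j.ogm.annotation.GeneratedValue Long id;
        String name;
    }

    // One Spring Data repository per store; services depend only on these interfaces
    interface InvoiceRepository extends JpaRepository<Invoice, Long> { }
    interface PersonRepository extends Neo4jRepository<Person, Long> { }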
Referring to a similar question:
Pattern for connecting to different databases using JDBC
I am using different connection strings/drivers for each database. This is what I am doing, though I'm not sure it's the most efficient way to do it:
Create a separate class for each DB's connection, with a getConnection(String url, String userid, String password) method in it
In the main class, get connection objects for DB1, DB2, and DB3, and open the connections
Fetch data from DB1, write it to a flat file, then repeat for DB2 and DB3
Close all three connections.
NOTE: I read about using Spring/Hibernate/DataSources/connection pooling. I don't know which would be the best option.
The way I understand it is that you want your application to run some (SELECT?) queries on different databases and dump the results. I presume this is a part of a larger application since otherwise you would probably get results quicker by simply writing a command-line script that automates the client tools for the specific databases.
Hibernate, data sources (in the sense of the Java DataSource object) and connection pooling won't solve your problem - I guess it's the same for Spring, but I don't know which part of Spring you're referring to. The reason is that they are all designed to abstract over a single connection (or a pool/collection of connections) to a single database - connection pooling simply allows you to keep a pool of ready-to-use (TCP) connections to a given database in order to improve performance, for example by avoiding connection and authentication overhead. Hibernate does the same in the sense that it abstracts a connection to a single database (and can use connection pooling for performance reasons on top of that).
I would suggest taking a different approach to thinking about your problem:
Since you want to run some queries on some data source and write the results to some destination, why not start your design this way: come up with an interface/class DataExtractionTask that requires a database connection, a set of queries to run, and some output stream. Instead of using java.sql.Connection directly, you could choose a framework to make your life easier; there are heavyweights like Hibernate and lightweights like jdbi. Then write the code that establishes your database connections, decides which queries to run and which outputs to write to, and feeds all of that into your DataExtractionTask to run the processing logic (orchestrating the individual parts).
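A minimal sketch of what such a DataExtractionTask could look like (the names and the row formatting are made up, and the error handling is kept deliberately simple):

    import java.io.IOException;
    import java.io.Writer;
    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.List;

    // One task = one connection + the queries to run + the output to write to.
    // Implementing Runnable keeps the door open for running several tasks in parallel later.
    class DataExtractionTask implements Runnable {
        private final Connection connection;
        private final List<String> queries;
        private final Writer output;

        DataExtractionTask(Connection connection, List<String> queries, Writer output) {
            this.connection = connection;
            this.queries = queries;
            this.output = output;
        }

        @Override
        public void run() {
            try (Statement statement = connection.createStatement()) {
                for (String query : queries) {
                    try (ResultSet rs = statement.executeQuery(query)) {
                        while (rs.next()) {
                            output.write(rs.getString(1)); // format the row however the flat file needs
                            output.write('\n');
                        }
                    }
                }
                output.flush();
            } catch (SQLException | IOException e) {
                throw new RuntimeException("Extraction failed", e);
            }
        }
    }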
Once you have the basic pieces in place you can add other features on top: you could make it configurable, run multiple DataExtractionTasks in parallel instead of sequentially, et cetera.
This way you can generalize the processing logic and then focus on getting everything (database connections, query definitions, etc.) ready for processing. I realize this is a very broad picture, but maybe it makes things a bit easier.
Regarding efficiency: if you mean high performance (relative terms!), the best way would be what @Elliott Frisch wrote -- keeping it all in a single database that you connect to using a single connection pool.
You don't need separate classes just for connecting; just build a util class which holds all the JDBC URLs and obtain connections from it.
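For example, a rough sketch (the URLs and credentials are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // Single place that knows every JDBC URL; callers just ask for the connection they need.
    final class Connections {
        static Connection db1() throws SQLException {
            return DriverManager.getConnection("jdbc:mysql://host1/db1", "user", "password");
        }

        static Connection db2() throws SQLException {
            return DriverManager.getConnection("jdbc:oracle:thin:@host2:1521:db2", "user", "password");
        }

        static Connection db3() throws SQLException {
            return DriverManager.getConnection("jdbc:sqlserver://host3;databaseName=db3", "user", "password");
        }

        private Connections() {
        }
    }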
Besides that, you should consider using JPA instead, which you can do in Java SE as well as in Java EE. With that, you can abstract away from the low-level connection and define a named data source. See for example this Oracle tutorial.
As part of my Java program, I need to run a lot of queries against an (Oracle) database.
Currently, we create a mix of SQL and Java, which (I know) is a bad, bad thing.
What is the right way to handle something like this? If possible, include examples.
Thank you.
EDIT:
A bit more information about the application: it is a web application that derives its content mainly from the database (it takes user input and paints the content to be shown next based on what the database believes to be true).
The biggest concern I have with how it's done today is that mixing Java code and SQL queries looks "out of place" when they are coupled as tightly as they are (queries hardcoded as part of the source code).
I am looking for a cleaner way to handle this situation, one that would improve the maintainability and clarity of the project at hand.
For what you've described, incorporating an object relational mapper (ORM) or rewriting as stored procedures is probably more work than you want to embrace. Both have non-trivial learning curves.
Instead, a good practice is consolidating SQL in a class per table or purpose. Take a look at the Table Data Gateway and Data Access Object design patterns to see how this is done in practice.
The benefits of this approach are myriad. You are better positioned for reuse because queries are in one spot. Client code becomes more readable as you replace several lines of JDBC and SQL with a method call (e.g. userTableDataGateway.getContentToShow(pageId)). Finally, this will help you see more clearly the problem an ORM helps solve.
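As a rough sketch of such a gateway (the table, column, and class names are made up for illustration):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;
    import javax.sql.DataSource;

    // All SQL touching the user_content table lives here; callers only see methods.
    class UserTableDataGateway {
        private final DataSource dataSource;

        UserTableDataGateway(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        List<String> getContentToShow(long pageId) throws SQLException {
            String sql = "SELECT content FROM user_content WHERE page_id = ?";
            List<String> result = new ArrayList<>();
            try (Connection connection = dataSource.getConnection();
                 PreparedStatement ps = connection.prepareStatement(sql)) {
                ps.setLong(1, pageId);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        result.add(rs.getString("content"));
                    }
                }
            }
            return result;
        }
    }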
Well, one thing you could consider is an Object Relational Mapper (for example, Hibernate). This would allow you to map your database schema to Java objects, which would generally clean up your Java code.
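To give a feel for the mapping, a minimal, hypothetical example (the table and columns are made up):

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // One class per table, one field per column; the ORM generates the SQL for you.
    @Entity
    @Table(name = "CUSTOMER")
    public class Customer {
        @Id
        @GeneratedValue
        private Long id;

        @Column(name = "NAME")
        private String name;

        public Long getId() { return id; }

        public String getName() { return name; }

        public void setName(String name) { this.name = name; }
    }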
However, if performance and speed are of the essence, you might be better off using a plain JDBC driver.
This would of course also depend on the task your application is trying to accomplish. If, for example, you need to do batch updates based on a CSV file, I might go with a pure JDBC solution. If you're designing a web application, I would definitely go with an ORM solution.
Also, note that a pure JDBC solution would involve having SQL in your Java code. Actually, for that matter, you would have to have some form of SQL, be it HQL, JPQL, or plain SQL, in any ORM solution as well. Point being, there's nothing wrong with some SQL in your Java application.
Edit in response to the OP's edits
If I were writing a web application from scratch, I would use an ORM. However, since you already have a working application, making the transition from a pure JDBC solution to an ORM would be pretty painful. It would clean up your code, but there is a significant learning curve involved and it takes quite a bit of setup. Some of the pain of setting up would be alleviated if you are working with some sort of bean-management system, like Spring, but it would still be pretty significant.
It would also depend on where you want to go with your application. If you plan on maintaining and adding to this code for a significant period, a refactor may be in order. I would not, however, recommend a re-write of your system just because you don't like having SQL hard-coded in your application.
Based on your updates, I concur with Tim Pote's edits re: the learning curve to integrate an ORM. However, instead of integrating an ORM, you could do things like using prepared statements whose SQL you store in a properties file. Or even store your queries in the DB, so that you can make subtle updates to them that are picked up immediately without restarting your app server. Both of these strategies would declutter your Java code of hard-coded SQL.
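A small sketch of the properties-file idea (the file name, key, and query are hypothetical):

    import java.io.IOException;
    import java.io.InputStream;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.Properties;

    // The SQL text lives in queries.properties on the classpath, e.g.:
    //   user.name.by.id=SELECT name FROM users WHERE id = ?
    // so the Java code only refers to queries by name.
    class QueryRepository {
        private final Properties queries = new Properties();

        QueryRepository() throws IOException {
            try (InputStream in = getClass().getResourceAsStream("/queries.properties")) {
                queries.load(in);
            }
        }

        String findUserName(Connection connection, long userId) throws SQLException {
            try (PreparedStatement ps =
                         connection.prepareStatement(queries.getProperty("user.name.by.id"))) {
                ps.setLong(1, userId);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }
    }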
Ultimately though, I don't think there's a clear answer to your question, because there's nothing inherently wrong with what you're doing. It's just a bit inflexible, but perhaps acceptably so for your circumstances.
That said, I'm posting this as an answer!
I'm not sure of the state of the project, but you may also want to look at an 'alternative' object-relational mapper called MyBatis. It has a lower learning curve than the popular Hibernate or EclipseLink and lets you actually write the queries, so you know what the code is doing. That is, if ORM is your thing.
I'm working with JPA right now (mainly because it is the current trend and it needs to be learned). JPA is the Java standard for ORM. If you are going to learn what is currently the typical ORM way of doing things, JPA is probably the best way to go. Frameworks like Hibernate and EclipseLink implement it. Depending on which framework you choose to underpin your JPA app, you can use proprietary features, but that will tie you to that framework pretty much for good. JPA is not hard to start using, but it can be very cryptic when it doesn't work, since it obfuscates the interaction with the database quite a bit (mind you, it does allow the option of using native SQL queries, but that kind of negates the reason people say JPA-style DB access is good).
And yes, there are still people using JDBC with prepared statements. Normally there are practices/patterns you will use when programming with plain old JDBC that act like a very, very minimalist ORM... or really, closer to MyBatis. Again, if you go this route, use prepared statements. They negate a number of dangers (SQL injection, in particular).
This is a religious kind of question, so, the way you wrote it, you will hear a lot of proselytizing. In fact, someone might shoot down your question for this. I think the only thing you could ask that might be worse is whether Emacs or vi is better, to a crowd of Unix geeks.
Your question seems too generic; however, if you have a mix of direct SQL on Oracle and SQL in Java, it would be better to invest some time in an ORM like Hibernate or Apache Cayenne. An ORM is a separate design approach that segregates database operations from the Java side: all the DB interactions and the DB design are implemented in the ORM layer, and all the access and business logic reside in Java. This is just a suggestion; I'm still unclear about your actual problem, though.
The biggest concern I have with how it's done today is that mixing Java code and SQL queries looks "out of place" when they are coupled as tightly as they are (queries hardcoded as part of the source code).
This assumption of yours is not really "correct" in the sense that there is a true/false answer to your question. This question explains that there are several ways of dealing with mixing Java and SQL:
Java Programming - Where should SQL statements be stored?
It essentially distinguishes between SQL being:
Hardcoded in business objects
Embedded in SQLJ clauses
Encapsulated in separate classes e.g. Data Access Objects
Metadata driven (decouple the object schema from the data schema - describe the mappings between them in metadata)
Put into external files (e.g. Properties or Resource files)
Put into stored procedures
I'll add to that:
Embedded in CriteriaQuery statements (see the sketch after this list)
Embedded in jOOQ statements.
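For the CriteriaQuery option, here is a minimal JPA Criteria sketch (Customer is a hypothetical JPA entity with a name attribute):

    import java.util.List;
    import javax.persistence.EntityManager;
    import javax.persistence.criteria.CriteriaBuilder;
    import javax.persistence.criteria.CriteriaQuery;
    import javax.persistence.criteria.Root;

    // The query is built as Java objects instead of a SQL/JPQL string.
    class CustomerQueries {
        List<Customer> findByName(EntityManager em, String name) {
            CriteriaBuilder cb = em.getCriteriaBuilder();
            CriteriaQuery<Customer> query = cb.createQuery(Customer.class);
            Root<Customer> root = query.from(Customer.class);
            query.select(root).where(cb.equal(root.get("name"), name));
            return em.createQuery(query).getResultList();
        }
    }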
Apache Cayenne is one of the easiest ORMs to use. It comes with Cayenne Modeler to model data objects and do the mappings. I would recommend Cayenne for a beginner in ORM. It can create mapping classes and sync with the DB through the modeler.
I have to create a MySQL database to be used by several applications in parallel for the first time. Up until this point, my only experience with MySQL databases has been single programs (for example, web servers) querying the database.
Now I am moving into a scenario where I will have several CXF Java servlet-type programs, as well as a background server, editing and reading the same schemas.
I am using the Connector/J JDBC driver to connect to the database in all instances.
My question is this: what do I need to do in order to make sure that the parallel access does not become a problem? I realize that I need to use transactions where appropriate, but where I am truly lost is in the management.
For example:
Do I need to close the connection every time a servlet is done with a job?
Do I need a unique user for each program accessing the database?
Do I have to do something with my Connector/J objects?
Do I have to declare my tables in a different way?
Did I miss anything, or is there something I failed to think about?
I have a pretty good idea about how to handle transactions and the SQL itself, but I am pretty lost when it comes to what I need to do when setting up my database.
You should maintain a pool of connections. Connections are really expensive to create; think on the order of several hundred milliseconds. So for high-volume apps it makes sense to cache and reuse them.
For your servlets it depends on what container you are using. Something like JBoss will provide pooling as part of the container; it can be defined through the datasource definition and accessed through JNDI. Other containers like Tomcat may rely on something like C3P0.
Most of these frameworks return custom implementations of JDBC connections that implement the close() method with logic that returns the connection to the pool. You should familiarize yourself with the details of your concrete implementation to make sure you are doing things in a way that is supported.
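A sketch of what that typically looks like from application code (the JNDI name is a placeholder for whatever your container defines):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    class PooledLookupExample {
        void readSomething() throws Exception {
            // The container manages the pool; we only look it up.
            DataSource ds = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/AppDS");
            try (Connection connection = ds.getConnection();   // borrowed from the pool
                 PreparedStatement ps = connection.prepareStatement("SELECT 1")) {
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // use the row
                    }
                }
            }   // close() here hands the connection back to the pool rather than really closing it
        }
    }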
As for the concurrency considerations, you should familiarize yourself with the concepts of optimistic/pessimistic locking and transaction isolation levels. These have trade-offs, and the correct answer can only be determined given the operational context of your application.
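For instance, with plain JDBC the isolation level and transaction boundaries are set on the connection. This is only a sketch; which level is right depends entirely on your workload:

    import java.sql.Connection;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    class TransferExample {
        void transfer(DataSource dataSource) throws SQLException {
            try (Connection connection = dataSource.getConnection()) {
                connection.setAutoCommit(false);
                connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
                try {
                    // ... run the statements that must succeed or fail together ...
                    connection.commit();
                } catch (SQLException e) {
                    connection.rollback();
                    throw e;
                }
            }
        }
    }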
Considering the user: most applications have one user that represents the application, called the read/write user. This user should only have privileges to read and write records in the tables, indices, sequences, etc. that are associated with your application. All instances of the application will specify this user in their connection string.
If you familiarize yourself with the concepts above, you'll be about 95% of the way there.
One more thing. As pointed out in the comments, on the administration side your database engine is a huge consideration. You should familiarize yourself with the differences and the tuning/configuration options.
This is just a theoretical question.
I use JDBC in my Java applications to work with the database (select, insert, update, delete, or whatever).
I manually write Java classes which will contain data from DB tables (attribute = DB column). Then I run queries (ResultSet) and fill those classes with data. I am not sure if this is the right way.
But I've read a lot about JDO and other persistence solutions.
Can someone please recommend the most widely used JDBC alternatives, based on their experience?
I would also like to know the advantages of JDO over JDBC (in simple words).
I've been able to Google a lot of this stuff, but first-hand opinions are always best.
Thanks
The story of database persistence in Java is already long and full of twists and turns:
JDBC is the low-level API that everybody ultimately uses to talk to a database. But without a higher-level API on top, you have to do all the grunt work yourself (writing SQL queries, mapping results to objects, etc.).
EJB 1.0 CMP Entity Beans were a first attempt at a higher-level API and were adopted by the big Java EE providers (BEA, IBM) but not by users. Entity Beans were too complex and had too much overhead (read: poor performance). FAIL!
EJB 2.0 CMP tried to reduce some of the complexity of Entity Beans with the introduction of local interfaces, but the majority of the complexity remained. EJB 2.0 also lacked portability (because the object-relational mappings were not part of the spec, and the deployment descriptors were thus proprietary). FAIL!
Then came JDO, which is a datastore-agnostic standard for object persistence (it can be used with an RDBMS, an OODBMS, XML, Excel, LDAP). But while there are several open-source implementations, and while JDO was adopted by small independent vendors (mostly OODBMS vendors hoping that JDO users would later switch from their RDBMS datastore to an OODBMS - which obviously never happened), it failed to be adopted by the big Java EE players and by users (because of weaving, which was a pain at development time and scared some customers; a weird query API; and being actually too abstract). So, while the standard itself is not dead, I consider it a failure. FAIL!
And indeed, despite the existence of two standards, proprietary APIs like TopLink (an old player) or Hibernate were preferred by users over EJB CMP and JDO for object-to-relational-database persistence (competition between standards, the unclear positioning of JDO, the earlier failure of CMP, and bad marketing all share responsibility for this, I believe), and Hibernate actually became the de facto standard in this field (it's a great open-source framework). SUCCESS!
Then Sun realized they had to simplify things (and, more generally, the whole Java EE), and they did it in Java EE 5 with JPA, the Java Persistence API, which is part of EJB 3.0 and is the new standard for object-to-relational-database persistence. JPA unifies the EJB 2 CMP, JDO, Hibernate, and TopLink APIs/products and seems to succeed where EJB CMP and JDO failed (ease of use and adoption). SUCCESS!
To summarize, Java's standard for database persistence is JPA, and it should be preferred over the proprietary APIs (using Hibernate's implementation of JPA is fine, but code against the JPA API), unless an ORM is not what you need. It provides a higher-level API than JDBC and is meant to save you a lot of manual work (this is simplified, but that's the idea).
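As a small, hypothetical taste of what that higher level looks like ("my-unit" is a placeholder persistence-unit name, and Customer is a hypothetical mapped entity with a name property):

    import javax.persistence.EntityManager;
    import javax.persistence.EntityManagerFactory;
    import javax.persistence.Persistence;

    class JpaExample {
        public static void main(String[] args) {
            EntityManagerFactory emf = Persistence.createEntityManagerFactory("my-unit");
            EntityManager em = emf.createEntityManager();

            em.getTransaction().begin();
            Customer customer = new Customer();
            customer.setName("Alice");
            em.persist(customer);          // no hand-written INSERT: the provider generates it
            em.getTransaction().commit();

            Customer found = em.find(Customer.class, 1L);  // the provider generates the SELECT

            em.close();
            emf.close();
        }
    }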
If you want to write the SQL yourself, and don't want an ORM, you can still benefit from frameworks which hide all the tedious connection handling (try-catch-finally). Eventually you will forget to close a connection...
One such framework that is quite easy to use is Spring JdbcTemplate.
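A small sketch with JdbcTemplate (the table and query are made up): you still write the SQL, but connection handling, statement closing, and exception translation are done for you.

    import java.util.List;
    import javax.sql.DataSource;
    import org.springframework.jdbc.core.JdbcTemplate;

    class UserNameQuery {
        private final JdbcTemplate jdbcTemplate;

        UserNameQuery(DataSource dataSource) {
            this.jdbcTemplate = new JdbcTemplate(dataSource);
        }

        // No connection or statement management in sight; the template takes care of it.
        List<String> namesStartingWith(String prefix) {
            return jdbcTemplate.query(
                    "SELECT name FROM users WHERE name LIKE ?",
                    (rs, rowNum) -> rs.getString("name"),
                    prefix + "%");
        }
    }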
I can recommend Hibernate. It is widely used (and for good reasons), and the fact that the Java Persistence API specification was led by the main designer of Hibernate guarantees that it will be around for the foreseeable future :-) If portability and vendor neutrality are important to you, you may use it via JPA, so in the future you can easily switch to another JPA implementation.
Lacking personal experience with JDO, I can't really compare the two. However, the benefits of Hibernate (or ORM in general) at first sight seem to be pretty much the same as what is listed on the JDO page. To me the most important points are:
DB neutrality: Hibernate supports several SQL dialects in the background, switching between DBs is as easy as changing a single line in your configuration
performance: lazy fetching by default, and a lot more optimizations going on under the hood, which you would need to handle manually with JDBC
you can focus on your domain model and OO design instead of lower level DB issues (but you can of course fine-tune DML and DDL if you wish so)
One potential drawback (of ORM tools in general) is that it is not that suitable for batch processing. If you need to update 1 million rows in your table, ORM by default will never perform as well as a JDBC batch update or a stored procedure. Hibernate can incorporate stored procedures though, and it supports batch processing to some extent (I am not familiar with that yet, so I can't really say whether it is up to the task in this respect compared to JDBC - but judging from what I know so far, probably yes). So if your app requires some batch processing but mostly deals with individual entities, Hibernate can still work. If it is predominantly doing batch processing, maybe JDBC is a better choice.
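To illustrate the kind of JDBC batching meant above, a sketch (the table and column names are made up):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    class PriceBatchUpdater {
        void updatePrices(Connection connection, List<long[]> idAndPrice) throws SQLException {
            try (PreparedStatement ps = connection.prepareStatement(
                    "UPDATE product SET price = ? WHERE id = ?")) {
                for (long[] row : idAndPrice) {
                    ps.setLong(1, row[1]);   // new price
                    ps.setLong(2, row[0]);   // product id
                    ps.addBatch();           // queue the update instead of executing it immediately
                }
                ps.executeBatch();           // send the whole batch to the database at once
            }
        }
    }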
Hibernate requires that you have an object model to map your schema to. If you're still thinking only in terms of relational schemas and SQL, perhaps Hibernate is not for you.
You have to be willing to accept the SQL that Hibernate will generate for you. If you think you can do better with hand-coded SQL, perhaps Hibernate is not for you.
Another alternative is iBatis. If JDBC is raw SQL, and Hibernate is ORM, iBatis can be thought of as something between the two. It gives you more control over the SQL that's executed.
JDO builds off JDBC technology. Similarly, Hibernate still requires JDBC as well. JDBC is Java's fundamental specification on database connectivity.
This means JDBC will give you greater control but it requires more plumbing code.
JDO provides higher abstraction and less plumbing code, because a lot of the complexity is hidden.
If you are asking this question, I am guessing you are not familiar with JDBC. I think a basic understanding of JDBC is required in order to use JDO, Hibernate, or any other higher-abstraction tool effectively. Otherwise, you may encounter scenarios where the ORM tool exhibits behavior you don't understand.
Sun's Java tutorial provides decent introductory material which walks you through JDBC: http://java.sun.com/docs/books/tutorial/jdbc/.
Have a look at MyBatis. It is often overlooked, but great for read-only complex queries using proprietary features of your DBMS.
http://www.mybatis.org
Here is how it goes with Java persistence. You have just learnt Java, and now you want to persist some records, so you learn JDBC. You are happy that you can now save your data to a database. Then you decide to write a bit bigger application. You realize that it has become tedious to try, catch, open connections, close connections, transfer data from the ResultSet to your bean... So you think there must be an easier way; in Java there is always an alternative. So you do some googling, and in a short while you discover ORM and, most likely, Hibernate. You are so excited that you now don't have to think about connections. Your tables are being created automatically. You are able to move very fast.
Then you decide to undertake a really big project. Initially you move very fast and you have all the CRUD operations in place. The requirements keep coming, and then one day you are cornered. You try to save, but it's not cascading to the object's children. Some things don't work as explained in the books you have read. You don't know what to do, because you didn't write the Hibernate libraries. You wish you had written the SQL yourself. It's now time to rethink again... As you mature, you realize that the best way to interact with the database is through SQL. You also realize that some tools get you started very fast but can't keep you going for long. This is my story. I am now a very happy iBATIS user.
Ebean ORM is another alternative: http://ebean-orm.github.io/
Ebean uses JPA Annotations for Mapping but it is architected to be sessionless. This means that you don't have the attached/detached concepts and you don't persist/merge/flush - you just simply save() your beans.
I'd expect Ebean to be much simpler to use than Hibernate, JPA, or JDO.
So if you are looking for a powerful alternative approach to JDO or JPA you could have a look at Ebean.
JPA/Hibernate is a popular choice for ORM. It can provide you with just about every ORM feature that you need. The learning curve can be steep for those with basic ORM needs.
There are lots of alternatives to JPA that provide ORM with less complexity for developers with basic ORM requirements. Search SourceForge, for example:
http://sourceforge.net/directory/language:java/?q=ORM
I am partial to my solution, Sormula: sourceforge or bitbucket. Sormula was designed to minimize complexity while providing basic ORM.
Hibernate, surely. It's popular, and there is even a .NET version (NHibernate).
Also, Hibernate can be easily integrated with the Spring Framework.
And it will fit most developers' needs.
A new and exciting alternative is GORM, which is the ORM implementation from Grails. It can now be used standalone.
Under the hood it uses Hibernate, but gives you a nice layer on top with cool dynamic finders etc.
All these different abstraction layers eventually use JDBC. The whole idea is to automate some of the tedious and error-prone work, much in the same way that compilers automate a lot of the tedious work in writing programs (resizing a data structure - no problem, just recompile).
Note, however, that in order for these to work there are assumptions that you will need to adhere to. These are usually reasonable and quite easy to work with, especially if you start from the Java side as opposed to having to work with existing database tables.
JDO is the convergence of various projects into a single Sun standard, and it is the one I would suggest you learn. For the implementation, choose the one your favorite IDE suggests in its various wizards.
There is also Torque (http://db.apache.org/torque/), which I personally prefer because it's simpler and does exactly what I need.
With Torque I can define a database with MySQL (well, I use PostgreSQL, but MySQL is supported too), and Torque can then inspect the database and generate Java classes for each table in it. With Torque you can then query the database and get back Java objects of the correct type.
It supports WHERE clauses (either with a Criteria object or by writing the SQL yourself) and joins.
It also supports foreign keys, so if you have a User table and a House table, where a user can own 0 or more houses, there will be a getHouses() method on the User object which gives you the list of House objects the user owns.
To get a first look at the kind of code you can write, take a look at http://db.apache.org/torque/releases/torque-3.3/tutorial/step5.html which contains examples which show how to load/save/query data with torque. (All the classes used in this example are auto-generated based on the database definition).
I recommend using Hibernate; it's a really fantastic way of connecting to the database. Earlier versions had a few issues, but it has since become more stable.
It uses ORM-based mapping, reduces the time you spend writing queries to an extent, and allows you to change databases with minimal effort.
If you require any video-based tutorials, please let me know; I can upload them to my server and send you the link.
Use Hibernate as a standalone JAR file and distribute it to your different web apps. This is by far the best solution out there. You have to design your classes, interfaces, and enums to implement an abstract DAO pattern. As long as you have correct entities and mappings, you will only need to work with objects (entities) and not HQL.
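A rough sketch of such an abstract DAO, built on the plain Hibernate Session API (entity-agnostic; subclass it once per entity):

    import java.io.Serializable;
    import org.hibernate.Session;
    import org.hibernate.SessionFactory;

    abstract class AbstractDao<T, ID extends Serializable> {
        private final SessionFactory sessionFactory;
        private final Class<T> entityClass;

        protected AbstractDao(SessionFactory sessionFactory, Class<T> entityClass) {
            this.sessionFactory = sessionFactory;
            this.entityClass = entityClass;
        }

        // Assumes the session/transaction is managed outside (e.g. by a filter or Spring).
        protected Session session() {
            return sessionFactory.getCurrentSession();
        }

        public void save(T entity) {
            session().saveOrUpdate(entity);
        }

        public void delete(T entity) {
            session().delete(entity);
        }

        public T findById(ID id) {
            return entityClass.cast(session().get(entityClass, id));
        }
    }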