I'm working on a Java web application (Adobe Flex front-end, JPA/Hibernate/BlazeDS/Spring MVC backend) and will soon reach the point where I can no longer wipe the database and regenerate it.
What's the best approach for handling changes to the DB schema? The production and test databases are SQL Server 2005, dev's use MySQL, and unit tests run against an HSQLDB in-memory database. I'm fine with having dev machines continue to wipe and reload the DB from sample data using Hibernate to regenerate the tables. However, for a production deploy the DBA would like to have a DDL script that he can manually execute.
So, my ideal solution would be one where I can write Rails-style migrations, execute them against the test servers, and after verifying that they work be able to write out SQL Server DDL that the DBA can execute on the production servers (and which has already been validated to work agains the test servers).
What's a good tool for this? Should I be writing the DDL manually (and just let dev machines use Hibernate to regenerate the DB)? Can I use a tool like migrate4j (which seems to have limited support for SQL Server, if at all)?
I'm also looking to integrate DB manipulation scripts into this process (for example, converting a "Name" field into a "First Name", "Last Name" field via a JDBC script that splits all the existing strings).
Any suggestions would be much appreciated!
What's the best approach for handling changes to the DB schema?
Idempotent change scripts with a version table (and a tool to apply all the change scripts with a number greater than the version currently stored in the version table). Also check the mentioned post Bulletproof Sql Change Scripts Using INFORMATION_SCHEMA Views.
To implement this, you could roll out your own solutions or use existing tools like DbUpdater (mentioned in the comments of change scripts), LiquiBase or dbdeploy. The later has my preference.
I depend on hibernate to create whatever it needs on the production server. There's no risk of losing data because it never removes anything: it only adds what is missing.
On the current project, we have established a convention by which any feature which requires a change in the database (schema or data) is required to provide it's own DDL/DML snippets, meaning that all we need to do is to aggregate the snippets into a single script and execute it to get production up to date. None of this works on a very large scale (order of snippets becomes critical, not everyone follows the convention etc.), but in a small team and an iterative process it works just fine.
Related
im building a Spring Boot web app with MySQL and till now i used the
spring.jpa.hibernate.ddl-auto=create-drop
in my properties file, and now i want to move to production and i cant use this line anymore because as u all know its saving the data on the memory and the worst thing its destroying all the data and create a new table every deploy.
for dev purposes its wonderful but what i need to do next cause i want it to behave exactly as i was on the ddl-auto but to persistently save the data and most inportantly never to drop the data.
P.S. the hibernate.ddl-auto has nothig to do with the JPA Repository?
cause i use Crud Repository alot and i need this to continue working with Crud Repository, will it?
the best thing to do, according to me, is:
use the option spring.jpa.hibernate.ddl-auto=create-drop to create
the DB schema and the default data (if any) in development environment
export the created DB schema in a normal DDL
give the DDL to DBAs to check if any improvement must be done (e.g.
add some indexes, review some FK etc..)
adapt JPA models after DBAs review
give the final DDL to the "production DBAs" in order to create the
final correct schema in production environment too
Regarding to your question:
the hibernate.ddl-auto has nothig to do with the JPA Repository? cause
i use Crud Repository alot and i need this to continue working with
Crud Repository, will it?
You can of course use the crud repository; this option will not influence your business logic
I hope it's useful
Angelo
You don't want to be sending DDLs around. You will end up trying to invent version control system for your scripts, by naming them specifically or putting them in folders, you will struggle communicating with DBAs, scripts breaking..
You want your database definition code to be a part of your code base so you can put it under version control (yes, git).
Try to use Liquibase for this. It will help you do automatic updates of the schema, data, everything db related, and it knows how to migrate your app's db, lets say, from 1.1 to 1.2 but also from 1.1 to 1.6. There are also other db migration tools like Flyway, you can look them up and play around.
I've got an Oracle database that has two schemas in it which are identical. One is essentially the "on" schema, and the other is the "off" schema. We update data in the off schema and then switch the schemas behind an alias which our production servers use. Not a great solution, but it's what I've been given to work with.
My problem is that there is a separate application that will now be streaming data to the database (also handed to me) which is currently only updating the alias, which means it is only updating the "on" schema at any given time. That means that when the schemas get switched, all the data from this separate application vanishes from production (the schema it is in is now the "off" schema).
This application is using Hibernate 3.3.2 to update the database. There's Spring 3.0.6 in the mix as well, but not for the database updates. Finally, we're running on Java 1.6.
Can anyone point me in a direction to updating both "on" and "off" schemas simultaneously that does not involve rewriting the whole DAO layer using Spring JDBC to load two separate connection pools? I have not been able to find anything about getting hibernate to do this. Thanks in advance!
You shouldn't be updating two seperate databases this way, especially from the application's point of view. All it should know/care about is whether or not the data is there, not having to mess with two separate databases.
Frankly, this sounds like you may need to purchase an ETL tool. Even if you can't get it to update the 'on' schema from the 'off' one (fast enough to be practical), you will likely be able to use it to keep the two in sync (mirror changes from 'on' to 'off').
HA-JDBC is a replicating JDBC Driver we investigated for a short while. It will automatically replicate all inserts and updates, and distribute all selects. There are other database specific master-slave solutions as well.
On the other hand, I wouldn't recommend doing this for 4-8 hour procedures. Better lock the database before, update one database, and then backup-restore a copy, and then unlock again.
My application is always developing, so occasionally - when the version upgrades - some tables need to be created/altered/deleted, some data modified, etc. Generally some sql code needs to be executed.
Is there a Java library that can be used to keep my database structure up to date (by analyzing something like "db structure version" information and executing custom sql to code to update from one version to another)?
Also it would be great to have some basic actions (like add/remove column) ready to use with minimal configuration, ie name/type and no sql code.
Try DBDeploy. Although I haven't used it in the past, it sounds like this project would help in your case. DBDeploy is a database refactoring manager that:
"Automates the process of establishing
which database refactorings need to be
run against a specific database in
order to migrate it to a particular
build."
It is known to integrate with both Ant and Maven.
Try Liquibase.
Liquibase is an open source (Apache
2.0 Licensed), database-independent library for tracking, managing and
applying database changes. It is built
on a simple premise: All database
changes are stored in a human readable
yet trackable form and checked into
source control.
Supported features:
Extensibility
Merging changes from multiple developers
Code branches
Multiple Databases
Managing production data as well as various test datasets
Cluster-safe database upgrades
Automated updates or generation of SQL scripts that can be approved and
applied by a DBA
Update rollbacks
Database ”diff“s
Generating starting change logs from existing databases
Generating database change documentation
We use a piece of software called Liquibase for this. It's very flexible and you can set it up pretty much however you want it. We have it integrated with Maven so our database is always up to date.
You can also check Flyway (400 questions tagged on SOW) or mybatis (1049 questions tagged). To add to the comparison the other options mentioned: Liquibase (663 questions tagged) and DBDeploy (24 questions tagged).
Another resource that you can find useful is the feature comparison in the Flyway website (There are other related projects mentioned there).
You should take a look into OR Mapping libraries, e.g. Hibernate
Most ORM mappers have logic to do schema upgrades for you, I have successfully used Hibernate which gets at least the basic stuff right automatically.
I was learning some JPA to teach to some java friends and I was wondering, how do you handle updates that comes after the creation of the db in JPA? Let's say I have a production environment where there's data that I cannot lose.
Some changes comes in and how do I apply that on my production environment? It there a way that JPA would only update the changes on the database?
Or do I need to manually create a SQL script to update my database?
Is there any other options?
[]'s
Rodrigo Dellacqua
Some changes comes in and how do I apply that on my production environment? It there a way that JPA would only update the changes on the database?
Nothing standardized. In other words, that would be a provider specific feature. For example, Hibernate has a SchemaUpdate tool that can (in theory) safely update a database schema. In practice, many don't use that on a production database (including me).
Or do I need to manually create a SQL script to update my database?
Using migration scripts (and maybe a database migration tool) is IMO the safe way to handle this and is the way to go on real life projects.
And again, some migration tools might provide support for a given JPA provider. For example, liquibase does offer Hibernate support and can diff your Entities against a database to generate a change script.
I have an application where many "unit" tests use a real connection to an Oracle database during their execution.
As you can imagine, these tests take too much time to be executed, as they need to initialize some Spring contexts, and communicate to the Oracle instance. In addition to that, we have to manage complex mechanisms, such as transactions, in order to avoid database modifications after the test execution (even if we use usefull classes from Spring like AbstractAnnotationAwareTransactionalTests).
So my idea is to progressively replace this Oracle test instance by an in-memory database. I will use hsqldb or maybe better h2.
My question is to know what is the best approach to do that. My main concern is related to the construction of the in-memory database structure and insertion of reference data.
Of course, I can extract the database structure from Oracle, using some tools like SQL Developer or TOAD, and then modifying these scripts to adapt them to the hsqldb or h2 language. But I don't think that's the better approach.
In fact, I already did that on another project using hsqldb, but I have written manually all the scripts to create tables. Fortunately, I had only few tables to create. My main problem during this step was to "translate" the Oracle scripts used to create tables into the hsqldb language.
For example, a table created in Oracle using the following sql command:
CREATE TABLE FOOBAR (
SOME_ID NUMBER,
SOME_DATE DATE, -- Add primary key constraint
SOME_STATUS NUMBER,
SOME_FLAG NUMBER(1) DEFAULT 0 NOT NULL);
needed to be "translated" for hsqldb to:
CREATE TABLE FOOBAR (
SOME_ID NUMERIC,
SOME_DATE TIMESTAMP PRIMARY KEY,
SOME_STATUS NUMERIC,
SOME_FLAG INTEGER DEFAULT 0 NOT NULL);
In my current project, there are too many tables to do that manually...
So my questions:
What are the advices you can give me to achieve that?
Does h2 or hsqldb provide some tools to generate their scripts from an Oracle connection?
Technical information
Java 1.6, Spring 2.5, Oracle 10.g, Maven 2
Edit
Some information regarding my unit tests:
In the application where I used hsqldb, I had the following tests:
- Some "basic" unit tests, which have nothing to do with DB.
- For DAO testing, I used hsqldb to execute database manipulations, such as CRUD.
- Then, on the service layer, I used Mockito to mock my DAO objects, in order to focus on the service test and not the whole applications (i.e. service + dao + DB).
In my current application, we have the worst scenario: The DAO layer tests need an Oracle connection to be run. The services layer does not use (yet) any mock objects to simulate the DAO. So services tests also need an Oracle connection.
I am aware that mocks and in-memory database are two separates points, and I will address them as soon as possible. However, my first step is to try to remove the Oracle connection by an in-memory database, and then I will use my Mockito knowledges to enhance the tests.
Note that I also want to separate unit tests from integration tests. The latter will need an access to the Oracle database, to execute "real" tests, but my main concern (and this is the purpose of this question) is that almost all of my unit tests are not run in isolation today.
Use an in-memory / Java database for testing. This will ensure the tests are closer to the real world than if you try to 'abstract away' the database in your test. Probably such tests are also easier to write and maintain. On the other hand, what you probably do want to 'abstract away' in your tests is the UI, because UI testing is usually hard to automate.
The Oracle syntax you posted works well with the H2 database (I just tested it), so it seems H2 supports the Oracle syntax better than HSQLDB. Disclaimer: I'm one of the authors of H2. If something doesn't work, please post it on the H2 mailing list.
You should anyway have the DDL statements for the database in your version control system. You can use those scripts for testing as well. Possibly you also need to support multiple schema versions - in that case you could write version update scripts (alter table...). With a Java database you can test those as well.
By the way, you don't necessarily need to use the in-memory mode when using H2 or HSQLDB. Both databases are fast even if you persist the data. And they are easy to install (just a jar file) and need much less memory than Oracle.
Latest HSQLDB 2.0.1 supports ORACLE syntax for DUAL, ROWNUM, NEXTVAL and CURRVAL via a syntax compatibility flag, sql.syntax_ora=true. In the same manner, concatenation of a string with a NULL string and restrictions on NULL in UNIQUE constraints are handled with other flags. Most ORACLE functions such as TO_CHAR, TO_DATE, NVL etc. are already built in.
At the moment, to use simple ORACLE types such as NUMBER, you can use a type definition:
CREATE TYPE NUMBER AS NUMERIC
The next snapshot will allow NUMBER(N) and other aspects of ORACLE type compatibility when the flag is set.
Download from http://hsqldb.org/support/
[Update:] The snapshot issued on Oct 4 translates most Oracle specific types to ANSI SQL types. HSQLDB 2.0 also supports the ANSI SQL INTERVAL type and date / timestamp arithmetic the same way as Oracle.
What are your unit tests for?
If they test the proper working of DDLs and stored procedures then you should write the tests "closer" to Oracle: either without Java code or without Spring and other nice web interfaces at all focusing on the db.
If you want to test the application logic implemented in Java and Spring then you may use mock objects/database connection to make your tests independent of the database.
If you want to test the working as a whole (what is against the modular development and testing principle) then you may virtualize your database and test on that instance without having the risk of doing some nasty irreversible modifications.
As long as your tests clean up after themselves (as you already seem to know how to set up), there's nothing wrong with running tests against a real database instance. In fact it's the approach I usually prefer, because you'll be testing something as close to production as possible.
The incompatibilities seem small, but really end up biting back not so long afterwards. In a good case, you may get away with some nasty sql translation / extensive mockery. In bad cases, parts of the system will be just impossible to test, which I think is an unacceptable risk for business-critical systems.