My application is constantly evolving, so occasionally, when the version is upgraded, some tables need to be created/altered/deleted, some data modified, etc. In general, some SQL code needs to be executed.
Is there a Java library that can be used to keep my database structure up to date (by analyzing something like "db structure version" information and executing custom SQL code to update from one version to another)?
Also, it would be great to have some basic actions (like adding/removing a column) ready to use with minimal configuration, i.e. name/type and no SQL code.
Try DBDeploy. Although I haven't used it myself, it sounds like this project would help in your case. DBDeploy is a database refactoring manager that:
"Automates the process of establishing which database refactorings need to be run against a specific database in order to migrate it to a particular build."
It is known to integrate with both Ant and Maven.
Try Liquibase.
Liquibase is an open source (Apache 2.0 Licensed), database-independent library for tracking, managing and applying database changes. It is built on a simple premise: All database changes are stored in a human readable yet trackable form and checked into source control.
Supported features:
Extensibility
Merging changes from multiple developers
Code branches
Multiple Databases
Managing production data as well as various test datasets
Cluster-safe database upgrades
Automated updates or generation of SQL scripts that can be approved and applied by a DBA
Update rollbacks
Database "diff"s
Generating starting change logs from existing databases
Generating database change documentation
We use a piece of software called Liquibase for this. It's very flexible and you can set it up pretty much however you want it. We have it integrated with Maven so our database is always up to date.
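For reference, here is a minimal sketch of what such an update looks like when invoked through Liquibase's Java API rather than Maven. The JDBC URL, credentials, and the db/changelog.xml classpath location are assumptions, and the API shown is the long-standing pre-4.x style:

```java
import java.sql.Connection;
import java.sql.DriverManager;

import liquibase.Liquibase;
import liquibase.database.Database;
import liquibase.database.DatabaseFactory;
import liquibase.database.jvm.JdbcConnection;
import liquibase.resource.ClassLoaderResourceAccessor;

public class MigrateOnStartup {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details -- substitute your own.
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./appdb", "sa", "")) {
            Database database = DatabaseFactory.getInstance()
                    .findCorrectDatabaseImplementation(new JdbcConnection(conn));
            // db/changelog.xml is an assumed classpath location for the change log.
            Liquibase liquibase = new Liquibase(
                    "db/changelog.xml", new ClassLoaderResourceAccessor(), database);
            liquibase.update(""); // empty string = no contexts filter
        }
    }
}
```

Running this at application startup (or from the build) is what keeps the schema in sync with whatever change sets have been checked into source control.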
You can also check Flyway (400 questions tagged on SO) or MyBatis (1049 questions tagged). To add the other options mentioned to the comparison: Liquibase (663 questions tagged) and DBDeploy (24 questions tagged).
Another resource you may find useful is the feature comparison on the Flyway website (other related projects are mentioned there as well).
You should take a look at O/R mapping libraries, e.g. Hibernate.
Most ORM mappers have logic to do schema upgrades for you; I have successfully used Hibernate, which gets at least the basic stuff right automatically.
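For example, with Hibernate as the JPA provider, automatic schema upgrades can be switched on through the (Hibernate-specific) hbm2ddl.auto property. A minimal sketch, assuming a persistence unit named demo-unit:

```java
import java.util.Map;

import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class AutoSchemaBootstrap {
    public static void main(String[] args) {
        // "demo-unit" is an assumed persistence unit name.
        // "update" tells Hibernate to alter the schema to match the entities;
        // it adds missing tables/columns but never drops anything.
        EntityManagerFactory emf = Persistence.createEntityManagerFactory(
                "demo-unit",
                Map.of("hibernate.hbm2ddl.auto", "update"));
        emf.close();
    }
}
```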
Related
I've set up a Java project using jOOQ. Currently we are about to create a CI pipeline on Jenkins.
Ideally, we would prefer not to commit the generated code to the repo but to generate it during the build process. However, jOOQ requires a connection to the database in order to generate the code.
The first approach would be to allow Jenkins to connect to the database. In case we are forbidden from accessing the DB from Jenkins, what approaches should we consider?
Any comments or hints are welcome and much appreciated.
Why not commit generated code to a repository?
There are pros and cons to each approach, as you have noticed, but in general, committing the generated code has more pros. Treat that code like any other library with its own release cycle and versioning. You might have such libraries, call them libraryAbc-1.3.17.jar, and you don't have any issues committing that jar file to the repository, right? Especially when it's a third-party dependency.
Here's an interesting article illustrating the above with more details:
https://blog.jooq.org/2014/09/08/look-no-further-the-final-answer-to-where-to-put-generated-code
And a recent discussion on the jOOQ user group:
https://groups.google.com/d/msg/jooq-user/M3PKEhrXnZ8/0PyFVMfQAgAJ
Options for regenerating code without a database connection
Notice how that discussion references an option for re-generating the code from a meta model that is not the database, e.g.:
The XMLDatabase
The JPADatabase
The DDLDatabase
All of these have the advantage of taking their meta model from the file system, at the price of not supporting all the vendor-specific functionality that would be available when connecting directly to the database.
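For illustration, here is a rough sketch of driving the DDLDatabase from jOOQ's programmatic code generation API. The script path, target package, and output directory are assumptions, the jooq-meta-extensions artifact must be on the classpath, and the package names shown (org.jooq.meta.*, org.jooq.codegen.*) are those of recent jOOQ versions:

```java
import org.jooq.codegen.GenerationTool;
import org.jooq.meta.jaxb.Configuration;
import org.jooq.meta.jaxb.Database;
import org.jooq.meta.jaxb.Generator;
import org.jooq.meta.jaxb.Property;
import org.jooq.meta.jaxb.Target;

public class GenerateFromDdl {
    public static void main(String[] args) throws Exception {
        GenerationTool.generate(new Configuration()
            .withGenerator(new Generator()
                .withDatabase(new Database()
                    // Interpret DDL scripts instead of connecting to a live DB
                    .withName("org.jooq.meta.extensions.ddl.DDLDatabase")
                    .withProperties(new Property()
                        .withKey("scripts")
                        .withValue("src/main/resources/db/schema.sql"))) // assumed path
                .withTarget(new Target()
                    .withPackageName("com.example.generated") // assumed package
                    .withDirectory("target/generated-sources/jooq"))));
    }
}
```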
But why not use Testcontainers with your actual database product? An example can be seen here.
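A rough sketch of the Testcontainers idea, assuming Docker is available on the build machine and using an arbitrary postgres:15 image tag:

```java
import org.testcontainers.containers.PostgreSQLContainer;

public class CodegenDatabase {
    public static void main(String[] args) {
        // Spins up a throwaway PostgreSQL instance in Docker for the build.
        try (PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:15")) {
            postgres.start();
            String url = postgres.getJdbcUrl();
            String user = postgres.getUsername();
            String password = postgres.getPassword();
            // 1. Run your migrations (e.g. Flyway/Liquibase) against `url`.
            // 2. Point jOOQ's code generator at the same url/user/password.
            System.out.printf("Codegen DB at %s (%s/%s)%n", url, user, password);
        } // the container is stopped and discarded here
    }
}
```

This way the CI build never needs access to a shared database, yet the generated code still reflects the real database product's behavior.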
We are doing a project in which we have planned to use JPA Persistence. We think that once the project goes live, there is a small chance that changes in the data model might be required.
My question is: what strategies are available to handle such a change? In particular, I have the following questions:
With updated JPA classes, what are the best practices for incorporating them into the existing database schema?
With JPA, are there any best practices to archive old data, update the database schema, and migrate the data to the new schema?
What are the various kinds of changes (broadly speaking) that will make such a migration impossible?
In RHQ (http://rhq-project.org/) we have some dbutils with a schema description in XML that is used to populate the initial schema on an empty database, plus another XML file that registers changes to this base schema as individual "diffs" of DDL and DML statements.
Whenever a JPA class is changed (in a schema-relevant way), both XML files are updated. On the next run of the installer, it will look at the existing database, determine its version, and then play all the update steps from the version in the DB up to the most current one.
This dbutils code is available in git.
There are other frameworks around, like Liquibase, that can help you here.
You can also take a look at this framework:
http://flywaydb.org
Advertised as: "The agile database migration framework for Java"
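A minimal sketch of running Flyway from Java, using the fluent configuration API of Flyway 6+; the H2 connection details are placeholders:

```java
import org.flywaydb.core.Flyway;

public class RunMigrations {
    public static void main(String[] args) {
        // Connection details are assumptions. By default Flyway picks up SQL
        // files named V1__description.sql, V2__..., etc. from db/migration
        // on the classpath.
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:h2:./appdb", "sa", "")
                .load();
        flyway.migrate(); // applies all pending versioned migrations in order
    }
}
```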
In my experience, migrations are not the problem (Hibernate can do them automatically), but rollbacks are, if you are dealing with destructive changes. For example, if you remove a column, there's no way to roll back that change unless you have the data from that column backed up somewhere. The best way to do such backups probably depends on your DB vendor.
I was learning some JPA to teach to some Java friends, and I was wondering: how do you handle updates that come after the creation of the DB in JPA? Let's say I have a production environment with data that I cannot lose.
Some changes come in; how do I apply them to my production environment? Is there a way for JPA to apply only the changes to the database?
Or do I need to manually create an SQL script to update my database?
Are there any other options?
Some changes come in; how do I apply them to my production environment? Is there a way for JPA to apply only the changes to the database?
Nothing standardized. In other words, that would be a provider-specific feature. For example, Hibernate has a SchemaUpdate tool that can (in theory) safely update a database schema. In practice, many don't use it on a production database (including me).
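For illustration, this is roughly what invoking SchemaUpdate looks like with the Hibernate 5 API (earlier versions took a Configuration object instead). MyEntity is a hypothetical entity class, and connection settings are assumed to come from hibernate.cfg.xml:

```java
import java.util.EnumSet;

import org.hibernate.boot.Metadata;
import org.hibernate.boot.MetadataSources;
import org.hibernate.boot.registry.StandardServiceRegistry;
import org.hibernate.boot.registry.StandardServiceRegistryBuilder;
import org.hibernate.tool.hbm2ddl.SchemaUpdate;
import org.hibernate.tool.schema.TargetType;

public class SchemaUpdateDemo {
    public static void main(String[] args) {
        // Reads hibernate.cfg.xml for connection settings (an assumption).
        StandardServiceRegistry registry =
                new StandardServiceRegistryBuilder().configure().build();
        Metadata metadata = new MetadataSources(registry)
                .addAnnotatedClass(MyEntity.class) // hypothetical entity
                .buildMetadata();
        // Prints the ALTER statements and applies them; use with care in production.
        new SchemaUpdate().execute(
                EnumSet.of(TargetType.STDOUT, TargetType.DATABASE), metadata);
    }
}
```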
Or do I need to manually create an SQL script to update my database?
Using migration scripts (and maybe a database migration tool) is IMO the safe way to handle this and is the way to go on real life projects.
And again, some migration tools might provide support for a given JPA provider. For example, liquibase does offer Hibernate support and can diff your Entities against a database to generate a change script.
I just wanted to hear the opinion of Hibernate experts about DB schema generation best practices for Hibernate/JPA based projects. Especially:
What strategy to use when the project has just started? Is it recommended to let Hibernate automatically generate the schema in this phase or is it better to create the database tables manually from earliest phases of the project?
Supposing that throughout the project the schema has been generated using Hibernate, is it better to disable automatic schema generation and manually create the database schema just before the system is released into production?
And after the system has been released into production, what is the best practice for maintaining the entity classes and the DB schema (e.g. adding/renaming/updating columns, renaming tables, etc.)?
It's always recommended to generate the schema manually, preferably with a tool that supports database schema revisions, such as the great Liquibase. Generating the schema from the entities is great in theory, but is fragile in practice and causes lots of problems in the long run (trust me on this).
In production it's always best to have a manually generated and reviewed schema.
You make an update to an entity and create a matching update script (revision) to bring your database schema in line with the entity change. You can create a custom solution (I've written a few) or use something more popular like Liquibase (it even supports rollbacks of schema changes). If you're using a build tool such as Maven or Ant, it's recommended to plug the DB schema update utility into the build process so that fresh builds stay in sync with the schema.
Although disputable, I'd say that the answer to all 3 questions is: let hibernate automatically generate the tables in the schema.
I haven't had any problems with that so far. You might need to clean some fields up manually from time to time, but this is no headache compared to separately keeping track of DDL scripts, i.e. managing their revisions and synchronizing them with entity changes (and vice versa).
For deploying to production, an obvious tip: first make sure everything is generated OK in the test environment, and only then deploy to production.
Manually, because:
The same database may be used by different applications, and not all of them will be using Hibernate or even Java. The database schema should not be dictated by the ORM; it should be designed around the data and business requirements.
The datatypes chosen by Hibernate might not be best suited for the application.
As mentioned in an earlier comment, changes to the entities would require manual intervention if data loss is not acceptable.
Things such as additional properties (the generic term, not Java properties) on join tables work wonderfully in an RDBMS but are somewhat complex and inefficient to use through an ORM. Mapping from ORM to RDBMS like this might create tables that are not efficient. In theory, it is possible to build the exact same join table using Hibernate-generated code, but it would require some special care while writing the entities.
I would use automatic generation for standalone applications, or for databases that are accessed via the same ORM layer, and also if the app needs to be ported to different databases. It would save a lot of time by not requiring one to write and maintain DB-vendor-specific DDL scripts.
Like Bozhidar said, don't let Hibernate create and update the database schema.
Let your application create and update the database schema.
For Java, the best tool to do this is Flyway. You create one or more SQL files with DDL statements describing your database schema, and these SQL files are then executed by Flyway. For more information, look at the Flyway site.
I believe that a lot of what is being discussed or argued here also depends on whether you are more comfortable with the code-first or the database-first approach.
Personally, I am more inclined to go for the latter and, making a reference to the Single Responsibility Principle (SRP), I prefer having a DB specialist handle the DB and an application specialist handle the application, rather than having the application handle the DB. Additionally, I am of the opinion that taking too many shortcuts will work fine at the beginning but create unmanageable problems as things grow/evolve.
I'm working on a Java web application (Adobe Flex front-end, JPA/Hibernate/BlazeDS/Spring MVC backend) and will soon reach the point where I can no longer wipe the database and regenerate it.
What's the best approach for handling changes to the DB schema? The production and test databases are SQL Server 2005, devs use MySQL, and unit tests run against an in-memory HSQLDB database. I'm fine with having dev machines continue to wipe and reload the DB from sample data, using Hibernate to regenerate the tables. However, for a production deployment the DBA would like to have a DDL script that he can execute manually.
So, my ideal solution would be one where I can write Rails-style migrations, execute them against the test servers, and, after verifying that they work, write out SQL Server DDL that the DBA can execute on the production servers (and which has already been validated to work against the test servers).
What's a good tool for this? Should I be writing the DDL manually (and just let dev machines use Hibernate to regenerate the DB)? Can I use a tool like migrate4j (which seems to have limited support for SQL Server, if at all)?
I'm also looking to integrate DB manipulation scripts into this process (for example, converting a "Name" field into a "First Name", "Last Name" field via a JDBC script that splits all the existing strings).
Any suggestions would be much appreciated!
What's the best approach for handling changes to the DB schema?
Idempotent change scripts with a version table (and a tool to apply all the change scripts with a number greater than the version currently stored in the version table). Also check the mentioned post Bulletproof Sql Change Scripts Using INFORMATION_SCHEMA Views.
To implement this, you could roll your own solution or use existing tools like DbUpdater (mentioned in the comments of the change scripts post), LiquiBase or dbdeploy. The latter has my preference.
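To make the version-table idea concrete, here is a bare-bones sketch of such a tool in plain JDBC. The H2 connection details and the db/changes/1.sql, 2.sql, ... naming scheme are assumptions; real tools add transactions, checksums and locking on top of this:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ChangeScriptRunner {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./appdb", "sa", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE TABLE IF NOT EXISTS schema_version (version INT NOT NULL)");
            int current = 0;
            try (ResultSet rs = stmt.executeQuery("SELECT MAX(version) FROM schema_version")) {
                if (rs.next()) {
                    current = rs.getInt(1); // 0 when the table is empty (SQL NULL)
                }
            }
            // Apply every change script numbered above the stored version, in order.
            for (int v = current + 1; Files.exists(Path.of("db/changes/" + v + ".sql")); v++) {
                stmt.execute(Files.readString(Path.of("db/changes/" + v + ".sql")));
                stmt.execute("INSERT INTO schema_version (version) VALUES (" + v + ")");
            }
        }
    }
}
```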
I depend on Hibernate to create whatever it needs on the production server. There's no risk of losing data, because it never removes anything: it only adds what is missing.
On the current project, we have established a convention by which any feature that requires a change in the database (schema or data) must provide its own DDL/DML snippets, meaning that all we need to do is aggregate the snippets into a single script and execute it to bring production up to date. None of this works on a very large scale (the order of the snippets becomes critical, not everyone follows the convention, etc.), but in a small team and an iterative process it works just fine.