Database data upgrade with versioning by environment - java

I have a Java app which I deploy on various platforms (using Ansible).
This app uses a database, which sometimes needs to get schema updates, which I perform and log/version with flyway (as a software dependency).
I now face the need to update data on all platforms, but with different values depending on the platform. This is not a schema update, but it is nonetheless data (the list of other apps to which mine connects) that forms the main structure of my app, and as such I want it to be versioned, in a similar way to what Flyway does.
At first I was thinking I should define the different data in my Ansible configuration, which seemed to make sense as it's Ansible that knows about the various platforms. And then I thought that this information would get passed to Flyway somehow so that it performs the required updates.
However if that is handled using 'versioned migrations', I could end up with version conflicts because one environment requires an update and another doesn't (common versioning vs environment versioning).
There is a slight mention of this issue in the Flyway FAQ, and one can set the flyway.locations property, or maybe I could use Flyway placeholders that are set by Ansible?
Am I on the right track? Or should I avoid Flyway altogether for this (is it meant to be used with DML, or should it be reserved for DDL)?

Flyway can be used for both schema and data updates, although its primary purpose is versioning schema updates.
It sounds like you need a way to deploy some scripts only in certain environments. Flyway provides functionality that will support this workflow. However, you'll need to decide on the approach that works best for you.
Here are some ideas.
Use different locations
The simplest way I can think of is to have environment specific scripts in their own locations. You can also have a location for 'common' scripts.
When you deploy, you can specify the 'common' location, alongside the environment specific one. Something like:
flyway migrate -locations=common/sql,test/sql
flyway migrate -locations=common/sql,production/sql
And so on.
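Since the app already pulls in Flyway as a software dependency (as the question mentions), the same split can also be driven from Java at startup. Here is a minimal sketch; the app.environment system property (which the Ansible-generated start script would set), the JDBC URL, and the sql/common plus per-environment folder layout are assumptions, not anything Flyway prescribes:

import org.flywaydb.core.Flyway;

public class Migrator {
    public static void main(String[] args) {
        // e.g. started with -Dapp.environment=test (hypothetical property set by Ansible)
        String env = System.getProperty("app.environment", "test");

        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:postgresql://localhost/app", "app", "secret") // placeholder URL
                // common scripts plus the folder for this environment only
                .locations("filesystem:sql/common", "filesystem:sql/" + env)
                .load();

        flyway.migrate();
    }
}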
shouldExecute script config & placeholders
Another way is to use the Flyway Teams feature shouldExecute. This lets you define a boolean expression to determine whether a script should be run, and you can inject a value into it from a placeholder. There is a blog post that explains more about it.
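The placeholder half of that can be wired up from the Java API as well (or via flyway.placeholders.* configuration); Ansible supplies the per-platform values and the migration scripts reference them as ${...}. A rough sketch; the placeholder names, values, and table are invented for illustration, and the exact shouldExecute expression syntax is covered in the Teams documentation and the blog post mentioned above:

import java.util.HashMap;
import java.util.Map;
import org.flywaydb.core.Flyway;

public class PlaceholderMigrator {
    public static void main(String[] args) {
        // Values that Ansible would template per platform (hypothetical names)
        Map<String, String> placeholders = new HashMap<>();
        placeholders.put("environment", "test");
        placeholders.put("billing_service_url", "https://billing.test.example.com");

        Flyway.configure()
                .dataSource("jdbc:postgresql://localhost/app", "app", "secret") // placeholder URL
                .placeholders(placeholders)
                .load()
                .migrate();

        // A migration script could then contain, for example:
        //   INSERT INTO connected_apps (name, url) VALUES ('billing', '${billing_service_url}');
    }
}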
Use the cherryPick configuration option
Another Teams Edition feature is cherryPick, which allows you to specify exactly which scripts to deploy. So you might have a configuration file per environment with a cherryPick config that specifies the exact scripts to run. This one might be unwieldy since you need to explicitly list every script, but it does give you complete control.

Related

Using Flyway for two databases, but only one at a time

I'm testing out PostgreSQL and CockroachDB with my application. I've got it such that I can run my application with either PostgreSQL OR CockroachDB. Is it possible to set Flyway up such that I can run either with Flyway support without errors occurring from also having it configured for the other database I'm not using at the moment?
I've tried looking for documentation that answers this, but it seems that most documentation in this area pertains to running both databases concurrently, which isn't what I'm trying to do here.
Not a huge deal, but I am curious... Thank you!
The default behavior of Flyway uses the config file. Issuing a command like flyway migrate will go to the configured database with the designated locations (folders where the migrations are stored).
So, to be able to switch on the fly, you have two choices. You can create two config files and pick the appropriate one at execution time from the command line, or take direct control of the configuration settings through the command line. Either way, two different command lines with the appropriate settings for where the migrations are stored and how to connect to the database should let you do exactly this.
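As an illustration of the "take direct control" option, the same switch can be made from application code instead of the command line: one flag picks the connection settings and the migration folder for the run. The property name, URLs, and folder names below are assumptions:

import org.flywaydb.core.Flyway;

public class DbSwitchingMigrator {
    public static void main(String[] args) {
        // e.g. -Ddb=postgres or -Ddb=cockroach (hypothetical property name)
        boolean cockroach = "cockroach".equals(System.getProperty("db", "postgres"));

        String url = cockroach
                ? "jdbc:postgresql://localhost:26257/app"  // CockroachDB speaks the PostgreSQL wire protocol
                : "jdbc:postgresql://localhost:5432/app";

        Flyway.configure()
                .dataSource(url, "app", "secret")
                // separate folders so a script written for one engine never reaches the other
                .locations(cockroach ? "filesystem:sql/cockroach" : "filesystem:sql/postgres")
                .load()
                .migrate();
    }
}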

Maven: How to create a correct JDBC driver dependency, if I don't know which database the client will use?

I'm currently learning to use maven, I understood how to create a maven project using dependencies from maven repository - and now I have the following question:
If I have an application which uses a database access, for example via Hibernate, then I need to add a dependency representing the corresponding database driver, for example mysql-connector-java for MySql, ojdbc for Oracle and so on.
But what if I want the program to run on a different machine and I don't know what database engine it uses? What is the common way to solve this? Just import all possible drivers as dependencies? Or is there a more elegant way?
Using an ORM (Object-Relational Mapping) tool like Hibernate is the best solution, since you write Java code that Hibernate will interpret and translate into SQL queries.
At some point, you will have to decide which database you are going to use, and then you will have to add the driver.
Another solution can be making configurations for different environments using maven: https://maven.apache.org/guides/mini/guide-building-for-different-environments.html
The problem is not particularly Maven-bound. Whenever you move to a new database, the administrator has to use the correct JDBC driver for that DB, and yes, that requires different jar files.
The thing is, that you don't want to bundle the database jar file with your code. It may already exist (e.g. in the application server) or you may specify a path to drop it during installation.
Assuming you are creating a webapp: if you bundle a war file with Maven, it will include all dependencies inside the war file, so you must specify that the dependency is present during compilation and testing, but not included in any package. The way to do so is by specifying it as provided:
<dependency>
    <groupId>some.db</groupId>
    <artifactId>jdbc-driver</artifactId>
    <scope>provided</scope>
</dependency>
This means that the jar file will exist on the target platform, and hence should not be packaged in any bundle.
So the assumption I make is really, is this a web-app? Or is it standalone? Anyway, hope it helps.
You don't need a dependency at all. What you need is a driver to be available at runtime. It's true that one way to do this is with a dependency, but if you don't know the database you can't really bundle everything in there. You could stick in some of the most common drivers, but then, as said, there would be licensing issues.
If you're talking about a web application, you could just tell the user to get the appropriate driver and configure a JNDI datasource that the software uses. This is/was a standard from back in the days, but it assumes that the application is a webapp and the end user knows how to configure things (although if he doesn't, he probably shouldn't be setting up the system in the first place).
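For reference, the application side of that JNDI approach is just a lookup; the container owns the driver jar and the connection details. The JNDI name below is only a conventional example:

import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class ConfiguredDataSource {
    public Connection open() throws NamingException, SQLException {
        // The server admin defines this datasource (and drops in the driver jar);
        // the application itself never depends on a concrete JDBC driver.
        DataSource ds = (DataSource) new InitialContext().lookup("java:comp/env/jdbc/AppDS");
        return ds.getConnection();
    }
}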
For a standalone program using a local database, you have the easy choice of using an in memory database like H2 and not allowing any other databases. Naturally this case doesn't work for everything, but I'm including it as an example. In any case it would boil down to the same as with a webapp. Have the end user get the correct driver. If they're running a database server and your app, they should be able to find the right driver too. Then you just need to make sure it's included in the runtime classpath, which might be a bit harder.
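A minimal sketch of the embedded-H2 variant, assuming the com.h2database:h2 jar is on the runtime classpath:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class EmbeddedDb {
    public static Connection open() throws SQLException {
        // In-memory database that lives for the lifetime of the JVM
        // (DB_CLOSE_DELAY=-1 keeps it open between connections); no external
        // server and no driver choice left to the end user.
        return DriverManager.getConnection("jdbc:h2:mem:app;DB_CLOSE_DELAY=-1", "sa", "");
    }
}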
The way this is done by SquirrelSQL, for example, is by having the user explicitly select and register the drivers. This of course again means the user needs to understand what he's doing.
I assume that you want everything to happen automagically and you're not too eager to instruct each user/machine admin how to configure system to have your app working. I am afraid it is not possible in the way you might have hoped.
The standalone database solution that Kayaman suggested might be the best solution in your case, but it's hard to say without knowing more.
However, here are some aspects regarding the use of Maven, and possible difficulties, with some notes.
If I have an application which uses a database access, for example via Hibernate, then I need to add a dependency representing the corresponding database driver, for example mysql-connector-java for MySql, ojdbc for Oracle and so on.
Yes. And you would also need to tell Hibernate about this driver, and perhaps other related settings. It is not just a matter of adding a dependency, but also of filtering some properties file or persistence.xml. That might be a job for Maven and some of its plugins, but it would still require knowledge of all the possible DB alternatives and a Maven profile for each of them.
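One way to keep that choice out of the build entirely is to pass the standard JPA JDBC properties in at runtime, so the same artifact can run against whichever engine's driver the administrator has put on the classpath. A sketch; the db.properties file name and the app-unit persistence unit are assumptions:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class JpaBootstrap {
    public static EntityManagerFactory create() throws IOException {
        Properties props = new Properties();
        // db.properties is edited per installation (hypothetical file) and holds the
        // standard keys: javax.persistence.jdbc.driver, .url, .user, .password
        try (InputStream in = Files.newInputStream(Paths.get("db.properties"))) {
            props.load(in);
        }
        // "app-unit" is a hypothetical persistence unit declared in persistence.xml
        return Persistence.createEntityManagerFactory("app-unit", props);
    }
}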
But what if I want the program to run on a different machine and I don't know what database it uses? What is the common way to solve this? What options do I have? Just import all possible drivers as dependencies? Is there a more elegant way?
All programs have dependencies, whether they are related to the DB or not. In a sense, as other answers suggest, this is not a Maven-specific thing (but it is still quite related!). You need to be aware of the requirements of any environment if you really want to develop at the level of JDBC drivers.
This specific question of yours is, I believe, the motivation behind things like:
ODBC
JNDI
NOTE 1: despite the similar naming, ODBC and JDBC work at totally different levels (I mean how JDBC drivers are found, which might actually be the main problem...)
NOTE 2: JNDI is not restricted to DataSources.
However, Maven can be a great help depending on what you need and finally decide to do, although it plays a smaller role if you can use ODBC / JNDI.

where to store/keep configuration data for an Java Enterprise application

What is the best way to store parameters and data for an EE7 application? I have to provide the web applications with information like a member fee or similar data (which may be altered several times in a year). The owner of the application should also have a central place where these data are stored, and an application to change them.
Thanks in advance for any input
Franz
This is one question we are currently struggling with as we re-architect some of our back-end systems here, and I do agree with the comment from @JB Nizet that it should be stored in the database. However, I will try to add some additional considerations and options to help you make the decision that is right for you. The right option will depend on a few factors though.
If you are delivering source code and automation to build and deploy your software, the configuration can be stored in a source code repository (i.e. as YAML or XML) and bundled with your deployable during the build process. This is a bit archaic but certainly a widely adopted practice, and it works well for the most part.
If you are delivering deployable binaries, you have a couple of options.
The first one is to have a predetermined place in the file system where your application will look for an "override" configuration file (e.g. the home directory of the user used to run your application server). This way you can have your binary deployable file completely separate from your configuration, but you will still need to build some sort of automation and version control for that configuration file so that your customer can roll back versions if/when necessary. This can also be one or many configuration files (e.g. separate files for your app server vs. the application itself).
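In Java, that override file usually boils down to layering Properties: bundled defaults first, then the file from the predetermined location on top. A sketch; the resource name and the ~/.myapp/app.properties path are just examples:

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class AppConfig {
    public static Properties load() throws IOException {
        Properties config = new Properties();

        // Defaults bundled inside the deployable (assumed resource name)
        try (InputStream defaults = AppConfig.class.getResourceAsStream("/defaults.properties")) {
            if (defaults != null) {
                config.load(defaults);
            }
        }

        // Environment-specific overrides from the predetermined place in the file system
        Path override = Paths.get(System.getProperty("user.home"), ".myapp", "app.properties");
        if (Files.exists(override)) {
            try (InputStream in = Files.newInputStream(override)) {
                config.load(in);
            }
        }
        return config;
    }
}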
The option we are contemplating currently is having a configuration database where all of our applications can query for their own configuration. This can either be a very simple or complex solution depending on your particular needs - for us these are internal applications and we manage the entire lifecycles ourselves, but we have a need to have a central repository since we have tens of services and applications running with a good number of common configuration keys, and updating these keys independently can be error prone.
We are looking at a few different solutions, but I would certainly not store the configuration in our main database because: 1) I don't think SQL is the best repository for configuration, and 2) I believe we can get better performance from NoSQL databases, which can be critical if you need to load some of those configuration keys for every request.
MongoDB and CouchDB both come to mind as good candidates for storing our configuration keys if you need a clearly defined hierarchy for your options, whereas Redis or Memcached are great options if you just need key-value storage for your configuration (faster than document-based too). We will also likely build a small app to help us configure and version the configuration and push changes to existing/active servers, but we haven't spec'd out all the requirements for that.
There are also some OSS solutions that may work for you, although some of them add too much complexity for what we are trying to achieve at this point. If you are using springframework, take a look at the Spring Cloud Config Project, it is very interesting and worth looking into.
This is a very interesting discussion and I am very willing to continue it if you have more questions on how to achieve distributed configurations. Food for thought, here are some of my personal must haves and nice to haves for our new configuration architecture design:
Global configuration per environment (dev,staging,prod)
App specific configuration per environment (dev,staging,prod)
Auto-discovery (auto environment selection depending on requestor)
Access control and versioning
Ability to push updates live to different services
Roger, thanks a lot. Do you have an example for the "predetermined place in the file system"? Does it make sense to use a singleton which reads the configuration file (using the @Startup annotation) and then provides the configuration data? But this does not support a dynamic solution. Kind regards, Franz

Modular Database Initialization

While working on a modular system architecture for an enterprise application I ran into some problems with database initialization. We have a core library that provides base entities and base configuration. On top of this core, several modules are built. They are pluggable and can have their own entities and configuration. Some characteristics:
Configuration, like system properties, resourcebundles, etc, are all stored in the database.
JPA is used to make the system database independent.
System runs on Java SE
Every module can bring its own tables, but it may also need to populate the core property table or the core resource bundle table. So somehow we need some mechanism to run DDL and DML initialization for the database. Some options:
Create simple SQL scripts. The disadvantage is that they must be database independent, and perhaps this is not the most developer-friendly approach. Unless we can generate them with some DB diff tool?
Use Java classes to initialize via JPQL?
Store configuration in files? This avoids a lot (but not all) of configuration DML.
Use some tool like liquibase?
What would be the best practice for this (or a similar) problem?
Using a database to store all configuration data is the best option. Many products, such as WebSphere Portal or Liferay, use a database to store the configuration data for each portlet or even for themes. Don't forget to include those that are used as part of an SOA and business rules.
Therefore, the use of SQL scripts is also the best choice. However, if you require very specific features of SQL, you may need to create several versions of the same script, one for each database management system.
I am currently in an project that has the same idea of modules that add functionality to a core system.
Generally we are using Maven and multiple src folders, as well as Maven profiles and different builds, to be able to generate a deployable with different modules. (We do not need to push out single modules and install them later on - this might be different in your project. We just build different versions with different modules.)
Anyway, for the DB we are using liquibase. Firstly to manage the DB and the changes done to it. But also (and this might be helpful to you) to include/generate another SQL script that adds tables for the modules.
Each module has its own changeset-file that includes everything that is necessary for that module (also in different versions as the modules evolve through time). These can then be applied or not.
So, I think Liquibase could also be useful in your case (even though its main purpose is to manage DB changes).

multiple java servers and batch programs - XML configuration nightmare

I have an application that consists of approx. 20 java components.
About half of the components are servers and the other half batch programs.
Almost all of them talk directly to an Oracle database (JDBC via some of our infrastructure code jars); the other couple of components talk to some of the servers, which talk to the database.
Anyway, each component is configured with numerous XML configuration files.
These are becoming almost impossible to maintain.
Some of the configuration is specific to a component others are similar (database URLs, connectors etc)
What is worse is that the application is not installed in that many environments - in fact only about 10 (qa, dev, production etc).
But the people who own these environments don't seem able to maintain the configs correctly.
In particular whenever there is an upgrade there is invariably configuration errors.
I have even started checking in some of the environments configurations into SVN along with the code.
I tried an XML schema validator at one point (it consisted of defining the valid XML in .xsd files and then throwing an error if the schema rules were breached, but that didn't work).
I'm thinking I am missing something basic here - perhaps there is a tool to manage this or perhaps I should be storing the configuration in the database.
The application was largely designed by a colleague, but I feel myself that it's overly configurable - in fact much of the config actually refers to classes - i.e. one can choose handlers and parsers etc. - the XML config almost looks like code.
Any advice greatly appreciated
Peter
Substituting XML for code is usually a bad idea; things that are declarative are probably OK, but things that are procedural probably aren't.
If all that configuration was defined in Java code, a lot of the upgrade issues would turn into compilation issues. The compiler would pick them out for you, and you could correct them.
So you've got a multi-part problem. You need to rationalize your configuration information into a set of partitions (per-component, per-installation, global). You need to try to verify configuration information at compile-time, where possible. And you need to write validation for the loaded configurations, to sanity check them.
To the extent possible, shift the relatively static configuration into Guice (at least, that's what I prefer). A lot of things happen in a nice, type-safe way with it.
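For the static part, a small Guice module can turn those values into injected constants, so a missing or misspelled binding fails fast when the injector is created rather than deep inside a batch run. A rough sketch; the property names and values are illustrative:

import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Inject;
import com.google.inject.Injector;
import com.google.inject.name.Named;
import com.google.inject.name.Names;

public class ConfigModule extends AbstractModule {
    @Override
    protected void configure() {
        // Static, per-installation values bound once; components declare what they need.
        bindConstant().annotatedWith(Names.named("jdbc.url"))
                .to("jdbc:oracle:thin:@dbhost:1521:APP");   // illustrative value
        bindConstant().annotatedWith(Names.named("batch.threads")).to(4);
    }

    static class ReportJob {
        final String jdbcUrl;

        @Inject
        ReportJob(@Named("jdbc.url") String jdbcUrl) {
            this.jdbcUrl = jdbcUrl;
        }
    }

    public static void main(String[] args) {
        Injector injector = Guice.createInjector(new ConfigModule());
        ReportJob job = injector.getInstance(ReportJob.class);
        System.out.println(job.jdbcUrl);
    }
}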
Consider running a WebDAV server for each instance of the app, and storing configuration into it. Each can hit a simple URL to pull the current versions of the configuration files.
Or, stand up a lightweight XML database like BaseX with its REST capability, then store and load your configuration information there. Use JSLP or something like it to have your components find the central configuration repository.
An additional advantage to using an XML DB is that you'll be able to do a lot of sanity checking and updating by querying across the set of all configuration files. For example, if a given instance of the application should have the same JDBC parameters in each configuration file, a simple xquery will tell you if that's true.
If you don't have the ability to modify the applications that are pulling the configuration file (the config file format is fixed), then consider writing query servlets for the XML database that assemble the required configuration information from nested blocks or templates. That will allow you to figure out what's common between the configuration files and dynamically generate parameterized versions of those blocks.
Sounds like the key here is making incremental improvements. Allow the old way to configure, but have the configuration load look for a central config source first.
I don't think that the syntax of the configuration files is at the heart of the problem: using Java properties files instead of XML would leave you with exactly the same issues. There may be an issue that the configuration information is too dispersed - it's hard to tell. The main issue seems to be that the whole thing is too fragile - the application is too dependent on manual configuration, and it seems that the configuration for each environment needs to be different. You should try to focus on reducing the number of configuration parameters that need to be set to make the system work (without necessarily reducing the options available for diagnostics etc. for use when they are really needed), on having intelligent defaults, and on self-configuration. Perhaps even invest in creating an installation wizard.
As you have some Oracle databases handy, why not store your configuration in there?
Then you only need one or two configuration parameters to point to an Oracle database suitable for that environment and download the rest of the configuration from the database.
The contents of the configuration table should be pretty static for any given environment so there should be no need to amend anything except the jdbc connection when you migrate your software through its life cycle.
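A minimal sketch of that bootstrap: the only thing each component needs locally is the JDBC connection information, and everything else comes out of one shared table. The table and column names are made up for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

public class DbConfigLoader {
    public static Map<String, String> load(String url, String user, String password)
            throws SQLException {
        Map<String, String> config = new HashMap<>();
        // APP_CONFIG(config_key, config_value) is a hypothetical table with one row
        // per setting, populated per environment.
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT config_key, config_value FROM app_config")) {
            while (rs.next()) {
                config.put(rs.getString(1), rs.getString(2));
            }
        }
        return config;
    }
}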
