When to use schema.sql in spring boot app - java

I am trying to understand when to use schema.sql db creation technique and when to rely on Spring boot's creation based on my entity classes. How to decide?

Let's leave for the moment the schema.sql out.
ORM automatic schema creation (create or update or create-drop) is normally useful during the development of application. Even during release candidates and QA reviews it is still useful because the changes which happen from development team with issues arising could be live much faster.
When the application reaches a critical phase and is mature in production, then any changes happening automatically from ORM could be considered dangerous.
At this stage you would normally in big companies rollout in production database only some sql scripts that affect the database, which should be reviewed first and tested before rollout happens. Also a rollback sql might be necessary.
So normally ORM having effect in database schema is only during early stages of an application and not when it has matured enough in production.
Let's now come back to schema.sql. This file can be used just once to create the database from some sql commands. This also would not be expected to be run anytime the application executes. At least not in majority of applications.
I think a logical approach would be to use the ORM during intial stages of development and QA and then when you are about to reach a mature phase, you inspect what type of database the ORM has created and then you make a manual review just to be sure of everything and any optimizations which would make sense and at this stage you can create with the already existing schema of ORM your own permanent schema.sql.
The obvious reason for the above is that with schema.sql you have 100% control of how the database is created. Using ORM you depend on the ORM provider to build this for you. This ORM provider might provide some new library that affects how the previous was used and many other things. The result is that with ORM you don't have 100% control on the database to be created.

Related

JPA/Hibernate - how to generate ddl for some classes but not others

Is it possible to turn off DDL generation on the class level vs the full application?
I have a reporting app that we have thus far had the setting in application.properties
jpa.generateDdl=false
We effectively have several different views in this app that populate java objects annotated with #Entity. This works great.
But now we want to add other objects which we DO want to persist.
If I switch on jpa.generateDdl=true, then it will generate tables for these views, which we want to avoid. Is there a way to prevent this from happening, while still allowing other objects to persist?
No that's not possible.
As written in the Hibernate documentation you shouldn't relay on the feature for production applications.
Although the automatic schema generation is very useful for testing
and prototyping purposes, in a production environment, it’s much more
flexible to manage the schema using incremental migration scripts.
https://docs.jboss.org/hibernate/orm/current/userguide/html_single/Hibernate_User_Guide.html#schema-generation
I would recommend to use Liquibase or Flyway to use for database migrations.

Versioning system for database

We have a java application which uses spring boot and hibernate.
There are many changes on entities and fields. That's why, I want to follow changes and rollback mechanism. So that, I need a version control system over database. I checked flyway and liquibase, but I think those don't solve my problem. Because my table creations and updates are handled by hibernate.
Is there any way to see which queries are executed by hibernate to change the database and which changes have occurred since the latest database change (I mean new table, column creation or refactoring)?
One way to do it (how we do it):
Use 2 databases. A reference database and a development database.
On the development database use hibernate to let it create the strucutre.
Once a development cycle is done you run liquibase diffChangelog on the reference database. It will create a changelog.xml with all changes that have been done by hibernate on the development db. Manually correct it (names, etc).
When your happy with changelog file and the development cylce is done apply the changelog onto the reference database.
Start your next development cycle and repeat.
That way you can combine the advantages of letting hibernate generate the schema and still use liquibase to have a versioned DB-Schema that is re-creatable.
Use those tools as those are dedicated for this purpose. Personally, I like Liquibase more but it's your choice.
Hibernate's schema creation mechanism should not be used in production as then your Java description of the entity would drive the creation of the tables resulting in an inefficient structure.
That feature is only there for testing purposes.

Hibernate + MySQL Best practices for reporting data

I am creating a webapp in Spring Boot (Spring + Hibernate + MySQL).
I have already created all the CRUD operations for the data of my app, and now I need to process the data and create reports.
As per the complexity of these reports, I will create some summary or pre proccesed tables. This way, I can trigger the reports creation once, and then get them efficiently.
My doubt is if I should build all the reports in Java or in Stored Procedures in MySQL.
Pros of doing it in Java:
More logging
More control of the structures (entities, maps, list, etc)
Catching exceptions
If I change my db engine (it would not happen, but never know)
Cons of doing it in Java:
Maybe memory?
Any thoughts on this?
Thanks!
Java. Though both are possible. It depends on what is most important and what skills are available for maintenance and the price of maintaining. Stored procedures are usually very fast, but availability and performance also depends on what exact database you use. You will need special skills, and then you have it all working on that specific database.
Hibernate does come with a special dialect written for every database to get the best performance out of the persistence layer. It’s not that fast as a stored procedure, but it comes pretty close. With Spring Data on top of that, all difficulty is gone. Maintenance will not cost that much and people who know Spring Data are more available than any special database vendor.
You can still create various “difficult” queries easily with HQL, so no block there. But Hibernate comes with more possibilities. You can have your caching done by eh-cache and with Hibernate envers you will have your audit done in no time. That’s the nice thing about this framework. It’s widely used and many free to use maven dependencies are there for the taking. And if in future you want to change your database, you can do it by changing like 3 parameters in your application.properties file when using Spring Data.
You can play with some annotations and see what performs better. For example you have the #Inheritance annotation where you can have some classes end up in the same table or split it to more tables. Also you have the #MappedSuperclass where you can have one JpaObject with the id which all your entities can extend. If you want some more tricks on JPA, maybe check this post with my answer on how to use a superclass and a general repository.
As per the complexity of these reports, I will create some summary or
pre proccesed tables. This way, I can trigger the reports creation
once, and then get them efficiently.
My first thought is, is this required? It seems like adding complexity to the application that perhaps isn't needed. Premature optimisation and all that. Try writing the reports in SQL and running an execution plan. If it's good enough, you have less code to maintain and no added batch jobs to administer. Consider load testing using E.G. jmeter or gatling to see how it holds up under stress.
Consider using querydsl or jooq for reporting. Both provide a database abstraction layer and fluent API for querying databases, which deliver the benefits listed in the "Pros of doing it in Java" section of the question and may be more suited to the problem. This blog post jOOQ vs. Hibernate: When to Choose Which is well worth a read.

java library to maintain database structure

My application is always developing, so occasionally - when the version upgrades - some tables need to be created/altered/deleted, some data modified, etc. Generally some sql code needs to be executed.
Is there a Java library that can be used to keep my database structure up to date (by analyzing something like "db structure version" information and executing custom sql to code to update from one version to another)?
Also it would be great to have some basic actions (like add/remove column) ready to use with minimal configuration, ie name/type and no sql code.
Try DBDeploy. Although I haven't used it in the past, it sounds like this project would help in your case. DBDeploy is a database refactoring manager that:
"Automates the process of establishing
which database refactorings need to be
run against a specific database in
order to migrate it to a particular
build."
It is known to integrate with both Ant and Maven.
Try Liquibase.
Liquibase is an open source (Apache
2.0 Licensed), database-independent library for tracking, managing and
applying database changes. It is built
on a simple premise: All database
changes are stored in a human readable
yet trackable form and checked into
source control.
Supported features:
Extensibility
Merging changes from multiple developers
Code branches
Multiple Databases
Managing production data as well as various test datasets
Cluster-safe database upgrades
Automated updates or generation of SQL scripts that can be approved and
applied by a DBA
Update rollbacks
Database ”diff“s
Generating starting change logs from existing databases
Generating database change documentation
We use a piece of software called Liquibase for this. It's very flexible and you can set it up pretty much however you want it. We have it integrated with Maven so our database is always up to date.
You can also check Flyway (400 questions tagged on SOW) or mybatis (1049 questions tagged). To add to the comparison the other options mentioned: Liquibase (663 questions tagged) and DBDeploy (24 questions tagged).
Another resource that you can find useful is the feature comparison in the Flyway website (There are other related projects mentioned there).
You should take a look into OR Mapping libraries, e.g. Hibernate
Most ORM mappers have logic to do schema upgrades for you, I have successfully used Hibernate which gets at least the basic stuff right automatically.

Hibernate/JPA DB Schema Generation Best Practices

I just wanted to hear the opinion of Hibernate experts about DB schema generation best practices for Hibernate/JPA based projects. Especially:
What strategy to use when the project has just started? Is it recommended to let Hibernate automatically generate the schema in this phase or is it better to create the database tables manually from earliest phases of the project?
Pretending that throughout the project the schema was being generated using Hibernate, is it better to disable automatic schema generation and manually create the database schema just before the system is released into production?
And after the system has been released into production, what is the best practice for maintaining the entity classes and the DB schema (e.g. adding/renaming/updating columns, renaming tables, etc.)?
It's always recommended to generate the schema manually, preferably by a tool supporting database schema revisions, such as the great Liquibase. Generating the schema from the entities is great in theory, but were fragile in practice and causes lots of problems in the long run(trust me on this).
In productions it's always best to have manually generated and review the schema.
You make an update to an entity and create a matching update script(revision) to update your database schema to reflect the entity change. You can create a custom solution(I've written a few) or use something more popular like liquibase(it even supports schema changes rollbacks). If you're using a build tool such as maven or ant - it's recommend to plug the db schema update util into the build process so that fresh builds stay in sync with the schema.
Although disputable, I'd say that the answer to all 3 questions is: let hibernate automatically generate the tables in the schema.
I haven't had any problems with that so far. You might need to clean some field up manually from time to time, but this is no headache compared to separately keeping track of DDL scripts - i.e. managing their revisions and synchronizing them with entity changes (and vice-versa)
For deploying on production - an obvious tip - first make sure everything is generated OK on the test environment and then deploy on production.
Manually, because:
Same database may be used by different applications and not all of
them would be using hibernate or even java. Database schema should
not be dictated by ORM, it should be designed around the data and
business requirements.
The datatypes chosen by hibernate might not be best suited for the application.
As mentioned in an earlier comment, changes to the entities would require manual intervention if data loss is not acceptable.
Things such as additional properties (generic term not java
properties) on join tables work wonderfully in RDBMS but are
somewhat complex and inefficient to use in an ORM. Doing such a
mapping from ORM -> RDBMS might create tables that are not
efficient. In theory, it is possible to build the exact same join
table using hibernate generated code, but it would require some
special care while writing the Entities.
I would use automatic generation for standalone applications or databases that are accessed via the same ORM layer and also if the app needs to be ported to different databases. It would save lot of time in by not requiring one to write and maintain DB vendor specific DDL scripts.
Like Bozhidar said, don´t let Hibernate create&update the database schema.
Let your application create and update the database schema.
For java the best tool to do this is Flyway. You need to create one or more SQL files with DDL statements which are describing your database schema. These SQL files are then executed by Flyway. For more information look at the site of Flyway.
I believe that a lot of what is being discussed or argued here should also be related to if you are more confortable with the code-first or the database-first approach.
Personally, I am more intended to go for latter and, making a reference to Single Responsibility Principle (SRP), I prefer having DB specialist handling the DB and an application specialist handling the application, than having the application handling the DB. Additionally, I am of the opinion that taking too many shortcuts will work fine at the beginning but create unmanageable problems as things grow/evolve.

Categories