I'm using an H2 database to store my data, and Liquibase (with the Hibernate plugin) to check for differences between the database and the project.
Suppose I have the following code:
@Entity
public class myEntity {
    @Column(name = "val")
    private int value;
}
The database is in place and already stores some data.
Now when I rename the above column, i.e. from val to value, and run liquibase:diff, the diff log says to drop the column "val" and add a column "value".
Obviously this is not what I wanted, because all the data originally stored in the "val" column would be gone.
Is there a way to tell Liquibase that it's not a new column, but an old renamed one?
I want to run liquibase:diff and have the generated diff log automatically contain a renameColumn tag for my column, not an addColumn and a dropColumn one.
Did you try using a changeset as follows (or did I get the question wrong)?
<changeSet author="liquibase-docs" id="renameColumn-example">
    <renameColumn columnDataType="int"
                  newColumnName="value"
                  oldColumnName="val"
                  remarks="A change in names"
                  schemaName="public"
                  tableName="myEntity"/>
</changeSet>
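If you add this changeset to your changelog by hand instead of relying on liquibase:diff to generate it, running the Maven plugin's update goal (mvn liquibase:update) should apply the rename while preserving the data in the column, assuming the changeset has not been run before.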
There is currently no way in general for a diff to be able to detect that the column change was a rename rather than a drop and create. This is true for any system that creates changes using diffs, not just Liquibase.
Imagine yourself as Liquibase: you are given two tables that are in the states that you describe. How would you determine that a column was renamed, vs. one column dropped and another column created? The only thing I can think of is that you would need to look at the contents of the columns and see that they were 'mostly the same'. In this particular case, that is impossible because the databases being compared include one that is populated with data and a second that is just an empty in-memory database created by Hibernate.
Related
I am using a Cassandra database integrated into a Spring Boot application.
My question is about schema actions. If I need to make structural changes to the DB, say add a column to a table, the database needs to be recreated; however, this means all the existing data gets deleted:
schema-action: CREATE_IF_NOT_EXISTS
The only way I have managed to solve this is by using the RECREATE schema action, but as mentioned earlier, this results in data loss.
What would be the best approach to handle this? How can I make structural changes, such as adding a column, without having to recreate the database and lose all existing data?
Thanks
Cassandra does allow you to modify the schema of an existing table without recreating it from scratch, using the ALTER TABLE statement via cqlsh. However, as explained in that link, there are some important limitations on the kinds of changes you can make: you cannot modify the primary key of the table at all, you can add or delete regular columns, and you can't change the type of a column to an incompatible one.
The reason for most of these limitations is how Cassandra needs to deal with the old data that already exists in the table. For example, it doesn't make sense to say that a column A that until now contained strings will now contain integers: how are we supposed to handle all the old values in column A which weren't integers?
As Aaron rightly said in a comment, it is unlikely you'll want to do these schema changes as part of your application. These are usually rare operations which are done manually, or via some management application, not your usual application.
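That said, if you ever do need to issue such a change programmatically, e.g. from a one-off management task, it boils down to executing the CQL statement through the driver. A minimal sketch, assuming the DataStax Java driver 4.x; the keyspace, table and column names are just examples:

import com.datastax.oss.driver.api.core.CqlSession;

// Connects to a local node with driver defaults; real code would configure
// contact points and credentials.
try (CqlSession session = CqlSession.builder().build()) {
    // Adds a regular column in place; existing rows simply report null
    // for it until they are updated.
    session.execute("ALTER TABLE my_keyspace.users ADD last_login timestamp");
}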
I am working on a Spring project with data already running in production. I am using annotation configuration for my entities. I want to add new data types and modify some already existing types. How can I do that smoothly, without the need to manually export the data and create an import script for the new schema?
You can add tables, add columns to existing tables, and switch the data types of some columns, but you should take care of these points:
Add a new column: if the column is not nullable, you have to either enable the NOT NULL constraint after the data inserts or use a default dummy value. You can also remove the DEFAULT clause after good values have been inserted.
Switch a type: if your DB provider is capable of doing the cast directly, everything is OK, but if you get an error, you should check your DB provider's documentation to see whether you can provide a hint for casting. For instance, Postgres provides the USING keyword for that.
Last thing: if you try to change the type of a column serving as an FK, you should drop the FK, switch the type of both columns, and recreate the FK.
Finally, I would advise you to use a database migration tool, such as Flyway, to handle these changes; a sketch of what such a migration might look like follows below.
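A rough sketch of a Flyway Java-based migration covering the points above (Postgres syntax; the table and column names are examples, not from your schema):

import java.sql.Statement;
import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

// Flyway runs this once and records it in its history table.
public class V2__Alter_orders extends BaseJavaMigration {
    @Override
    public void migrate(Context context) throws Exception {
        try (Statement stmt = context.getConnection().createStatement()) {
            // Add the column as nullable, backfill a dummy value, then
            // enable the NOT NULL constraint (the point about new columns).
            stmt.execute("ALTER TABLE orders ADD COLUMN status VARCHAR(20)");
            stmt.execute("UPDATE orders SET status = 'LEGACY'");
            stmt.execute("ALTER TABLE orders ALTER COLUMN status SET NOT NULL");
            // Switch a type with an explicit casting hint (Postgres USING).
            stmt.execute("ALTER TABLE orders ALTER COLUMN amount TYPE integer"
                    + " USING amount::integer");
        }
    }
}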
I am trying to track changes made to a database (schema) using a Java app. We are trying to track changes for each column, unique constraint, index, and table.
Functionally, I know table.column is unique. So, if the datatype of a column changes, we know which column to find and record the change. But what if the name changes? If JDBC's result set is ordered (it asks for an index), then I can rely on the order to give me the same column every time, even if the name changes. Will there be any surprises here, since it is a result 'set'?
However, I learnt that we can change the order of the columns as well. Isn't there any unique ID associated with the columns so that they can be picked up on that basis?
I would mostly prefer not to use the information_schema route, but even though I checked there for MySQL, I found nothing useful.
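For reference, the kind of metadata lookup I am talking about looks roughly like this (a sketch; the connection details and table name are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ColumnLister {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb", "user", "pass");
             ResultSet cols = conn.getMetaData()
                     .getColumns(null, null, "my_table", "%")) {
            while (cols.next()) {
                // ORDINAL_POSITION is the column's position, but it is not a
                // stable identifier: it shifts when columns are reordered.
                System.out.printf("%d: %s (%s)%n",
                        cols.getInt("ORDINAL_POSITION"),
                        cols.getString("COLUMN_NAME"),
                        cols.getString("TYPE_NAME"));
            }
        }
    }
}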
Just started out with my hobby project and now I am here to get help with making the correct database design/query. I have made a simple Java program that loops through the content of a folder. I want to save this content to a MySQL database, so I added a connector to my database in Java and created a table with the columns "file", "path", "id" and "date" in MySQL.
So now to the important/fun thing: every time I want to add the filenames to MySQL in Java, I do this (when the GUI button is pressed, I call a method that does the following):
DELETE all entries with the same file path; this is to ensure that I get new entries which are exactly the same as the content in the path.
Java loop: INSERT the file info into the columns id, path, filename and date (the date the file was added to the database).
In this way I can always ensure that the filenames added to the database are up to date. It doesn't matter if I rename a file or remove it; the table stays current since its entries are deleted and the new info is written. Old info -> DELETE old info -> INSERT new info -> up to date.
I know this is probably not the best solution, but it works. Now I am stuck on the next thing I want to do: I want to record the difference between the files, in order to know which files have been added and deleted between two inserts. And here is my problem: since the entries are deleted before a new INSERT, I cannot compare. How would you change the design or the solution? All ideas are welcome, and since I am so fresh I would really appreciate it if you could show me how the query could look.
Do not remove all rows first. Remove only the ones that are actually gone (or even better, just mark them "inactive", as I suggest below). Query your DB first to see what was there last time.
I would maintain an additional column in your table called "inactive". It will be FALSE by default, and TRUE for removed files. Please keep in mind that as your file is uniquely identified by file+path+id, renaming a file is indeed an operation of deleting the old one and creating a new one.
Removing things from the DB is not a good idea, as you might always remove something by accident (a bug in the code) and would not be able to get the data back.
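A rough JDBC sketch of that flow (the table and column names are assumptions based on your description; it also presumes a unique key on (file, path) so the upsert can match rows):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// url, user, pass, path and filesOnDisk are assumed to be defined elsewhere.
try (Connection conn = DriverManager.getConnection(url, user, pass)) {
    // 1. Mark everything under the path inactive...
    try (PreparedStatement mark = conn.prepareStatement(
            "UPDATE files SET inactive = TRUE WHERE path = ?")) {
        mark.setString(1, path);
        mark.executeUpdate();
    }
    // 2. ...then re-activate or insert each file actually found on disk.
    try (PreparedStatement upsert = conn.prepareStatement(
            "INSERT INTO files (file, path, date, inactive)"
                    + " VALUES (?, ?, NOW(), FALSE)"
                    + " ON DUPLICATE KEY UPDATE inactive = FALSE")) {
        for (String file : filesOnDisk) {
            upsert.setString(1, file);
            upsert.setString(2, path);
            upsert.executeUpdate();
        }
    }
}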
An additional thing to do is to add a hash column to your table. This way you will be able to check whether the file really changed. There is no need to re-add the file to the DB if it has not changed. See Getting a File's MD5 Checksum in Java for more info; a sketch follows below.
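A minimal sketch of such a hash computation, using only the standard java.security API:

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

static String md5Of(Path file) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5");
    try (InputStream in = Files.newInputStream(file)) {
        byte[] buf = new byte[8192];
        for (int n; (n = in.read(buf)) != -1; ) {
            md.update(buf, 0, n);
        }
    }
    // Format the digest as a hex string you can store in the new column.
    StringBuilder hex = new StringBuilder();
    for (byte b : md.digest()) {
        hex.append(String.format("%02x", b));
    }
    return hex.toString();
}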
One way to achieve this is to implement auditing of your table. A common approach is to create a copy of the table where you are storing the folder contents and name that table using a convention to indicate it stores audit information (e.g. a _AUD suffix). You then add extra columns to the AUD table, like "REV" (revision) and "REV_TYPE" (inserted, deleted, modified). Whenever you insert, update or delete any rows in your main table, you insert a row into the AUD table describing what you've done. Then you can find the operations associated with each revision by looking them up in the AUD table. A Java framework that provides this feature is Hibernate Envers (http://hibernate.org/orm/envers/).
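For completeness, a rough sketch of what the entity might look like with Envers (the entity and field names are illustrative; recent Hibernate versions use jakarta.persistence, older ones javax.persistence):

import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import org.hibernate.envers.Audited;

// Annotating the entity is enough for Envers to maintain a FolderContent_AUD
// table (with REV and REVTYPE columns by default) on insert/update/delete.
@Entity
@Audited
public class FolderContent {
    @Id
    @GeneratedValue
    private Long id;

    private String path;
    private String filename;
}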
Here's the case: I am creating a batch script that runs daily, parsing log files and exporting the data to a database. The format of such a file is basically
std_prop1;std_prop2;std_prop3;[opt_prop1;[opt_prop2;[opt_prop3;[..]]]]
The standard properties map to a table with a column for each property, where each line in the log file basically maps to a corresponding row. It might look like LOGDATA(id,timestamp,systemId,methodName,callLength). Since we should be able to log as many optional properties as we like, we cannot map them to the same table, since that would mean adding a column to the table every time a new property was introduced. Not to mention the number of NULL values ...
So the additional properties go in another table, say EXTRA_PROPS(logdata_foreign_key,propname,value). In reality, most of the optional properties are the same (e.g. OS version, app container, etc.), making it somewhat wasteful to log, for instance, 4 rows in EXTRA_PROPS for each row in LOGDATA (in the case that one on average has 4 extra properties). So what I would like my batch job to do is
for each additionalProperty in logRow:
    see if additionalProperty already exists
    if exists:
        create a reference to it in a reference table
    if not:
        add the property to the extra properties table
        create a reference to it in a reference table
I would then probably have three slightly different tables:
LOGDATA(id,timestamp,systemId,methodName,callLength)
EXTRA_PROPS(id,propname,value)
LOGDATA_HAS_EXTRA_PROPS(logid,extra_prop_id)
I am not 100% sure this is a better way of doing it; I would still create N rows in the LOGDATA_HAS_EXTRA_PROPS table for N properties, but at least I would not add any new rows to EXTRA_PROPS.
Even if this might not be the best way (what is?), I am still wondering about the technical side: how would I implement this using Hibernate? It does not have to be superfast, but it would need to chew through 100K+ rows.
Firstly, I would not recommend using Hibernate for this type of logic. Hibernate is a great product, but this kind of high-load data operation may not be its strongest point.
From a data modeling standpoint, it appears to me that (propname,value) is actually the primary key of EXTRA_PROPS. Basically, you want to express the logic that, for example, a given hostname + foo.bar.com combination will only appear once in the table. Am I right? That would be the PK, so you will need to use it in LOGDATA_HAS_EXTRA_PROPS. Using the name alone will not be sufficient for the reference.
In Hibernate (if you choose to use it), that can be expressed via a composite key using @EmbeddedId or @Embeddable on the object mapped to EXTRA_PROPS. Then you can have a many-to-many relationship that uses LOGDATA_HAS_EXTRA_PROPS as the association table.
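A rough sketch of that mapping (assuming jakarta.persistence and the table/column names from your question; the id class must define equals/hashCode over both fields):

import jakarta.persistence.*;
import java.io.Serializable;
import java.util.Objects;
import java.util.Set;

// Composite key: (propname, value) uniquely identifies an extra property.
@Embeddable
class ExtraPropId implements Serializable {
    private String propname;
    private String value;

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ExtraPropId)) return false;
        ExtraPropId other = (ExtraPropId) o;
        return Objects.equals(propname, other.propname)
                && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(propname, value);
    }
}

@Entity
@Table(name = "EXTRA_PROPS")
class ExtraProp {
    @EmbeddedId
    private ExtraPropId id;
}

@Entity
@Table(name = "LOGDATA")
class LogData {
    @Id
    @GeneratedValue
    private Long id;

    // LOGDATA_HAS_EXTRA_PROPS is the association table; both key columns of
    // EXTRA_PROPS are needed to reference a property, not the name alone.
    @ManyToMany
    @JoinTable(name = "LOGDATA_HAS_EXTRA_PROPS",
            joinColumns = @JoinColumn(name = "logid"),
            inverseJoinColumns = {
                    @JoinColumn(name = "propname", referencedColumnName = "propname"),
                    @JoinColumn(name = "value", referencedColumnName = "value")})
    private Set<ExtraProp> extraProps;
}

For the 100K+ rows you mention, enabling JDBC batching (or dropping down to plain JDBC for the load itself, as suggested above) is worth considering.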