Currently there are schema actions that let you recreate tables on each startup, but dropping them obviously means you lose all rows of that table.
In CQL you can make a query like
CREATE TABLE IF NOT EXISTS keyspace.tablename(....)
But I can't find any way of achieving a similar result with spring-data-casssandra, one that would let me start my app for the first time and on without changing anything.
Is there any way to create a table defined in a POJO with #Table ONLY if said table does not already exist?
See DATACASS-219.
I just recently added support for CREATE TABLE IF NOT EXISTS keyspace.tablename (..);. This will be available in SD Cassandra 1.5 M1 (Ingals). I'll consider backing porting this to 1.4 for the 1.4.2.RELEASE.
The only other way to accomplish this for the time being (if not using the 1.5.0.BUILD-SNAPSHOT containing the DATACASS-219 fix) is to set your SchemaAction to NONE and provide your own raw CQL, initialization scripts to the CassandraSessionFactoryBean using the setStartupScripts(:List) method.
Related
I am using Cassandra database integrated into a spring boot application.
My Question is around the schema actions. If I need to make structural changes to the DB, say add a column to a table, the database needs to be recreated, however this means all the existing data gets deleted:
schema-action: CREATE_IF_NOT_EXISTS
The only way I have managed to solve this is by using the RECREATE scheme action, but as mentioned earlier, this results in data-loss.
What would be the best approach to handle this? To add structural changes such as a column name with out having to recreate the database and lose all existing data?
Thanks
Cassandra does allow you to modify the schema of an existing table without recreating it from scratch, using the ALTER TABLE statement via cqlsh. However, as explained in that link, there are some important limitations on the kind of changes you can do. You cannot modify the primary key of the table at all, you can add or delete regular columns, and you can't change the type of a column to a non-compatible one.
The reason for most of these limitations is how Cassandra needs to deal with the old data that already exists in the table. For example, it doesn't make sense to say that a column A that until now contained strings - will now contain integers - how are we supposed to handle all the old values in column A which weren't integers?
As Aaron rightly said in a comment, it is unlikely you'll want to do these schema changes as part of your application. These are usually rare operations which are done manually, or via some management application - not your usual application.
i have some large data in one table and small data in other table,is there any way to run initial load of golden gate so that same data in both tables wont be changed and rest of the data got transferred from one table to other.
Initial loads are typically for when you are setting up the replication environment; however, you can do this as well on single tables. Everything in the Oracle database is driven by System Change Numbers/Change System Numbers (SCN/CSN).
By using the SCN/CSN, you can identify what the starting point in the table should be and start CDC from there. Any prior to the SCN/CSN will not get captured and would require you to manually move that data in some fashion. That can be done by using Oracle Data Pump (Export/Import).
Oracle GoldenGate also provided a parameter called SQLPredicate that allows you to use a "where" clause against a table. This is handy with initial load extracts because you would do something like TABLE ., SQLPredicate "as of ". Then data before that would be captured and moved to the target side for a replicat to apply into a table. You can reference that here:
https://www.dbasolved.com/2018/05/loading-tables-with-oracle-goldengate-and-rest-apis/
Official Oracle Doc: https://docs.oracle.com/en/middleware/goldengate/core/19.1/admin/loading-data-file-replicat-ma-19.1.html
On the replicat side, you would use HANDLECOLLISIONS to kick out any ducplicates. Then once the load is complete, remove it from the parameter file.
Lots of details, but I'm sure this is a good starting point for you.
That would require programming in java.
1) First you would read your database
2) Decide which data has to be added in which table on the basis of data that was read.
3) Execute update/ data entry queries to submit data to tables.
If you want to run Initial Load using GoldenGate:
Target tables should be empty
Data: Make certain that the target tables are empty. Otherwise, there
may be duplicate-row errors or conflicts between existing rows and
rows that are being loaded. Link to Oracle Documentations
If not empty, you have to treat conflicts. For instance if the row you are inserting already exists in the target table (INSERTROWEXISTS) you should discard it, if that's what you want to do. Link to Oracle Documentation
I am trying to Auto create SQL tables on first run, and haven't found any good tutorial on google. Does anyone have a suggestion or can explain it to me. My class so far looks like: http://pastebin.com/6u8yFWrt
In Mysql you can do:
CREATE TABLE tablename IF NOT EXISTS...
This will, as the name says, create the table if there is no table with the same name, described here. If you run it whenever you open the connection to the database, you should be fine.
BUT this is no guarantee that the existing table has the format you want! If you change the definition of the table halfway through the process, because you want an extra column, you will need to delete the existing table fist, for the changes in the query to have an effect.
I think you should use JPA and Hibernate instead. Youre probably in for quite a journey until you get the hold of it but I think it is worth the effort.
Here's the case: I am creating a batch script that runs daily, parsing logfiles and exporting the data to a database. The format of this file is basically
std_prop1;std_prop2;std_prop3;[opt_prop1;[opt_prop2;[opt_prop3;[..]]]
The standard properties map to a table with a column for each property, where each line in the logfile basically maps to a corresponding row. It might look like LOGDATA(id,timestamp,systemId,methodName,callLenght). Since we should be able to log as many optional properties as we like, we cannot map them to the same table, since that would mean adding a row the table every time a new property was introduced. Not to think of the number of NULL references ...
So the additional properties go in another table, say EXTRA_PROPS(logdata_foreign_key,propname,value). In reality, most of the optional properties are the same (e.g. os version, app container, etc), making it somewhat wasteful to log for instance 4 rows in EXTRA_PROPS for each row in LOGDATA (in the case that one on average had 4 extra properties). So what I would like my batch job to do is
for each additionalProperty in logRow:
see if additionalProperty already exist
if exists:
create a reference to it in a reference table
if not:
add the property to the extra properties table
create a reference to it in a reference table
I would then probably have three slightly different tables:
LOGDATA(id,timestamp,systemId,methodName,callLenght)
EXTRA_PROPS(id,propname,value)
LOGDATA_HAS_EXTRA_PROPS(logid,extra_prop_id)
I am not 100% this is a better way of doing it, I would still create N rows in the LOGDATA_HAS_EXTRA_PROPS table for N properties, but at least I would not add any new rows to EXTRA_PROPS.
Even if this might not be the best way (what is?), I am still wondering about the tecnhical side: How would I implement this using Hibernate? It does not have to be superfast, but it would need to chew through 100K+ rows.
Firstly, I would not recommend using Hibernate for this type of logic. Hibernate is a great product but doing this kind of high load data operations may not be it's strongest point.
From data modeling standpoint, it appears to me that (propname,value) is actually a primary key in EXTRA_PROPS. Basically, you want to express the logic that, for example, hostname + foo.bar.com combination will only appear once in the table. Am I right? That would be PK. So you will need to use that in LOGDATA_HAS_EXTRA_PROPS. Using name alone will not be sufficient for reference.
In Hibernate (if you choose to use it), that can be expressed via composite key using #EmbeddedId or Embeddable on object mapped to EXTRA_PROPS. And then you can have many to many relationship that uses LOGDATA_HAS_EXTRA_PROPS as association table.
I need the sample program in Java for keeping the history of table if user inserted, updated and deleted on that table. Can anybody help in this?
Thanks in advance.
If you are working with Hibernate you can use Envers to solve this problem.
You have two options for this:
Let the database handle this automatically using triggers. I don't know what database you're using but all of them support triggers that you can use for this.
Write code in your program that does something similar when inserting, updating and deleting a user.
Personally, I prefer the first option. It probably requires less maintenance. There may be multiple places where you update a user, all those places need the code to update the other table. Besides, in the database you have more options for specifying required values and integrity constraints.
Well, we normally have our own history tables which (mostly) look like the original table. Since most of our tables already have the creation date, modification date and the respective users, all we need to do is copy the dataset from the live table to the history table with a creation date of now().
We're using Hibernate so this could be done in an interceptor, but there may be other options as well, e.g. some database trigger executing a script, etc.
How is this a Java question?
This should be moved in Database section.
You need to create a history table. Then create database triggers on the original table for "create or replace trigger before insert or update or delete on table for each row ...."
I think this can be achieved by creating a trigger in the sql-server.
you can create the TRIGGER as follows:
Syntax:
CREATE TRIGGER trigger_name
{BEFORE | AFTER } {INSERT | UPDATE |
DELETE } ON table_name FOR EACH ROW
triggered_statement
you'll have to create 2 triggers one for before the operation is performed and another after the operation is performed.
otherwise it can be achieved through code also but it would be a bit tedious for the code to handle in case of batch processes.
You should try using triggers. You can have a separate table (exact replica of your table of which you need to maintain history) .
This table will then be updated by trigger after every insert/update/delete on your main table.
Then you can write your java code to get these changes from the second history table.
I think you can use the redo log of your underlying database to keep track of the operation performed. Is there any particular reason to go for the program?
You could try creating say a List of the objects from the table (Assuming you have objects for the data). Which will allow you to loop through the list and compare to the current data in the table? You will then be able to see if any changes occurred.
You can even create another list with a object that contains an enumerator that gives you the action (DELETE, UPDATE, CREATE) along with the new data.
Haven't done this before, just a idea.
Like #Ashish mentioned, triggers can be used to insert into a seperate table - this is commonly referred as Audit-Trail table or audit log table.
Below are columns generally defined in such audit trail table : 'Action' (insert,update,delete) , tablename (table into which it was inserted/deleted/updated), key (primary key of that table on need basis) , timestamp (the time at which this action was done)
It is better to audit-log after the entire transaction is through. If not, in case of exception being passed back to code-side, seperate call to update audit tables will be needed. Hope this helps.
If you are talking about db tables you may use either triggers in db or add some extra code within your application - probably using aspects. If you are using JPA you may use entity listeners or perform some extra logic adding some aspect to your DAO object and apply specific aspect to all DAOs which perform CRUD on entities that needs to sustain historical data. If your DAO object is stateless bean you may use Interceptor to achive that in other case use java proxy functionality, cglib or other lib that may provide aspect functionality for you. If you are using Spring instead of EJB you may advise your DAOs within application context config file.
Triggers are not suggestable, when I stored my audit data in file else I didn't use the database...my suggestion is create table "AUDIT" and write java code with help of servlets and store the data in file or DB or another DB also ...