I have a multi-node production server (Tomcat 8.x on CentOS 7.x; each node is a separate CentOS instance) that uses a single MySQL database server (MySQL 5.7.x). Each node is updated manually: the system administrator stops a node and deploys the new version of the application (.war file) to it. This means the service has no downtime, because at every moment at least one node is working.
Database migrations are implemented as Liquibase changesets, which are packaged in the .war file, so each node validates and (if required) updates the database schema. In practice, only the first node executes the changesets; the other nodes just validate them.
The problem is that there is a time gap between the updates of the individual nodes: when the first node has already been updated to the new application version, the last node is still running the previous version (which might, for example, still use old database columns). This can lead to inconsistency in the database.
Example
Let's say the server has 3 nodes, all currently running version N of the application.
The next release needs to change the database schema: rename the column title to title_new.
To make it possible to update the database schema without downtime, we need to use a "two-step change" (sketched below):
version N+1:
adds a new column title_new;
no longer uses the column title (it is marked as deprecated);
copies all data from the column title to title_new;
uses the column title_new;
version N+2 drops the column title.
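A minimal sketch of those changesets as a Liquibase SQL-formatted changelog (MySQL syntax; the table name article and the changeset author are made up for illustration):

    --liquibase formatted sql

    --changeset admin:n1-add-title-new
    ALTER TABLE article ADD COLUMN title_new VARCHAR(255);

    --changeset admin:n1-copy-title
    UPDATE article SET title_new = title WHERE title_new IS NULL;

    --changeset admin:n2-drop-title
    -- ships only with version N+2, once no node reads or writes title anymore
    ALTER TABLE article DROP COLUMN title;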
Now the administrator is going to deploy version N+1. He stops the first node for the update, but the other two nodes are still running version N. While the first node is updating, users might change their data via node 2 or 3 (a load balancer routes requests to the different nodes). So we need a way to prevent users from making any changes via nodes 2 and 3 until they have been updated to version N+1.
I see two different ways to solve this problem:
Use a read-only mode on the application level: the application logic then forbids users from making any changes. But we would need a way to enable this mode at any time from a console or an admin panel (the administrator must be able to turn it on).
Use a read-only mode on the database level. But I couldn't find any ready-for-use method for MySQL to do this (though see the sketch below).
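For the record on the second option: MySQL 5.7 does have server-level read-only switches, although plain read_only does not restrict accounts with the SUPER privilege (that is what super_read_only, added in 5.7.8, is for). A minimal sketch:

    -- run as an administrative account before the rollout
    SET GLOBAL read_only = ON;        -- rejects writes from accounts without SUPER
    SET GLOBAL super_read_only = ON;  -- 5.7.8+: rejects writes from SUPER accounts too

    -- ... update all nodes ...

    SET GLOBAL read_only = OFF;       -- also clears super_read_only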
The question: what's the best way to solve the described issue?
P.S. The application is based on the Spring 4.x framework + Hibernate 4.x.
An alternative way of solving this is to use database triggers:
version N+1:
for every renamed column, create a trigger that copies data inserted or updated in title into title_new (a MySQL sketch follows below);
version N+2:
drop the trigger and drop the old column.
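A minimal MySQL 5.7 sketch of such a trigger pair (the table name article is hypothetical; the IF guards are there so the triggers don't clobber writes coming from already-updated nodes):

    DELIMITER //

    CREATE TRIGGER article_title_copy_ins
    BEFORE INSERT ON article
    FOR EACH ROW
    BEGIN
        -- old nodes insert only title; fill title_new from it
        IF NEW.title_new IS NULL THEN
            SET NEW.title_new = NEW.title;
        END IF;
    END//

    CREATE TRIGGER article_title_copy_upd
    BEFORE UPDATE ON article
    FOR EACH ROW
    BEGIN
        -- copy only when an old node actually changed title
        IF NOT (NEW.title <=> OLD.title) THEN
            SET NEW.title_new = NEW.title;
        END IF;
    END//

    DELIMITER ;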
The advantages of this approach are:
it can be done entirely with Liquibase (no additional steps for the administrator are needed);
all your nodes remain fully functional (no read-only mode).
The drawbacks:
you must write and maintain triggers;
it can get tricky if your DB updates are more complex (e.g., a column rename plus new DB constraints).
Zero Downtime Deployment with a Database
I found the above article to be very insightful on the various options for doing database migrations without downtime.
Related
I have a Java application that loads data from a very large text file into MySQL via Hibernate. So far, the application seems to work OK - the data gets loaded.
Then I wanted to monitor the progress of the application while it was inserting data, so I wrote a small command-line utility that basically queries select count(*) from my_table;.
The first time this query is run (from either the CLI or MySQL Workbench), I get the correct number of records, as expected. But all subsequent executions of this query return the exact same number, even though the data-loading application is still running!
If I stop and start the MySQL process, querying for the number of records shows the correct number, as the data-loading application would report it.
I've never seen anything like this before. It looks like there is some strange MySQL caching issue going on here, and I'm concerned it may cause problems for other non-Hibernate applications that may want to access this database.
Does anyone know how to tweak MySQL (or possibly Hibernate) so that MySQL always shows what's being added to the database?
Technical details:
MySQL: 5.7.26, running in a Docker container, using the InnoDB storage engine
Hibernate version: 5.4.2.Final
Spring version: 5.1.7.RELEASE
Calling FLUSH TABLES seems to resolve this, in the sense that after I flush the tables, I can see how many records have been added by the application.
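For reference, the monitoring sequence that works, plus a guess at the underlying cause: if the monitoring session keeps a transaction open, InnoDB's REPEATABLE READ snapshot pins the visible row count, so ending that transaction is also worth trying (my_table as in the question):

    -- the workaround described above
    FLUSH TABLES;
    SELECT COUNT(*) FROM my_table;

    -- possible alternative (an assumption, not verified here): close the
    -- monitoring session's snapshot so the next read sees newer commits
    COMMIT;
    SELECT COUNT(*) FROM my_table;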
I have been given a Java web application for which I have the source code. The application queries an Oracle database and returns data to the user in a web page. I need to update a value in the database for the returned data, without knowing the table or column names. Is there a general way to determine what query the application submits to return the data for a particular page, so that I can find the right tables and columns?
I am new to Java and web development, so I'm not sure where to start looking.
Thanks!
Well, there's always the old-fashioned way of finding out: you can read the source code for the specific page you're looking at and identify the query that's being executed to retrieve the data. I'm assuming that's not what you're looking for, though.
Some other options include the JDBC driver's logging feature (see "Enabling and Using JDBC Logging") or JProfiler (its JDBC probe shows you all SQL statements in the events view). Once you find the SQL statement, you can use the standard text search in your IDE to locate the specific code and make alterations.
Hope that helps!
If you can run a controlled test (e.g., you are the only person using that web application), you could turn on SQL tracing on the DB connection and then run your transaction several times. To do this:
look at all the connections from that application using v$session; you can control their number by tweaking your connection pool settings (e.g., set the min and max connections to 1), assuming this is your test environment;
turn on a 10046 trace (see https://oracle-base.com/articles/misc/sql-trace-10046-trcsess-and-tkprof; there are many other examples).
The 10046 trace will show you what the application is doing, SQL by SQL. You can even set the level to 12 to capture the bind variable values (assuming the application uses prepared statements).
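A minimal sketch of those two steps (the username filter and the sid/serial# values are hypothetical; DBMS_MONITOR requires suitable privileges):

    -- step 1: find the application's session(s)
    SELECT sid, serial#, username, program
    FROM   v$session
    WHERE  username = 'APP_USER';  -- hypothetical application account

    -- step 2 (SQL*Plus syntax): enable extended tracing for that session;
    -- waits => TRUE, binds => TRUE corresponds to 10046 level 12
    EXEC DBMS_MONITOR.session_trace_enable(session_id => 123, serial_num => 456, waits => TRUE, binds => TRUE);

    -- run the transaction from the web page, then stop tracing:
    EXEC DBMS_MONITOR.session_trace_disable(session_id => 123, serial_num => 456);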
I am working on an application where we have decided to go for a multi-tenant architecture using the solution provided by Spring, so we route the data to each datasource depending on the value of a parameter. Let's say this parameter is a number from 1 to 10, derived from our client IDs.
However, this requires altering the application context each time we add a new datasource, so as a starting point we have thought of the following solution:
Start with 10 datasources (or more) pointing to different IPs and the same schema, but in the end all routed to the same physical database. No matter which datasource we use, the data is sent to the same schema in this first scenario.
The data would live in the same schema, so the same tables would be shared among datasources, but each row would be visible only to its own datasource (using a fixed WHERE clause in every CRUD operation; see the sketch after this list).
When we run into performance problems, we would create another database, migrate some clients to the new schema, and repoint the IP of one of the datasources to the new database, so that the new database takes over part of the load of the old one.
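A sketch of that fixed filter (the table and the tenant_id column are hypothetical; each datasource would have its tenant value fixed to a different number):

    -- every read is scoped to the datasource's tenant
    SELECT id, name
    FROM   customer
    WHERE  tenant_id = 3;  -- fixed per datasource

    -- every write stamps the same discriminator
    INSERT INTO customer (tenant_id, name) VALUES (3, 'Acme');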
Are there any drawbacks to this approach? I am concerned about:
loss of ACID properties;
problems with the Hibernate SessionFactory and second-level cache;
table locking issues.
We are using Spring 3.1, Hibernate 4.1 and MySQL 5.5.
I think your Spring link is a little outdated; Hibernate 4 can handle multi-tenancy pretty well on its own. I would suggest the multiple-schemas approach, because setting up and initializing a new schema is programmatically relatively easy to do (for example at registration time). If you have that much load, though (and your database vendor does not provide a solution to make this transparent to your application), you need the multiple-databases approach; in that case you should try to incorporate the tenant id into the database URL or something similar: http://docs.jboss.org/hibernate/orm/4.1/devguide/en-US/html/ch16.html
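A sketch of the "initialize a new schema at registration time" idea on MySQL (schema and table names are made up):

    -- run once when tenant 42 registers
    CREATE SCHEMA tenant_42;

    CREATE TABLE tenant_42.customer (
        id   BIGINT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(255) NOT NULL
    );
    -- ...create the remaining tables, then route this tenant's
    -- connections to the tenant_42 schema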
I want to know if it is possible to use the same index file for an entity in two applications. Let me be more specific:
We have an online application with a frontend for the users and an application for the backend tasks (an administrator interface). Both are running on the same JBoss AS. Both applications use the same database, so they use the same entities. Of course, the package names of the entities are not the same in the two applications.
So this is our use case: a user should be able to search via the frontend, but is only allowed to see results that are tagged as "visible". This tagging happens in our admin interface, so the frontend's index should be updated every time an entity is tagged as "visible" in the backend.
Of course, both applications use the same index root folder. In my index folder there are two indexes:
de.x.x.admin.model.Product
de.x.x.frondend.model.Product
How to "merge" this via Hibernate Search Configuration? I just did not get it via the documentation...
Thanks for any help!
OK, it seems that this is not possible...
I have a scenario where the unit of work is defined as:
Update table T1 in database server S1
Update table T2 in database server S2
And I want the above unit of work to happen either completely or not at all (as is the case with any database transaction). How can I do this? I searched extensively and found this post, which comes close to what I am looking for, but it seems to be very specific to Hibernate.
I am using Spring, iBatis and Tomcat (6.x) as the container.
It really depends on how robust a solution you need. The minimal level of reliability for such a thing is XA transactions. To use them, you need a database and a JDBC driver that support XA for starters; then you could configure Spring to use them (here is an outline).
If XA isn't robust enough for you (XA has failure scenarios, e.g., something going wrong in the second phase of the commit, such as a hardware failure), then what you really need to do is put all the data in one database and have a separate process propagate it. The data may be temporarily inconsistent, but it is recoverable.
Edit: What I mean is: put the whole of the data into one database, either the first database or a separate database dedicated to this purpose. This database essentially becomes a queue from which the final data view is fed. The write to that database (assuming a decent database product) will either complete or fail completely. Then a separate thread polls that database and distributes any missing data to the other databases. If the process fails, the thread simply continues the distribution when it starts up again. The data may not exist in every place you want it right away, but nothing gets lost.
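A sketch of that queue as a single table, in the style now commonly called an "outbox" (all names are hypothetical, and the payload format is left open):

    -- queue/outbox table in the primary database
    CREATE TABLE outbox (
        id           BIGINT AUTO_INCREMENT PRIMARY KEY,
        target_db    VARCHAR(32) NOT NULL,   -- e.g. 'S1' or 'S2'
        payload      TEXT        NOT NULL,   -- the change to apply there
        processed_at TIMESTAMP   NULL
    );

    -- the unit of work becomes one local transaction:
    INSERT INTO outbox (target_db, payload) VALUES ('S1', '...update for T1...');
    INSERT INTO outbox (target_db, payload) VALUES ('S2', '...update for T2...');
    COMMIT;

    -- the propagation thread polls for unprocessed entries,
    -- applies each one to its target server, then marks it done:
    SELECT id, target_db, payload FROM outbox WHERE processed_at IS NULL ORDER BY id;
    UPDATE outbox SET processed_at = NOW() WHERE id = 42;  -- after applying entry 42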
You want a distributed transaction manager. I like using Atomikos, which can run inside the JVM.