Implementing multi-tenancy for a mature enterprise application

I've been tasked with making an enterprise application multi-tenant. It has a Java/Glassfish BLL using SOAP web services and a PostgreSQL backend. Each tenant has its own database, so (in my case at least) "multi-tenant" means supporting multiple databases per application server.
The current single-tenant appserver initializes a C3P0 connection pool with a connection string that it gets from a config file. My thinking is that now there will need to be one connection pool per client/database serviced by the appserver.
Once a user is logged in, I can map it to the right connection pool by looking up its tenant. My main issue is how to get this far - when a user is first logged in, the backend's User table is queried and the corresponding User object is served up. It seems I will need to know which database to use with only a username to work with.
My only decent idea is that there will need to be a "config" database - a centralized database for managing tenant information such as connection strings. The BLL can query this database for enough information to initialize the necessary connection pools. But since I only have a username to work with, it seems I would need a centralized username lookup as well, in other words a UserName table with a foreign key to the Tenant table.
This is where my design plan starts to smell, giving me doubts. Now I would have user information in two separate databases, which would need to be maintained synchronously (user additions, updates, and deletions). Additionally, usernames would now have to be globally unique, whereas before they only needed to be unique per tenant.
I strongly suspect I'm reinventing the wheel, or that there is at least a better architecture possible. I have never done this kind of thing before, nor has anyone on my team, hence our ignorance. Unfortunately the application makes little use of existing technologies (the ORM was home-rolled for example), so our path may be a hard one.
I'm asking for the following:
Criticism of my existing design plan, and suggestions for improving or reworking the architecture.
Recommendations of existing technologies that provide a solution to this issue. I'm hoping for something that can be easily plugged in late in the game, though this may be unrealistic. I've read about jspirit, but have found little information on it - any feedback on it or other frameworks will be helpful.
UPDATE: The solution has been successfully implemented and deployed, and has passed initial testing. Thanks to @mikera for his helpful and reassuring answer!

Some quick thoughts:
You will definitely need some form of shared user management index (otherwise you can't associate a client login with the right target database instance). However I would suggest making this very lightweight, and only using it for initial login. Your User object can still be pulled from the client-specific database once you have determined which database this is.
You can make the primary key [clientID, username] so that usernames don't need to be unique across clients.
Apart from this thin user index layer, I would keep the majority of the user information where it is in the client-specific databases. Refactoring this right now will probably be too disruptive; you should get the basic multi-tenant capability working first.
You will need to keep the shared index in sync with the individual client databases, but I don't think that should be too difficult. You can also "test" the synchronisation and correct any errors with a batch job, which can be run overnight or by your DBA on demand if anything ever gets out of sync. I'd treat the client databases as the master, and use them to rebuild the shared user index on demand.
Over time you can refactor towards a fully shared user management layer (and even, in the end, fully shared client databases if you like). But save this for a future iteration.
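The thin shared index described above can be sketched roughly as follows. This is only an illustration, not the implementation the asker deployed: `TenantRouter`, `registerTenant`, `indexUser` and `poolFor` are hypothetical names, the index is held in memory rather than in a "config" database, and the pool type `P` is left generic so that a real pool (for example a C3P0 data source) could be plugged in through the factory.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical router: a thin (clientId, username) index plus one lazily created pool per tenant. */
final class TenantRouter<P> {
    // Shared index: usernames only need to be unique within a client.
    private final Map<String, Set<String>> usernamesByClient = new ConcurrentHashMap<>();
    // Tenant configuration: clientId -> JDBC URL of that tenant's database.
    private final Map<String, String> jdbcUrlByClient = new ConcurrentHashMap<>();
    // One pool per tenant, created on first use.
    private final Map<String, P> poolByClient = new ConcurrentHashMap<>();
    // Assumption: the caller supplies how a pool is built from a JDBC URL.
    private final Function<String, P> poolFactory;

    TenantRouter(Function<String, P> poolFactory) {
        this.poolFactory = poolFactory;
    }

    void registerTenant(String clientId, String jdbcUrl) {
        jdbcUrlByClient.put(clientId, jdbcUrl);
    }

    void indexUser(String clientId, String username) {
        usernamesByClient.computeIfAbsent(clientId, id -> ConcurrentHashMap.newKeySet()).add(username);
    }

    /** Returns the tenant's pool if the (clientId, username) pair is in the shared index. */
    Optional<P> poolFor(String clientId, String username) {
        Set<String> usernames = usernamesByClient.get(clientId);
        String jdbcUrl = jdbcUrlByClient.get(clientId);
        if (usernames == null || jdbcUrl == null || !usernames.contains(username)) {
            return Optional.empty();
        }
        return Optional.of(poolByClient.computeIfAbsent(clientId, id -> poolFactory.apply(jdbcUrl)));
    }
}
```

The full User object would still be loaded from the tenant's own database through the returned pool; the index only answers "which database does this login belong to".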

Related

Java web - sessions design for horizontal scalability

I'm definitely not an expert Java coder. I need to implement sessions in my Java Servlet-based web application, and as far as I know this is normally done through HttpSession. However, this method stores the session data in the local filesystem, and I don't want that kind of constraint on horizontal scalability. Therefore I thought of saving sessions in an external database with which the application communicates through a REST interface.
Basically, in my application there are users performing actions such as searches. What I'm going to persist in sessions is essentially the login data and the metadata associated with searches.
As the main data storage I'm planning to use a NoSQL graph database. The question is: given that I could also use a different kind of database for sessions, which architecture fits this situation better?
I have thought of two possible ways. The first uses another database (such as an SQL database) to store session data. This way the workload is more distributed, since I'm not also using the main storage for sessions. I'd also have a more organized environment, since session state variables and persistent ones are not mixed up.
The second way instead consists of storing all information related to a session in the "user node" of the main database. The session ID is then just a "shortcut" for authentication. This way I don't have to rely on a second database, but I move all the workload to the main database and mix the session data with the persistent data.
Is there a standard general architecture I can refer to? Am I missing some important point that should constrain my architecture?

Your idea to store sessions in a different location is good. How about using an in-memory cache like memcached or Redis? Session data is generally not long-lived, so you have options other than a full-blown database. Memcached and Redis can both be clustered and can scale horizontally.
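One way to keep that choice open is to hide the session storage behind a small interface with expiry. The sketch below is only illustrative: `SessionStore` and `InMemorySessionStore` are hypothetical names, and the in-memory map merely stands in for an external cache such as memcached or Redis (no client library for either is used here). The clock is injected so that expiry can be exercised without waiting.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical abstraction: the servlet code depends on this, not on a concrete cache. */
interface SessionStore {
    void put(String sessionId, Map<String, String> attributes);
    Optional<Map<String, String>> get(String sessionId);
    void remove(String sessionId);
}

/** Stand-in implementation; a production version would delegate to an external cache. */
final class InMemorySessionStore implements SessionStore {
    private static final class Entry {
        final Map<String, String> attributes;
        final long expiresAtMillis;

        Entry(Map<String, String> attributes, long expiresAtMillis) {
            this.attributes = attributes;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    InMemorySessionStore(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    @Override
    public void put(String sessionId, Map<String, String> attributes) {
        // Store an immutable copy together with a fixed expiry time.
        entries.put(sessionId, new Entry(Map.copyOf(attributes), clock.getAsLong() + ttlMillis));
    }

    @Override
    public Optional<Map<String, String>> get(String sessionId) {
        Entry entry = entries.get(sessionId);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(sessionId); // expired: drop it lazily on read
            return Optional.empty();
        }
        return Optional.of(entry.attributes);
    }

    @Override
    public void remove(String sessionId) {
        entries.remove(sessionId);
    }
}
```

Because every application server talks to the same external store, any server can handle any request, which is the property horizontal scaling needs.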

Multiple independent H2 databases within one JVM

Is it possible to start up and shut down multiple H2 databases within a JVM?
My goal is to support multi-tenancy by giving each user/account their own database. Each account has very little data. Data between the accounts is never accessed together, compared, or grouped; each account is entirely separate from the others. Each account is only accessed briefly once a day or a few times a month. So there are few upsides to housing the data together in a single database, and some serious downsides.
So my idea is that when a user logs in for a particular account, that account’s database is loaded. When that user logs out, or their web app session (Vaadin app) times out, that account’s database is closed, its data flushed to storage, and possibly a backup performed. This opening and closing would be happening for any number of databases in parallel.
Benefits include minimizing the amount of memory in use at any one time for caching data and indexes, minimizing locking and other contention, and allowing for smooth scaling.
I'm new to H2, so I'm not sure if its architecture can support this. I'm asking for a denial or confirmation of this capability, along with any tips or caveats.

Yes it is possible to do so. Each database will contain its own mini environment, no possible pollution between databases.
You could for example use a jdbc url based on the user id or login from the user:
jdbc:h2:user1 in H2 1.3.x embedded mode
jdbc:h2:./user1 in H2 1.4.x embedded mode
jdbc:h2:tcp://localhost/user1 in tcp mode
You can use any naming convention for the database name, provided your OS allows it: user1, user2, etc... or truly the name of the login.
Tips:
use the server mode rather than the embedded mode, allowing for same user multiple connections from multiple sessions/hosts
have a schema migrator (like flyway) to initialize each newly created db
ensure you manage name collisions at the top level of your app, and possibly store these databases and corresponding logins in a dedicated database as well
Caveats:
do not use a connection pool as connections will be difficult to reuse
You must make sure IFEXISTS=TRUE is not used on the server
avoid using tweaks on the jdbc url, like turning LOG=0, UNDO_LOG=0, etc...
I do not know if you'll have a limitation from your OS or the JVM on how many db files could be opened like this.
I do not know if such setting can be tweaked from the manual pages. I could not find one.
Please refer to the H2 manual if in doubt about URL parameters.
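As a rough illustration of the naming convention above, the per-account URL can be built by a small helper that also rejects names that would be unsafe as file names. `H2Urls` and its methods are hypothetical, and the allowed character set is an assumption made for this sketch; only the URL shapes (`jdbc:h2:./name` for 1.4.x embedded mode, `jdbc:h2:tcp://host/name` for TCP mode) come from the answer.

```java
/** Hypothetical helper for building one H2 JDBC URL per account. */
final class H2Urls {
    private H2Urls() {
    }

    /** Assumption: account names are restricted to characters that are safe in file names. */
    static String requireSafeName(String accountName) {
        if (accountName == null || !accountName.matches("[A-Za-z0-9_-]{1,64}")) {
            throw new IllegalArgumentException("Unsafe database name: " + accountName);
        }
        return accountName;
    }

    /** Embedded mode, H2 1.4.x style (relative path with explicit "./"). */
    static String embedded(String accountName) {
        return "jdbc:h2:./" + requireSafeName(accountName);
    }

    /** Server (TCP) mode, as recommended in the tips above. */
    static String tcp(String host, String accountName) {
        return "jdbc:h2:tcp://" + host + "/" + requireSafeName(accountName);
    }
}
```

The resulting string would then be passed to `DriverManager.getConnection` when the account's session starts, and the connection closed when the session ends.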

How can I securely access a web-based MySQL database from an Android app

I have a SQL database on the internet which has information.
I need my Android app to be able to access that information.
The app needs to know the username and password of the database.
How can it know?
If I code it in, anyone can get it.

In general, databases should not be publicly accessible, nor should they be directly accessed by a user application, for several very good reasons:
There is generally no easy way to implement row-level access control. Views and triggers can only get you so far - in general application-level users do not map well to database users, since the latter usually have access to far more data than the former should have.
The DB clients are tied to the actual database schema. Having clients not under your control like, say, an Android application is a very good way to tie yourself up in ways that would disallow any and all future development.
Having a DB port open to the world is not considered by any means secure. Any potential security hole would give straight access to all of your data. The MySQL security guidelines explicitly warn against opening the DB port to the internet.
There is no way to protect the DB credentials or the data from a sufficiently determined and knowledgeable user. If your application can access something, so can they.
Database access protocols are mostly designed with local-area networks in mind, rather than the inherently unreliable nature of the Internet. Even encryption and security are often more of an afterthought...
The standard way to approach this issue is to create an intermediate web service with separate user accounts and a restricted set of operations on the data. The web service would let each user access only the data that relate to them, and even that indirectly. This approach separates the data from the user application layer, allows you the flexibility of storing and accessing your data however you wish and provides an additional layer of security for your DB.
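The core of such an intermediate service can be sketched as below. All names (`RecordService`, `issueToken`, `listMyRecords`) are hypothetical, the data lives in memory instead of MySQL, and the HTTP layer is omitted; the point is only that the app presents a per-user token, never the database credentials, and that the service exposes a restricted operation scoped to that user.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical service layer sitting between the mobile app and the database. */
final class RecordService {
    // token -> username; a real service would issue tokens after checking a password.
    private final Map<String, String> userByToken = new ConcurrentHashMap<>();
    // Stand-in for the database; only the service can reach it.
    private final Map<String, List<String>> recordsByUser = new ConcurrentHashMap<>();

    void issueToken(String token, String username) {
        userByToken.put(token, username);
    }

    void addRecord(String username, String record) {
        recordsByUser.computeIfAbsent(username, u -> new CopyOnWriteArrayList<>()).add(record);
    }

    /** The only read operation offered to clients: the caller's own records. */
    List<String> listMyRecords(String token) {
        String username = userByToken.get(token);
        if (username == null) {
            throw new SecurityException("Unknown or expired token");
        }
        return List.copyOf(recordsByUser.getOrDefault(username, List.of()));
    }
}
```

Since clients never see table names or SQL, the schema behind the service can change without breaking installed copies of the app.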

Replicate modified data on different database

I would like to ask for a starting point on what technology or framework to research.
What I need to accomplish is the following:
We have a Java EE 6 application using JPA for persistence; we would like to use a primary database as some sort of scratchpad, where users can insert/delete records according to the tasks they are given. Then, at the end of the day, an administrator will check their work and approve or disapprove it. If he approves the work, all changes will be made permanent and the primary database will be synced (replicated) to another one, for security reasons. Otherwise, if the administrator does not approve the changes, they will be rolled back.
Now here I got two problems to figure out:
First.- Is it possible to rollback a bunch of JPA operations done through a certain amount of time?
Second.- Trigger the replication (This can be done by RDBMS engines) process by code.
Now, if RDBMS replication is not possible (maybe because of a client requirement) we would need a sync framework for JPA as a backup. I was looking at some JMS solutions, but I'm not clear about the exact process or how to make them work with JPA.
Any help would be greatly appreciated,
Thanks.

I think your design carries too much risk of losing data. As I understand it, you are talking about holding data in memory until the admin approves or rejects it. You must think about a disaster scenario and how your data would be saved in that case.
Rather, this problem statement is more inclined towards a workflow design, where:
Data is entered by one entity and persisted.
Another entity approves or rejects the data.
All approved data is then replicated to the next database.
These three steps could be implemented as three modules, backed by persistent storage or a JMS technology. Depending on how close to real time each of these steps needs to be, you could come up with an elegant design that accomplishes this in a cost-effective manner.

Add a "workflow state" column to your table. States: Wait for approval, approved, replicated
Persist your data normally using JPA (state: wait for approval)
Approver approves: Update using JPA, change to approved state
As for the replication
In the approve method you could replicate the data synchronously to the other database (using JPA)
You could copy as well the approved data to another table, and use some RDBMS functionality to have the RDBMS replicate the data of that table
You could as well send a JMS message. At the end of the day a job reads the queue and persists the data into the other database
Anyway I suggest using a normal RDBMS cluster with synchronous replication. In that scenario you don't have to develop a self-made replication scheme, and you always have a copy of your data. You always have the workflow state.
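The "workflow state" column suggested above could be modelled along these lines. This is a minimal sketch with hypothetical names (`WorkflowState`, `WorkItem`), and JPA mapping annotations are left out so that it compiles on its own; in the real entity the state field would typically be persisted as an enum column.

```java
/** The three states named above, in the order a record passes through them. */
enum WorkflowState {
    WAITING_FOR_APPROVAL, APPROVED, REPLICATED;

    /** A record may only advance to the immediately following state. */
    boolean canMoveTo(WorkflowState next) {
        return next.ordinal() == ordinal() + 1;
    }
}

/** Hypothetical stand-in for the JPA entity that carries the state column. */
final class WorkItem {
    private WorkflowState state = WorkflowState.WAITING_FOR_APPROVAL;

    WorkflowState state() {
        return state;
    }

    void moveTo(WorkflowState next) {
        if (!state.canMoveTo(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

A nightly job would then select rows in the APPROVED state, copy them to the second database, and mark them REPLICATED. How rejected work is handled (deleting the row, or adding a further state) is not covered by this sketch.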

Convert single-user Apache Derby database to one with users

I have an Apache Derby database that, until now, has always been locally accessed. It needs to be accessed by multiple computers now, so I feel it ought to have a username/password.
How do I take the existing database and retroactively add a user?
How do I provide local/network authentication for that user?
I recall looking through their docs a few years ago, and it seemed like a lot was left to the developer to implement in these cases.
To clarify more, regarding point #1, this page says:
Attention: There is currently no way of changing the database owner once the database is created. This means that if you plan to run with SQL authorization enabled, you should make sure to create the database as the user you want to be the owner.
I think this means that I will probably have to create a new database with a named user, and migrate all data from the original single-user database to the new one. Is this correct? Is there an easier way?
Also regarding question number two, the manual says
Important: Derby's built-in authentication mechanism is suitable only for development and testing purposes. It is strongly recommended that production systems rely on an external directory service such as LDAP or a user-defined class for authentication.
Which, to me, says that the builtin authentication isn't worth using. There's no way we're going to go to an LDAP integration either, so is there something in-between these two that is worth using?

Since you mention you're going from a single-user environment to a multi-user environment, you're probably going to be setting up the Network Server, so you will have two levels of security to consider: database authentication, and network server authentication.
You probably want to start here: http://db.apache.org/derby/docs/10.8/adminguide/cadminapps49914.html
and here:
http://db.apache.org/derby/docs/10.8/devguide/cdevcsecure42374.html
