I would like to create a simple project using Spring to track the status of some customers across different environments. One customer can have two environments (dev and prod), while others may have one, two or three.
The basic idea is I would like to create a Web Service using spring with the following interface:
localhost:8080/customer1/environment1/status to extract status data from customer1 and environment1.
I have two options:
Using MongoDB, with a database per customer, a collection per environment, and the status documents inside each collection. I found the following problems:
Many of the solutions I found on the web were for previous versions of Spring (I am using Spring 5).
Also, I am not sure how I can implement dynamic collections (I mean, if I make a request to localhost:8080/customer2/environment2/status, I would like to change not only the database but also the collection dynamically).
Using Postgres, with a schema per customer and a table per environment (all the tables will have the same structure).
The problem is that the table names can differ (production, development, test and so on), so I would have to implement dynamic table names in Spring (which I am not sure is possible).
I have been searching for a couple of days for a simple solution to this (initially I thought it would be easy, but it looks like it is not).
What do you think would be the best and simplest solution: MongoDB or Postgres?
Can you provide the basic steps to reproduce it, or a GitHub repository with code I could use as a reference?
PS: There is no need to be extra safe because it will be an internal service, so the location of the customers' data doesn't matter: it can be in the same database or in different databases.
First of all, I think your database choice should depend more on the advantages and disadvantages one database gives you over the other. Second, I don't believe a database per user is a good idea: imagine what will happen when you get 5,000 users; it will be a pain to administer that many databases or to keep switching databases in your code. I suggest you first try to fit a compact model of your requirements into a single database, and then, on top of that, decide which database suits you better.
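For example (just a rough sketch with made-up class names, assuming Spring 5 with Spring Data JPA), a single-database model keyed by customer and environment could look like this, and localhost:8080/customer1/environment1/status then maps onto a query instead of onto a database or collection switch:

    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.Id;
    import org.springframework.data.jpa.repository.JpaRepository;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    // One row per status entry; customer and environment are plain columns.
    @Entity
    public class CustomerStatus {
        @Id @GeneratedValue
        private Long id;
        private String customer;     // e.g. "customer1"
        private String environment;  // e.g. "dev", "prod"
        private String status;
        // getters and setters omitted
    }

    // Spring Data derives the query from the method name.
    public interface CustomerStatusRepository extends JpaRepository<CustomerStatus, Long> {
        List<CustomerStatus> findByCustomerAndEnvironment(String customer, String environment);
    }

    @RestController
    public class StatusController {
        private final CustomerStatusRepository repository;

        public StatusController(CustomerStatusRepository repository) {
            this.repository = repository;
        }

        // GET /customer1/environment1/status
        @GetMapping("/{customer}/{environment}/status")
        public List<CustomerStatus> status(@PathVariable String customer,
                                           @PathVariable String environment) {
            return repository.findByCustomerAndEnvironment(customer, environment);
        }
    }

(In a real project each type would live in its own file.)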
I hope it helps!
Related
I am creating a webapp in Spring Boot (Spring + Hibernate + MySQL).
I have already created all the CRUD operations for the data of my app, and now I need to process the data and create reports.
Given the complexity of these reports, I will create some summary or pre-processed tables. This way, I can trigger the report creation once and then fetch the results efficiently.
My doubt is whether I should build all the reports in Java or in stored procedures in MySQL.
Pros of doing it in Java:
More logging
More control of the structures (entities, maps, list, etc)
Catching exceptions
If I change my DB engine (it probably will not happen, but you never know)
Cons of doing it in Java:
Maybe memory?
Any thoughts on this?
Thanks!
Java, though both are possible. It depends on what is most important, what skills are available for maintenance, and the cost of maintaining it. Stored procedures are usually very fast, but availability and performance also depend on which exact database you use. You will need specialised skills, and then everything is tied to that specific database.
Hibernate comes with a dialect written for every database to get the best performance out of the persistence layer. It's not as fast as a stored procedure, but it comes pretty close. With Spring Data on top of that, most of the difficulty is gone. Maintenance will not cost that much, and people who know Spring Data are easier to find than specialists for any particular database vendor.
You can still create various “difficult” queries easily with HQL, so no blocker there. But Hibernate comes with more possibilities. You can have your caching done by Ehcache, and with Hibernate Envers you will have your auditing done in no time. That's the nice thing about this framework: it's widely used, and many free-to-use Maven dependencies are there for the taking. And if in the future you want to change your database, with Spring Data you can do it by changing about three parameters in your application.properties file.
You can play with some annotations and see what performs better. For example, you have the @Inheritance annotation, with which you can have several classes end up in the same table or split them across more tables. You also have @MappedSuperclass, with which you can have one JpaObject holding the id that all your entities extend. If you want some more JPA tricks, maybe check this post with my answer on how to use a superclass and a general repository.
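A tiny sketch of that @MappedSuperclass idea (the class name is only illustrative):

    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.MappedSuperclass;

    // Base class holding the id; it is not mapped to a table of its own.
    @MappedSuperclass
    public abstract class JpaObject {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Long id;

        public Long getId() {
            return id;
        }
    }

Every entity then simply extends JpaObject and only declares its own fields.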
"Given the complexity of these reports, I will create some summary or pre-processed tables. This way, I can trigger the report creation once and then fetch the results efficiently."
My first thought is: is this required? It seems like adding complexity to the application that perhaps isn't needed. Premature optimisation and all that. Try writing the reports in SQL and running an execution plan. If it's good enough, you have less code to maintain and no added batch jobs to administer. Consider load testing with e.g. JMeter or Gatling to see how it holds up under stress.
Consider using Querydsl or jOOQ for reporting. Both provide a database abstraction layer and a fluent API for querying databases, which deliver the benefits listed in the "Pros of doing it in Java" section of the question and may be better suited to the problem. The blog post jOOQ vs. Hibernate: When to Choose Which is well worth a read.
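To give an idea of what that looks like, here is a rough jOOQ sketch (the connection details and the orders table/columns are made up):

    import static org.jooq.impl.DSL.count;
    import static org.jooq.impl.DSL.field;
    import static org.jooq.impl.DSL.table;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import org.jooq.DSLContext;
    import org.jooq.Record2;
    import org.jooq.Result;
    import org.jooq.SQLDialect;
    import org.jooq.impl.DSL;

    public class ReportExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/app", "user", "pass")) {
                DSLContext create = DSL.using(conn, SQLDialect.MYSQL);

                // Aggregate orders per customer; jOOQ builds and runs the SQL and
                // hands back plain Java records you can log and post-process.
                Result<Record2<Object, Integer>> result = create
                        .select(field("customer_id"), count())
                        .from(table("orders"))
                        .groupBy(field("customer_id"))
                        .fetch();

                result.forEach(r -> System.out.println(r.value1() + " -> " + r.value2()));
            }
        }
    }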
I have a working code that basically copies records from one database to another one using JPA. It works fine but it takes a while, so I wonder if there's any faster way to do this.
I thought about threads, but I run into race conditions, and synchronizing those pieces of the code ends up taking as long as the one-by-one process.
Any ideas?
Update
Here's the scenario:
Application (Core) has a database.
Plugins have default data (same structure as Core, but with different data)
When a plugin is enabled, it checks the Core database and, if the data is not found, copies its default data into the Core database.
Most databases provide native tools to support this. Unless you need to write additional custom logic to transform the data in some way, I would recommend looking at the export/import tools provided by your database vendor.
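If the copy does have to stay in Java because of custom transformation logic, plain JDBC batching is usually much faster than persisting entities one at a time through JPA. A rough sketch (connection strings, table and column names are hypothetical):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CopyDefaults {
        public static void main(String[] args) throws Exception {
            try (Connection source = DriverManager.getConnection(
                         "jdbc:postgresql://localhost/plugin_defaults", "user", "pass");
                 Connection target = DriverManager.getConnection(
                         "jdbc:postgresql://localhost/core", "user", "pass")) {

                target.setAutoCommit(false); // one commit at the end instead of per row

                try (Statement read = source.createStatement();
                     ResultSet rs = read.executeQuery("SELECT id, name FROM default_data");
                     PreparedStatement write = target.prepareStatement(
                             "INSERT INTO core_data (id, name) VALUES (?, ?)")) {

                    int batched = 0;
                    while (rs.next()) {
                        write.setLong(1, rs.getLong("id"));
                        write.setString(2, rs.getString("name"));
                        write.addBatch();
                        if (++batched % 1000 == 0) {
                            write.executeBatch(); // flush every 1000 rows
                        }
                    }
                    write.executeBatch();         // flush the remainder
                    target.commit();
                }
            }
        }
    }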
Our company is currently implementing a couple of tools for employee use. As I'm the only programmer within the company, it's fallen to me to develop these tools.
However, I have little to no experience with web services or Java, so I'm a little stumped on some of the logic here and hoping someone can give me some guidance.
We have a MySQL database hosted in the UK; it will provide the data for the tools, which will be used both within the UK and by our other offices outside of it. I'm looking to provide access to the database via web services.
However, having looked into this, I get the feeling I have missed something key. Right now I'm looking to create methods for every database table, so each table will need a select, update and delete method. Since there are 20-odd tables, that means the web service would expose 60 methods! Is this normal?
It seems to me that there should be an easier way to do this, but having little experience with Java I'm at a loss, and my Google-fu has failed me thus far.
Could anyone give me some pointers on what the "usual" way of doing this is, and whether there is something I've simply overlooked?
Web services should be written for each entity and not for each table. An entity should be a logical unit, not simply something very abstract. There can be multiple tables in your database storing the data for one entity. For example: you have an entity called 'Person', but the details of the person are stored in multiple tables such as 'PersonDetail', 'PersonContactDetails', 'PersonDependentDetails', etc. You can manipulate the data in all these tables through the web services created for 'Person'.
Web service operations can be mapped to database CRUD (CREATE, READ, UPDATE, DELETE) operations. If you are writing RESTful web services, the CRUD operations map to the HTTP methods POST, GET, PUT and DELETE.
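As a concrete illustration (a sketch only, using Spring MVC annotations as one option; Person is a plain DTO/entity class omitted here, and PersonService is a made-up service that would join PersonDetail, PersonContactDetails and so on behind the scenes):

    import java.util.List;
    import org.springframework.web.bind.annotation.DeleteMapping;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.PostMapping;
    import org.springframework.web.bind.annotation.PutMapping;
    import org.springframework.web.bind.annotation.RequestBody;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RestController;

    // Hypothetical service that hides the underlying Person* tables.
    interface PersonService {
        List<Person> findAll();
        Person find(long id);
        Person save(Person person);
        Person update(long id, Person person);
        void delete(long id);
    }

    @RestController
    @RequestMapping("/persons")
    public class PersonController {

        private final PersonService service;

        public PersonController(PersonService service) {
            this.service = service;
        }

        @GetMapping                 // READ   -> GET /persons
        public List<Person> all() { return service.findAll(); }

        @GetMapping("/{id}")        // READ   -> GET /persons/{id}
        public Person one(@PathVariable long id) { return service.find(id); }

        @PostMapping                // CREATE -> POST /persons
        public Person create(@RequestBody Person person) { return service.save(person); }

        @PutMapping("/{id}")        // UPDATE -> PUT /persons/{id}
        public Person update(@PathVariable long id, @RequestBody Person person) {
            return service.update(id, person);
        }

        @DeleteMapping("/{id}")     // DELETE -> DELETE /persons/{id}
        public void delete(@PathVariable long id) { service.delete(id); }
    }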
Here's one typical approach, although it's a pretty big learning curve:
Create Data Access Objects (DAOs) to query the DB and convert from your relational data model to a java object model. If extreme performance isn't a consideration (it isn't a consideration for most applications), consider ORM mapping frameworks like Hibernate or JPA. You probably don't need one method per table. Many times multiple tables make up one domain object. For instance, in a banking app you might have a table called customer, and a related table called customer_balance. If you just want to present a balance to a customer, you could have one domain object called "Customer", with a field called "balance". Your Customer DAO would join customer and customer_balance to create a single Customer object.
Create services to wrap DAOs and apply your business rules to them. Keep business rules in the service as much as possible because it improves testability. An example of a simple banking service method would be "withdrawMoney(amount)". The service would pull the Customer from the DB via a DAO, first check that the customer has at least "amount" in their current balance, and then subtract "amount" from the current balance and save it in the database via the DAO.
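A rough sketch of that service method (Customer, CustomerDao and InsufficientFundsException are assumed to exist elsewhere; this only shows the layering, not a complete implementation):

    import java.math.BigDecimal;

    public class BankingService {

        private final CustomerDao customerDao;

        public BankingService(CustomerDao customerDao) {
            this.customerDao = customerDao;
        }

        // The business rule lives in the service: never let the balance go negative.
        public void withdrawMoney(long customerId, BigDecimal amount) {
            Customer customer = customerDao.findById(customerId); // DAO joins customer + customer_balance
            if (customer.getBalance().compareTo(amount) < 0) {
                throw new InsufficientFundsException("Balance too low for withdrawal");
            }
            customer.setBalance(customer.getBalance().subtract(amount));
            customerDao.save(customer);                           // DAO writes the change back
        }
    }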
Your web layer will call the services layer and present the data to the user and allow them to operate on it. At some point, you may want your web layer to communicate with the services layer via a web service API, although that is probably overkill for early implementations.
As others have cited, the Java Petstore application is a good example of this approach. Oracle doesn't maintain the Petstore app any longer, but volunteers have copied it to GitHub and are keeping it up to date with the latest Java EE versions. Here's a link to the GitHub site: https://github.com/agoncal/agoncal-application-petstore-ee6
Yes, if every one of your 20 tables will require selection (HTTP GET), update (HTTP PUT) and delete (HTTP DELETE), you will probably need 20*3=60 methods.
You'll probably want to start off by having a read of this part of the Java EE 7 tutorial, which will give you an overview of web service development. What you are suggesting, though, seems strange and perhaps not really what you want. If you want to expose every table to updates/deletes/etc., then you'd perhaps be better off just opening the database server's port, but this is generally considered a bad idea.
I think you probably want to work at a higher level and pass around objects rather than database updates. Let's say, for example, you have a Person object in your application. You can pass that between your web application and your client application, and let the web application worry about putting it in the database, deleting it, etc. Although there is nothing technically wrong with performing updates in the way you are suggesting, I've not seen it done that way for many years.
Background
I am working on a future multi-tenant web application that will need to support thousands of users. The app is being built on top of the Java-based Play! MVC framework using JPA/Hibernate and PostgreSQL.
I watched Guy Naor's talk on Writing Multi-tenant Applications in Rails in which he talks about a few approaches to multi-tenancy (data isolation decreases as you go down the list):
Each customer has a separate database
One database with separate schemas and tables (table namespaces) for each customer.
One database with 1 set of tables with customer id columns.
I settled on approach #2, where a user id of some sort is parsed out of a request and then used to access that user's schema. A Postgres SET search_path TO customer_schema,public command is issued before any query is made, to make sure the customer's tables are the target of the query. This is easily done with @Before annotations on controller methods in Play! (this is the approach Guy used in his Rails example). The search_path in Postgres acts exactly like $PATH does in an OS; awesome!
All this sounded great, but I immediately ran into difficulties in implementing it on top of a JDBC/Hibernate/JPA stack because there doesn't seem to be a way to dynamically switch schemas at runtime.
The Problem
How do I get either JDBC or Hibernate to support dynamically switching postgres schemas at runtime?
It seems database connections are statically configured by a connection factory (see: How to manage many schemas on one database using hibernate). I have found similar questions with similar answers suggesting a SessionFactory per user, but since a SessionFactory is, as I understand it, a heavyweight object, it seems implausible that you could support hundreds of users, let alone thousands, going this route.
I haven't committed myself completely to approach #2 above, but I haven't quite abandoned it for approach #3 either.
You can execute the command
SET search_path TO customer_schema,public
as often as you need to, within the same connection / session / transaction. It is just another command like SELECT 1;. More in the manual here.
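From Java this is just a statement executed on the current connection before the request's queries run; a minimal sketch (the schema name must come from a trusted/whitelisted source before being concatenated into the statement):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    public final class SchemaSwitcher {

        private SchemaSwitcher() {}

        // Call this at the start of each request, e.g. from a @Before-style interceptor.
        public static void useSchema(Connection connection, String customerSchema) throws SQLException {
            try (Statement stmt = connection.createStatement()) {
                // Unqualified table names now resolve against customer_schema first, then public.
                stmt.execute("SET search_path TO " + customerSchema + ", public");
            }
        }
    }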
Of course, you can also preset the search_path per user.
ALTER ROLE foo SET search_path=foo, public;
If every user or many of them have a schema that matches their user name, you can simply go with the default setting in postgresql.conf:
search_path="$user",public;
More ways to set the search_path here:
How does the search_path influence identifier resolution and the "current schema"
As of Hibernate 4.0, multi-tenancy is natively supported at the discriminator (customerID), schema, and database level. See the source code here, and the unit test here.
The difficulty is that, while the unit test's file name is SchemaBasedMultitenancyTest, the actual MultitenancyStrategy used is Database. I can't find any examples on how to make it work based on schema, but maybe the unit test will be enough to go on...
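For what it is worth, a schema-based provider usually boils down to handing Hibernate a connection whose search_path points at the tenant's schema. A rough sketch (interface and package names are as in Hibernate 5, so they may need adjusting for 4.x; the tenant identifier must be a trusted schema name):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;
    import javax.sql.DataSource;
    import org.hibernate.engine.jdbc.connections.spi.MultiTenantConnectionProvider;

    public class SchemaPerTenantConnectionProvider implements MultiTenantConnectionProvider {

        private final DataSource dataSource; // one shared pool for all tenants

        public SchemaPerTenantConnectionProvider(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public Connection getAnyConnection() throws SQLException {
            return dataSource.getConnection();
        }

        @Override
        public void releaseAnyConnection(Connection connection) throws SQLException {
            connection.close();
        }

        @Override
        public Connection getConnection(String tenantIdentifier) throws SQLException {
            Connection connection = getAnyConnection();
            // Point the connection at the tenant's schema before Hibernate uses it.
            try (Statement stmt = connection.createStatement()) {
                stmt.execute("SET search_path TO " + tenantIdentifier + ", public");
            }
            return connection;
        }

        @Override
        public void releaseConnection(String tenantIdentifier, Connection connection) throws SQLException {
            // Reset so the pooled connection goes back clean.
            try (Statement stmt = connection.createStatement()) {
                stmt.execute("SET search_path TO public");
            }
            connection.close();
        }

        @Override
        public boolean supportsAggressiveRelease() {
            return false;
        }

        @Override
        public boolean isUnwrappableAs(Class unwrapType) {
            return false;
        }

        @Override
        public <T> T unwrap(Class<T> unwrapType) {
            return null;
        }
    }

Hibernate also needs a CurrentTenantIdentifierResolver that returns the schema for the current request, plus the hibernate.multiTenancy=SCHEMA setting.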
Sharding by schema is common, but see this post from the Apartment gem authors covering some of its drawbacks.
At Citus, we shard via option #3 listed above, and you can read more in our use-case guide in the Documentation.
My application pulls a large amount of data from an external source - normally across the internet - and stores it locally on the application server.
Currently, as a user starts a new project we aggressively try to pull the data from the external source based on the order that we predict the user will want to access it. This process can take 2 - 3 hours.
It seems like a smarter approach here is to provide access to the data in a lazy-loading fashion, e.g. if a user wants to access entity A, try to grab it from our database first. If it's not yet there, fetch it from the remote source and populate the database at the same time.
This, combined with continuing to populate the database in the background, would give a much slicker experience for the user.
Are there frameworks which manage this level of abstraction? (My application is in Java).
There are several considerations here. For instance, my database currently enforces relational integrity, something that might have to be turned off to facilitate this lazy-loading approach. Concurrency also seems like it would cause problems here.
Also, it seems like entities and collections could exist in a partially populated state - this requires additional schema data to distinguish the complete from the partially populated.
As I understand it, this is just an aggregated repository pattern. Is that correct, or is there a more appropriate pattern I should study?
Have you tried JPA/Hibernate? This seems easily possible in Hibernate.
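A rough sketch of the read-through idea from the question (EntityA, LocalEntityARepository and RemoteSource are made-up names; with Hibernate/JPA the repository would be a DAO or Spring Data repository):

    import java.util.Optional;

    public class EntityAService {

        private final LocalEntityARepository localRepository;
        private final RemoteSource remoteSource;

        public EntityAService(LocalEntityARepository localRepository, RemoteSource remoteSource) {
            this.localRepository = localRepository;
            this.remoteSource = remoteSource;
        }

        public EntityA find(long id) {
            // 1. Try the local database first.
            Optional<EntityA> local = localRepository.findById(id);
            if (local.isPresent()) {
                return local.get();
            }
            // 2. Not there yet: fetch from the remote source and cache it locally.
            EntityA fetched = remoteSource.fetch(id);
            localRepository.save(fetched); // flag any partially-populated state here if needed
            return fetched;
        }
    }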