Spring Batch multiple different sources reading pattern


I am trying to find the best design pattern for reading from several different sources, manipulating the data and writing it to a file.
I have an LDAP server with one branch containing Users and another branch containing Services. A User can have multiple Services (represented by a multi-valued field on the User), and I need the Service details for each User. I also need to log how many Users and Services were found in the LDAP. First question: is there an LDAP reader available in the Spring Batch infrastructure?
I also have a database containing extra user information in multiple tables (e.g. user extras, options, ...). An LDAP user may or may not have extra information in the database (matched by user id).
I also need to log how many extra-information records were found in the database, and log orphan records (those not related to any LDAP user).
Finally, I need to merge these different data into one file with a header (merging the LDAP and DB information).
What is the best pattern for this? (The reader/processor/writer pattern seems inappropriate for summarizing results, like driving-query-based ItemReaders...)

Related

Secure MySQL connection in Java?

I tried this:
Connection con = DriverManager.getConnection("jdbc:mysql://localhost:3306/sonoo","root","password");
but it is very easy for someone to extract the username and password strings.
Anyone can open the application with zip, WinRAR or a similar program and read the code.
How can I secure my connection?

You need to decide what permissions someone who gets a copy of your JAR has. Do they have permission to run database queries or not?
If they should not: delete the database connection. They don't have permission.
If they should: then they can have the password. They have permission.
What seems to be tripping you up is that you are giving out the root password for your database, and so you want the third option: "They should be able to do some database queries, but not others."
The JAR file is the wrong place to try to solve that problem. If you try to solve this at the JAR file level, one of two things will happen. Either your users were trustworthy all along and you wasted your time with whatever elaborate scheme you used, or some of your end-users are untrustworthy and one of them will hack you. They will hack you by stepping it through the JVM and editing your query strings right before the JVM sends them out, at the very last second, if they absolutely have to. Everything you do at this level will be security theater, like getting frisked at the airport, it doesn't make you significantly safer but there is a tiny chance that you can say "but we encrypted it!" and your clients might not dump you after the inevitable security breach.
That problem needs to be solved within the database, by creating a user account which does not have the permissions that they should not have. When you do SHOW GRANTS FOR enduser@'%' it will show you only the sorts of queries that they are allowed to do.
In many cases you want to give the user account a more fine-grained permission than just INSERT, SELECT, or UPDATE on a table. For example, you might have the logic "you can add to this table, but only if you also update the numbers in this other table." For these, you should use stored procedures, which can have their permissions set to either "definer" or "invoker": define it by a user with the appropriate permissions and then the invoker gets to have advanced permissions to do this particular query.
In some cases you have a nasty situation where you want to distribute the same application to two different clients, but they would both benefit significantly (at the expense of the other!) from being able to read each other's data. For example you might be an order processor dealing with two rival companies; either one would love to see the order history of the other one. For these cases you have a few options:
1- Move even SELECT statements into stored procedures. A stored procedure can call user(), which still gives you the logged-in user even though they are not the definer.
2- Move the database queries out of the shared JAR. As @g-lulu suggests, you can create a web API which you lock down really well, or something like that.
3- Duplicate the database, and move the authentication parameters to a separate file which you read on startup.
Option 3 requires you to write tooling to maintain multiple databases as perfect duplicates of each other's structure, which sucks. However it has the nice benefit over (1) and (2) that a shared database inevitably leaks some information -- for example an auto_increment ID column could leak how many orders are being created globally and there might be ways to determine something like, "oh, they send all of their orders through this unusual table access, so that will also bump this ID at the same time, so I just need to check to see if both IDs are bumped and that'll reveal an order for our rival company".
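The separate credentials file mentioned above could be read along these lines. This is only a minimal sketch: the DbSettings class and the key names db.url, db.user and db.password are hypothetical, and it assumes the file uses standard java.util.Properties syntax.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical holder for connection settings kept outside the JAR.
final class DbSettings {
    final String url;
    final String user;
    final String password;

    private DbSettings(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Reads the settings file once at startup; the path is chosen at deployment time.
    static DbSettings load(Path file) {
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return parse(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses java.util.Properties syntax; the key names are assumptions of this sketch.
    static DbSettings parse(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DbSettings(
                require(props, "db.url"),
                require(props, "db.user"),
                require(props, "db.password"));
    }

    // Fails fast when a setting is absent instead of connecting with a null value.
    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("Missing setting: " + key);
        }
        return value;
    }
}
```

The loaded values would then be passed to DriverManager.getConnection(settings.url, settings.user, settings.password). The account in the file should be the restricted one, not root: the file keeps each client's credentials separate, but it does not hide them from that client.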

You can create a web service in PHP (or Java, or another language). The web service is hosted on a server and holds the database access and the queries.
From your desktop app, just send a request (POST or GET) to the web service.
Example of a PHP web service:
if (isset($_POST['getMember'])) {
    // run the query against your database
    // put the result into JSON
    // return the JSON
}

Retrieve information for the same DTO from two different databases

I tried to make this as simple as possible with a short example.
We have two databases, one in MSSQLServer and the other in Progress.
We have the following user DTO, which we show in a UI table within a web application.
User
int, id
String, name
String, accountNumber
String, street
String, city
String, country
Now, this DTO (entity) is not stored in a single database: some fields for a given user are stored in one database and some in the other.
MSsql
Table user
int, id
String, name
String, accountNumber
Progress
Table userModel
int, id
String, street
String, city
String, country
As you can see, the key is the only thing linking the two tables; as I said before, they are not in the same database and do not use the same database vendor.
We have a requirement to sort the UI table by each column. Obviously we need to build the user DTO from information coming from both databases.
Our proposal at the moment is this: if the user wants to sort by the street field, we run a query in the Progress database and obtain a page (using pagination), then go to the MSSQLServer user table with the keys from that result set and run another query to extract the missing information, save it to our DTO and transfer it to the UI. This implies running a query in one database, then another query in the second database based on the returned keys.
The order in which the databases are queried could change depending on which column (field) the user wants to sort by.
Technically, we will create a JpaRepository that acts as a facade and, depending on the field, runs the process against the correct database.
My question is:
Is there a pattern commonly used in these scenarios? We are using Spring, so perhaps Spring has some out-of-the-box features to support this requirement. It would be great if this were possible using JpaRepositories (I have several doubts about it, as we will use two different EntityManagers, one for each database).
Note: Moving data from one database to another is not an option.

For this, you need to have separate DataSource/EntityManagerFactory/JpaRepository.
There is no out-of-the-box support for this architecture in the Spring framework, but you can easily hide the dual DataSource pair behind a Service layer. You can even configure JTA DataSources for ACID operations.
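To illustrate what such a Service layer might look like, here is a minimal plain-Java sketch. AccountSource, AddressSource and UserPageService are hypothetical names; in a Spring application each interface would be backed by a repository bound to its own EntityManagerFactory. The sketch only covers sorting by street, where the address side drives the page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a repository on the MSSQLServer "user" table.
interface AccountSource {
    Map<Integer, String> namesByIds(List<Integer> ids);
}

// Hypothetical stand-in for a repository on the database that holds street, city and country.
interface AddressSource {
    // Returns one page of user ids, already ordered by street.
    List<Integer> idsOrderedByStreet(int offset, int limit);

    Map<Integer, String> streetsByIds(List<Integer> ids);
}

// Service layer that hides both sources from the caller.
final class UserPageService {
    private final AccountSource accounts;
    private final AddressSource addresses;

    UserPageService(AccountSource accounts, AddressSource addresses) {
        this.accounts = accounts;
        this.addresses = addresses;
    }

    // The address side decides order and paging; the account side only fills in the gaps.
    List<String> pageSortedByStreet(int offset, int limit) {
        List<Integer> ids = addresses.idsOrderedByStreet(offset, limit);
        Map<Integer, String> streets = addresses.streetsByIds(ids);
        Map<Integer, String> names = accounts.namesByIds(ids);
        List<String> rows = new ArrayList<>();
        for (Integer id : ids) {
            // Iterating over ids preserves the order returned by the driving query.
            rows.add(id + " | " + names.get(id) + " | " + streets.get(id));
        }
        return rows;
    }
}
```

Sorting by a column stored on the account side would mirror this method with the roles of the two sources swapped.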

As you will always need to fetch data from both databases, why not populate local Java User objects and then sort these objects (using a comparator on the fields you want to sort by)?
The advantage of sorting locally vs doing the sort in the database query is that you won't have to send requests to the database every time you change the sorting field.
So, to summarize:
1- Issue two SQL queries, one per database, to get your users
2- Build your User objects using the retrieved values
3- Use Java comparators to sort the users on any field without having to issue new queries to the database.
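The three steps above can be sketched in plain Java. This is a minimal illustration that assumes both query results are already merged in memory and that no field is null; MergedUser and UserSorting are hypothetical names, and the fields follow the DTO from the question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical merged DTO holding the fields from both databases.
final class MergedUser {
    final int id;
    final String name;
    final String accountNumber;
    final String street;
    final String city;
    final String country;

    MergedUser(int id, String name, String accountNumber,
               String street, String city, String country) {
        this.id = id;
        this.name = name;
        this.accountNumber = accountNumber;
        this.street = street;
        this.city = city;
        this.country = country;
    }
}

final class UserSorting {
    // Maps the column chosen in the UI to a comparator on the matching field.
    static Comparator<MergedUser> comparatorFor(String field) {
        switch (field) {
            case "id": return Comparator.comparingInt(u -> u.id);
            case "name": return Comparator.comparing(u -> u.name);
            case "accountNumber": return Comparator.comparing(u -> u.accountNumber);
            case "street": return Comparator.comparing(u -> u.street);
            case "city": return Comparator.comparing(u -> u.city);
            case "country": return Comparator.comparing(u -> u.country);
            default: throw new IllegalArgumentException("Unknown sort field: " + field);
        }
    }

    // Sorts a copy so the list fetched from the databases stays untouched.
    static List<MergedUser> sortedBy(List<MergedUser> users, String field) {
        List<MergedUser> copy = new ArrayList<>(users);
        copy.sort(comparatorFor(field));
        return copy;
    }
}
```

Changing the sort column then only means calling UserSorting.sortedBy again with another field name. This keeps every user in memory, so it suits tables small enough to load in full.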

My advice would be to find a way to link 2 databases together so that you can utilize database driver features without your code being affected.
Essentially if Progress database can be linked to SQL Server, you will be able to query both databases using a single SQL query with a join on id column and you will get a merged, sorted and paginated result set for your application to display.
I am not an expert in Progress database but it seems there is an ODBC driver for it so you might try to link it to SQL Server.

Splitting MySQL Database into separate databases

I have a requirement: the MySQL database used in my application is scaling very aggressively. I am currently in no position to migrate to a NoSQL database.
I have identified the following areas where I could try splitting the current database into multiple databases:
There are some tables which have static content, i.e. it rarely changes.
There are user tables which store user data on each interaction; this data changes drastically.
Now, if I split the database into two different databases, how will I handle transactions? How will I write the Data Access Layer? Will I have connections to both databases? The application currently uses Spring and Hibernate for the back end. There are calls which join the user tables and the content tables in the current schema.
The architecture has the following structure:
Controller -> Service -> DAO Layer.
So, if I am willing to refactor the DAO layer which communicates with the database, what approach should I follow? I only know Hibernate ORM, but I would be willing to let it go if there is something better than Hibernate.

Multiple databases on the same server? That approach will probably not improve performance on its own. RAM, fast disks, optimization, partitioning, and correct indexing will have a far greater payback.
If you have multiple databases on one server you can connect to them with a single connection, and simply use the database names with the table names in your SQL. Transactions work fine within a single connection.
Transactions across multiple connections and multiple servers are harder. There's a feature in MySQL called XA transactions to help handle this. But it has plenty of overhead, and is therefore most useful for high-value transactions as in banking.
In the jargon of the trade, adding servers is called "scale-out." The alternative is "scale-up," in which you add more RAM, faster direct-access storage, optimization, and other stuff to a single server to get it to do more.
There are several approaches you can take to the scale-out problem. The classic one is to use MySQL to set up a single primary server with multiple load-balanced replica servers. That's probably the path that's most often taken, so you can do it without reinventing a lot of wheels. In this solution you do all your writing to a single instance. Queries that look up data can use multiple read-only load-balanced instances.
http://dev.mysql.com/doc/refman/5.5/en/replication-solutions-scaleout.html
This is a very popular approach where you have a mix of long-running reporting queries and short-running interactive queries. The reporting can be run on dedicated replica servers.
Another approach is multiple-primary-server replication using MySQL Cluster. https://dev.mysql.com/doc/refman/5.5/en/mysql-cluster-replication-multi-master.html
Another approach, if you have money to spend, is to go with a supported MySQL Cluster. Oracle, MariaDB, and Percona have such products on offer.
Scale-out is a big job no matter how you approach it. There's some documented experience from other people who have done it. For example, https://www.facebook.com/note.php?note_id=23844338919

It sounds like you have not thought about how to partition your database.
You should read something about database normalization first.
To split the database, I would export the SQL from the database, then make two new files into which I copy the tables I want in each database. After that I would import the two files into their respective databases.
I think this might help you help me: let's say I want to print reports for a user. The user is persisted in the 'user' table, and there is a score table which holds the user's score for every user_id. Now, my plan is to put the user table in one database and the score table in another database, making them two data sources. How can I handle such a scenario?
First, putting the tables in different databases makes no sense to me, and I do not know whether it is possible to run select queries that mix two different databases.
Example: SELECT score, name FROM user, score WHERE score > 100 AND (score.user_id = user.user_id);
I don't know whether this works across two databases; I think not.

Redis and DB communication

I am developing a photo album system and decided to use Redis. I keep each user's photo data (who has which photos) in Redis. For example, photos:1000:pid [1,24,525,12,42,62,56] means the user with id 1000 has the photos whose ids are in the list. What confuses me is this: once I have [1,24,525,12,42,62,56], how do I get the photo details? I thought of using Redis again to get the photo details. However, when a user has 150 photos, getting them one by one (from Java, using Jedis in a loop) costs 100 - 150 msec, which is not suitable for my case. I have to handle high traffic, and a response shouldn't take over 100 msec.
I decided to use the DB with stored procedures, "one shot, get everything", given the photo ids (they are indexed). Is "get ids from Redis, get details from DB" a proper approach? What would you do in this situation?

I would not recommend using two different stores. Keep it simple. Think about the consistency of your data. If you are more familiar with a relational database, there is nothing wrong with using it (for all your data).
Now, if you want to store everything in Redis, it is also possible, provided you can anticipate all access paths to your data.
With Redis, running several commands to get some data is quite efficient if you bundle these commands in the same roundtrip. The Redis server (and most clients) fully support pipelining. Assuming you use Jedis, you can find some examples here.
Actually, there are multiple ways to solve your problem.
Let's suppose you have the following model:
photos:<userid> -> set of photo IDs for a given user ID
photo:<photoid> -> hash of photo properties for a given photo ID
If you are interested in retrieving specific photo properties (say name and size) for a given user (i.e. like a select name, size from ...), it can be done using a single SORT command. Each returned pattern needs its own GET keyword:
SORT photos:<userid> BY nosort GET # GET photo:*->name GET photo:*->size
If you are interested in retrieving all the photo properties for a given user (i.e. like a select * from ...), it is a bit more complex.
One solution is to use pipelining and perform two roundtrips:
first roundtrip to get the set of photo IDs (using SMEMBERS)
second roundtrip to pipeline all the HGETALL commands (one per photo)
An alternative solution would be to use server-side Lua scripting to perform all the aggregation on server side. Complexity is higher, but the cost would be a single roundtrip.

How to filter a richfaces dataTable according to user role

I am working on a Seam project (Seam 2) with two types of user roles. Normal users, and users with sensitive information privileges. The latter have access to a set of database records marked "sensitive" that coexist with normal records, and are marked by a particular column value.
I have used #{s:hasRole('SENSITIVE')} to hide other portions of the UI as appropriate, but I would like to filter the actual richfaces dataTable in which the records are displayed, so that sensitive records do not appear for normal users. Is there a way to do this at the presentation layer, or do I need to filter the rows on the server based on user role?

Did you try the filter-related properties of dataTable? Please look at the properties there:
http://livedemo.exadel.com/richfaces-demo/richfaces/dataTable.jsf?tab=info&cid=147
