Data integration services between Java system and SQL Server

I am currently architecting some integration services for a web application. External Java applications produce a data feed; the data is massaged as necessary and then inserted into a SQL Server database. The data is managed there and used as the basis for WCF and HTTP REST services, which are accessed by web applications, mobile devices, etc.
This is the current setup. I am in the process of modifying it because we have issues with the integration between the Java system and the SQL Server database. The main issue is the standard of the data we receive; it can have missing fields, etc. The current integration is a comma-separated file placed on an FTP server; the file is picked up and processed, the data is massaged and then inserted into SQL Server. Where we are currently getting "burned" is that data is inserted into the SQL Server database even though its quality is not up to the necessary standard.
So this process is being changed, and I am looking for options to both modernize it and make the integration services more robust. I would welcome both suggestions and recommendations to improve the above.
Some options that spring to mind are:
Expose a WCF service that the Java system calls; data is passed to it via SOAP and validated in the service before being inserted into SQL Server
Move the supplied data format from a comma-separated file to an XML file, and validate the XML file against a schema before the data is massaged
Any other suggestions?

Neither of your solutions is going to solve your data quality problem at its source. I'd look more critically at the applications producing the data and put validation there, in addition to validating before the INSERT into the database. You want to validate prior to INSERT because you should never trust clients, but clients ought to honor a contract when they send you data.
One advantage the web service offers that the others don't is the possibility of real-time INSERTs into the database. Let the source applications send their requests to this broker service; it validates requests and inserts them in real time. No more batch.
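If the XML-plus-schema option is chosen, the validation step is cheap to wire up before the INSERT. A minimal sketch using the JDK's built-in javax.xml.validation API (the schema and feed file names are placeholders):

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.File;

public class FeedValidator {
    public static void main(String[] args) throws Exception {
        // Compile the XSD once; the Schema object is thread-safe and can be reused.
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new File("feed.xsd"));        // hypothetical schema file

        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new File("feed.xml"))); // hypothetical feed file
            // Only reached if the document conforms to the schema:
            // hand the data to the massaging/INSERT step here.
        } catch (org.xml.sax.SAXException e) {
            // Reject the feed (or the offending record) and report back to the producer.
            System.err.println("Feed rejected: " + e.getMessage());
        }
    }
}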

Related

Migrating to cloud and multi-tenant DB

We have a web-based Java application which we are planning to migrate to the cloud, with the intention that multiple clients will use it in a SaaS environment. The current architecture of the application is quite asynchronous in nature. There are four different modules, each with a database of its own. When there is a need for data exchange between the modules, we push the data using Pentaho and use a directory structure to store the interim data file, which is then picked up by the other module to populate its database. Given the nature of our application, this asynchronous communication is very important for us.
Now we are facing a couple of challenges while migrating this application to cloud:
We are planning to use multi-tenancy on our database server, but how do we ensure that the flat files we use for transferring data between the different modules are also channelled to their respective tenants in the DB?
Since we are planning to host this in the cloud, we would like your views on whether keeping a text file on a cloud server is safe from a data-security perspective.
File storage in the cloud is safe, and you can set up IAM roles to control the permissions on a file. Cloud providers like Google (Cloud Storage), Amazon (AWS S3), etc. provide a secure and scalable infrastructure for maintaining files in the cloud.
In a typical setup, cloud storage provides you with buckets that are tagged with a globally unique identifier. For a multi-tenant setup you can create a bucket per tenant and store the necessary data feeds in it. Next, you can have batch or streaming jobs using Kettle (Pentaho) push the data to the right database based on the unique bucket definition.
Alternatively, you can also push the data (as other answers suggest) to a streaming setup (ActiveMQ, Kafka, etc.) with tenant-specific topics and have a streaming service (written in Java or Pentaho) ingest the data into the respective database based on the topic.
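If you go the Kafka route, a topic-per-tenant producer is only a few lines. A rough sketch (broker address, topic naming convention and payload are all placeholders, not a prescription):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TenantFeedPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-broker:9092");   // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String tenantId = "tenant-42";                      // hypothetical tenant identifier
            String payload = "...interim data row...";          // whatever Pentaho currently writes to file
            // One topic per tenant keeps the streams separated end to end.
            producer.send(new ProducerRecord<>(tenantId + ".module-feed", payload));
        }
    }
}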
Hope this helps :)
I cannot realistically give any specific advice without knowing more about your system. However, based on my experience, I would recommend switching to message queues; something like Kafka would work nicely.
Yes, cloud providers offer enough security for static file storage. You can limit access however you see fit, for example using AWS S3.
1. Multi-tenancy may create some issues while transferring the files, but from the information you have given, the process of moving flat files across applications will not be impacted. Still, you could consider moving to an MQ-based mode for passing the data across.
2. From a data-security view, AWS provides a lot of features at the access level, MFA, etc. If it needs to be highly secure, I would recommend an AWS private cloud setup where nothing is shared with anyone at any level.

How to best decouple the database from the application?

We have a command and control system which persists historical data in a database. We'd like to make the system independent of the database: if the database is there, great, we will persist data there; if it is not, we will fall back to storing data in files and memory until the database is back. The command and control functionality must be able to continue uninterrupted through the loss or restoration of the database; it should not even know the database exists. So the database and DAO functionality need to be decoupled from the rest of the application.
We are using RESTful service calls, the Spring framework, ActiveMQ, and JdbcTemplate with a SQL Server database, currently following standard connection practices using a Hikari datasource and the jTDS driver. The problem is that if the database goes down or the database connection is lost, we start to have data issues, because too many service calls (mainly the getters) still depend on the database being present. This dependence is what we'd like to eliminate.
What are the best practices/technologies for totally decoupling the database from the application? We are considering using AMQ to broadcast data updates, having the DAO listen for those messages, and then writing the update to the database if it is available or to flat files as a backup. Then, for the getters, we would provide replies based on what is available, either from the actual database or from the short-term backup.
My team has little experience with this and we want to know what others have done that works well.
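A minimal sketch of the AMQ-listener-with-file-fallback idea described above, assuming Spring's @JmsListener and JdbcTemplate (the destination name, table and backlog file are placeholders):

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import org.springframework.dao.DataAccessException;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;

@Component
public class HistoryDataListener {

    private final JdbcTemplate jdbcTemplate;

    public HistoryDataListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @JmsListener(destination = "history.updates")              // hypothetical AMQ destination
    public void onUpdate(String message) {
        try {
            // Normal path: the database is up, persist the update immediately.
            jdbcTemplate.update("INSERT INTO history (payload) VALUES (?)", message);
        } catch (DataAccessException dbDown) {
            // Fallback path: append to a local file to be replayed when the DB returns.
            try {
                Files.write(Paths.get("history-backlog.log"),   // hypothetical backlog file
                        (message + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (java.io.IOException io) {
                throw new IllegalStateException("Both database and file fallback failed", io);
            }
        }
    }
}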

How to fetch real-time changing data into a local server?

We have to develop a local server which will load itself with the real-time data of an industrial plant (particularly time-stamped data points such as boiler temperature, pressure values, etc.). The data is stored on an industrial server, and we want to fetch it and populate our server with it. The data is not streamed at the server end, so how do we fetch it continuously and populate our server?
We would like to store only the past 2-3 days of history data as time advances. Any recommendations about the server and the back-end process to be used to fetch the data are welcome; we don't have any idea where to start.
Please help.
As others have stated, you need to provide more information on how you intend to populate your server. What API do you have for the "real-time server"?
I worked on a management system for solar energy devices (i.e., devices that produce electricity from solar energy; they are called photovoltaic cells, if I remember correctly). In my case these devices exposed FTP access, which provided me with files containing time-based information.
I constructed a Java server that used the following technologies:
A. Apache Tomcat web container - this web container allowed me, on the one hand, to hold the Java logic and, on the other hand, to expose an HTTP-based interface to the customer.
The Java logic was located in a servlet, which exposes methods to handle HTTP requests (and allows writing returned data using response objects).
B. The servlet has an init method; I used it to perform some initialization, such as starting a Quartz periodic task to probe the FTP servers of the devices (a minimal sketch of this appears at the end of this answer).
C. I used a database (PostgreSQL, an open source database) to store configuration for the application and also to store results.
D. I used another periodic task to archive old data into an archiving table, so the main data table would hold relatively new data.
I ran the archiving task once every few days; it simply checked for records that were "too" old, inserted them into the archiving table, and deleted them from the main data table. To perform this efficiently I decided to use a function that I coded on the database side.
E. To access the database from the application, I used the Hibernate object-relational mapping technology.
This technology allowed me to define mappings between tables and their relations to Java objects, and gave me generated create, read (by id), update and delete SQL statements.
Using the HQL query language, I wrote some more complex queries.
F. For the presentation/client side I used plain JSP.
You may choose other alternatives such as GWT, Apache Wicket, or JSF.
You may consider using some MVC framework to have some separation between the logic and the presentation. Such frameworks include Spring MVC, Struts, and many others.
To conclude, you must understand that Java offers you a variety of technologies; you must define your requirements well, and then start investigating which technology can best meet your needs.
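For item B above, the Quartz piece can be as small as one job class plus a trigger. A minimal sketch (class names, the job identity and the 60-second interval are assumptions, not part of the original system):

import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

// The job that probes the device FTP servers for new data files.
public class FtpPollJob implements Job {
    @Override
    public void execute(JobExecutionContext context) {
        // Connect to the FTP server, download new time-stamped files,
        // parse them and insert the readings into the database.
    }
}

// Typically started from the servlet's init() method:
class SchedulerBootstrap {
    static void start() throws org.quartz.SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        JobDetail job = JobBuilder.newJob(FtpPollJob.class).withIdentity("ftpPoll").build();
        Trigger trigger = TriggerBuilder.newTrigger()
                .startNow()
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInSeconds(60)
                        .repeatForever())
                .build();
        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}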

Making a web service in PHP to give information to Java

Following my question regarding connecting to a MySQL database in Java, I am looking to create a web service in PHP. My Java program needs to ask the web service to gather some data from the MySQL database and send the result back. However, I have a few dilemmas:
Firstly, my web hosts do not support Java, and therefore the server side needs to be written in PHP but the client needs to be written in Java.
Secondly, all the tutorials I have found seem to involve creating a whole web service project in order for my Java program to communicate with the web service, whereas realistically only a couple of classes need to contact the PHP web service.
And, as you may have already guessed, I don't know anything about web services. It was just suggested that I use one in order to get around the GPL licence of the JDBC driver...
I realise that similar questions may have been asked here before, but as I am a complete novice, the posts I saw here did not contain enough information for me, and I require as much help as I can get - almost a step-by-step guide!
Alternatively, I did think about just using standard PHP sockets, as I am pretty sure I know how to use them. However, I don't know how secure they are, and I didn't want to take any risks because I will need to retrieve information such as licence keys!
Thanks in Advance
You don't need to use PHP sockets; all you need is a simple PHP script on your web host that fetches the data you need from the MySQL DB and outputs it to be read by your Java client.
Your PHP script will need:
To retrieve any query parameters from the Java client (probably via $_POST or $_GET).
Information to connect to MySQL (hostname/IP address, db name, username, password).
To run the SQL query/queries that grab the data from the database.
To output the data for the Java client to read, in some mutually acceptable format, such as XML, JSON, HTML, etc.
You would structure the script something like this:
<?php
// 1. Read and validate input parameters
$myquery_val = isset($_POST['queryval']) ? $_POST['queryval'] : '';
// 2. Connect to MySQL via PDO (replace host/db/credentials with your own)
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'dbuser', 'dbpass');
// 3. Fetch MySQL data with a prepared statement (example table/column names)
$stmt = $pdo->prepare('SELECT licence_key, expires FROM licences WHERE customer = ?');
$stmt->execute(array($myquery_val));
// 4. Output the data as JSON for the Java client to parse
header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));
?>
To learn how to connect to MySQL and retrieve data, read up on MySQL PDO: http://php.net/manual/en/ref.pdo-mysql.php
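On the Java side, consuming that script's output does not need any framework; a minimal sketch using the JDK's HttpURLConnection (the URL and the parameter value are hypothetical):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LicenceClient {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://example.com/licence.php");     // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);

        // Send the query parameter the PHP script expects in $_POST['queryval'].
        try (OutputStream out = conn.getOutputStream()) {
            out.write("queryval=customer42".getBytes(StandardCharsets.UTF_8));
        }

        // Read back the JSON (or XML) body the script echoes.
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        }
        System.out.println("Service returned: " + body);
    }
}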
What I would do is use an agnostic form of communication between your PHP service and the Java client. My weapon of choice is XML.
The steps would be:
Create the PHP classes which will interact with your database and get the data you want to work with. GitHub has plenty of examples and source code. Sample PHP-MySQL Database Abstraction Layer
Create a RESTful PHP service which takes the data from step 1 and exposes it as an XML REST service. Check out the Recess framework, an easy-to-use REST framework.
Create your Java client; it should just need to be able to work with HTTP and consume XML. No need for a huge SOAP or other framework.

Most Efficient Way of calling an external web service in Java?

In one of our applications we need to call the Yahoo SOAP web service to get weather and other related info.
I used the wsdl2java tool from Axis 1.4, generated the required stubs, and wrote a client. I use a JSP useBean to include the client bean and call methods defined in the client, which in turn call the Yahoo web service.
Now the problem: when users make calls to the JSP, the response time of the web service differs greatly; for one user it took less than 10 seconds, while another user on the same network took more than a minute.
I was just wondering if Axis 1.4 queues the requests even though the JSPs are multithreaded.
And finally, is there an efficient way of calling the web service (Yahoo Weather)? Typically I get around 200 simultaneous requests from my users.
Why don't you schedule one thread to get the weather every minute or so, and expose that to the JSPs, instead of letting each JSP get its own weather report?
That's a lot more efficient for both you and Yahoo, and the JSPs only need to look up a local object (almost instantaneous) instead of connecting to a web service.
EDIT
Some new requirements in the comments on this answer suggest a different way of choosing solutions.
It seems that not only weather (which doesn't change that often and is the same for every user) is requested via web service, but also other data such as flight data.
The requirements for flight-data retrieval are very different from those for weather data, so I think you should define a few types of (remote) data and choose a different solution for each category.
As basis for the requirements I'd use something simple:
Users like their information promptly; they do not like waiting
The amount of data stored on the web server is finite
Remote web services have an EULA of sorts and are probably not happy with 200 concurrent requests for the same data from the same source (you)
Fast data access for users is best achieved by having the data locally, be it transient (kept in a bean) or persistent (a local database). That can be done by periodically requesting data from the remote source and using the cached data in the JSP. That would also keep you in the clear with respect to the third point.
A finite amount of data stored on the web server means that not everything can be cached. Data which differs per user, or large data sets which can vary over small periods of time, cannot readily be cached. It's not really a good idea to load data on all flights of all airports in the US every minute or so; that kind of request would be better served by running a specific web service query when necessary.
The trick now is to identify when caching data is feasible. If it is feasible, do that; otherwise run the web service query in the background. That can be done by presenting the JSP immediately and starting the web service query in the background. The JSP can have an AJAX script which asks your web server whether the data is ready and inserts that data into the page when it is.
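One way to back that AJAX polling on the server, sketched with a plain servlet and an ExecutorService (class, parameter and method names here are made up for the example):

import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class FlightDataServlet extends HttpServlet {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final ConcurrentHashMap<String, Future<String>> pending = new ConcurrentHashMap<>();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String query = req.getParameter("q");
        // Start the remote web service call in the background the first time this query is seen.
        Future<String> result = pending.computeIfAbsent(query,
                q -> pool.submit(() -> callRemoteWebService(q)));

        if (result.isDone()) {
            try {
                resp.getWriter().write(result.get());   // data ready: return it to the AJAX caller
            } catch (Exception e) {
                throw new IOException(e);
            }
        } else {
            resp.getWriter().write("PENDING");          // the AJAX script retries in a few seconds
        }
    }

    private String callRemoteWebService(String query) {
        // Placeholder for the actual (slow) SOAP/REST call.
        return "{\"query\":\"" + query + "\",\"status\":\"ok\"}";
    }
}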
I'd use Google tools to monitor how long the call to the web service is taking.
There are several things going on here:
Map Java beans to XML request.
Send XML request to web service.
Unmarshal XML request on the web service side.
Web service processes request.
Web service marshals XML response.
Web service sends XML response to Java client.
Unmarshal XML response and display on client.
You can't see inside the Yahoo web service, but do break out what you can see on the client side to see where the time is spent.
Check memory as well. If Axis is generating .class files, maybe your perm space is being consumed. VisualVM is available to you with the JDK; attach it to the PID of your client to see what's going on in memory on your app server.
Maybe this would be a good place for an AJAX call. This will be a good solution if you can get the weather in the background while users are doing other things.
I would recommend local caching and data pooling. Instead of sending out 200 separate requests for similar or identical locations, run a background thread which pulls the weather for only the locations your users are interested in and caches it locally; this cache updates every minute or so. When users request their personal preferences, the requests hit the cache and refetch only if the location is new or the data in the cache is stale. This way the user has a more seamless experience and you will not hit Yahoo throttles and get denied service.
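A minimal sketch of that cache-and-refresh idea (the location list, refresh period and the fetch call are placeholders; the real fetch would go through your generated Axis stub):

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class WeatherCache {
    private final Map<String, String> reports = new ConcurrentHashMap<>();
    private final List<String> locations = Arrays.asList("94089", "10001"); // locations users care about
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // One background request per location per minute, instead of one per user per page view.
        scheduler.scheduleAtFixedRate(() -> {
            for (String location : locations) {
                reports.put(location, fetchFromYahoo(location));
            }
        }, 0, 1, TimeUnit.MINUTES);
    }

    // Called from the JSP / service layer: never blocks on the remote service.
    public String getWeather(String location) {
        return reports.getOrDefault(location, "not cached yet");
    }

    private String fetchFromYahoo(String location) {
        // Placeholder for the actual call to the Yahoo weather service via the Axis stub.
        return "weather for " + location;
    }
}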
