I have a requirement where a large volume (approx. 20,000+ records) of employee details and their respective contact details must be downloaded into one spreadsheet. The current system cannot handle this much data and responds with an internal server error. My application is built with Java + JAX-RS, so I am working with StreamingOutput and binding the data to one spreadsheet. What is the best way to handle such a request? How can I break it into chunks and finally concatenate them into one single file?
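For what it's worth, the usual trick is to never build the whole workbook in memory: page through the records and write them through a streaming workbook while JAX-RS streams the response, so there is nothing to concatenate afterwards; the response stream *is* the single file. A minimal sketch, assuming a recent Apache POI (SXSSFWorkbook) and a hypothetical paged DAO method fetchEmployeePage:

```java
import java.io.OutputStream;
import java.util.Collections;
import java.util.List;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.StreamingOutput;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

@Path("/employees")
public class EmployeeExportResource {

    @GET
    @Path("/export")
    @Produces("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet")
    public Response export() {
        StreamingOutput stream = (OutputStream out) -> {
            // Keep only 100 rows in memory; POI flushes older rows to a temp file.
            SXSSFWorkbook workbook = new SXSSFWorkbook(100);
            try {
                Sheet sheet = workbook.createSheet("Employees");
                int rowNum = 0;
                int offset = 0;
                List<Employee> page;
                // Page through the 20,000+ records instead of loading them all at once.
                while (!(page = fetchEmployeePage(offset, 1000)).isEmpty()) {
                    for (Employee e : page) {
                        Row row = sheet.createRow(rowNum++);
                        row.createCell(0).setCellValue(e.name);
                        row.createCell(1).setCellValue(e.contact);
                    }
                    offset += page.size();
                }
                workbook.write(out); // the finished sheet streams out as one file
            } finally {
                workbook.dispose();  // delete the temp files backing the sheet
            }
        };
        return Response.ok(stream)
                .header("Content-Disposition", "attachment; filename=\"employees.xlsx\"")
                .build();
    }

    // Hypothetical paged lookup; replace with a real LIMIT/OFFSET query.
    private List<Employee> fetchEmployeePage(int offset, int limit) {
        return Collections.emptyList();
    }

    static class Employee {
        String name;
        String contact;
    }
}
```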
My requirement is to get Polarion data and store it in our SQL Server database.
I went through the Polarion SDK documentation, and the web service seems to be the way to do that.
What is the best way to read specific data from Polarion and store it in SQL Server?
The web service is very slow, and depending on your data volume it will not be practical to export your data through it.
However, Polarion stores its data in an SVN repository in the form of small .xml files, so you can read these XML files directly from the repository.
Since Polarion's data is not stored in a database-compatible format, you need to set up your own DB schema; the transformation from the XML files should be straightforward.
You can either check out a complete Polarion project or retrieve the files on demand via HTTP(S); the second approach will again be slightly slower.
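As a rough illustration of the on-demand variant: fetch one work item's .xml file over HTTP and parse it with the standard DOM API. The repository URL, path layout, and XML structure below are assumptions (check how your own repository is laid out), and authentication handling is omitted:

```java
import java.io.InputStream;
import java.net.URL;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PolarionSvnReader {
    public static void main(String[] args) throws Exception {
        // Hypothetical repository URL; the actual layout of your SVN
        // repository and project will differ.
        String url = "https://polarion.example.com/repo/MyProject/"
                   + ".polarion/tracker/workitems/100-199/MYPROJ-123/workitem.xml";

        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new URL(url).openStream()) {
            Document doc = builder.parse(in);
            // Assuming <field id="..."> elements; pull out the individual
            // fields and map them onto your own SQL Server schema via JDBC.
            NodeList fields = doc.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                System.out.println(field.getAttribute("id") + " = " + field.getTextContent());
            }
        }
    }
}
```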
I have a general question about the architecture I should use for my specific problem.
I have a .TSV file with some information, and my task is to create a REST API app that consumes this .TSV file and exposes 3 REST API endpoints. Each endpoint returns JSON data processed from the .TSV file.
My question is: should I create a POST method that uploads the TSV file, save it (e.g., to the session), and do the logic using the API endpoints?
Or should I POST the content of the TSV file as JSON in every request to the specific endpoint?
I don't know how to glue it all together.
There is no requirement for a DB. The program will be tested with numerous requests through the API, and I don't know how to process or store the .TSV content in my app so that one user could call all three endpoints sequentially over the same data without re-uploading the TSV file.
It's better to upload the file and then do the processing on the server. The file is uploaded in one request, which is better than sending its content in multiple requests.
I believe the solution will depend on the size of the file. Storing the file in memory is not a good approach if the file is very large. Saving the file in a session may also be a bad idea, because if you need to scale your service in the future, you will not be able to. Even storing the file in a /tmp directory can be a bad approach, because the solution is still not scalable.
It would be a good idea to use a storage service like AWS S3, Google Firebase, or any other similar service. When one of your three REST endpoints is called, your application checks whether the file has already been processed; if not, it reads the file, does whatever processing you want, and saves the result to your S3 bucket (if you don't want to keep the processed files, you can use a retention policy on S3 to delete them after some period of time).
Only after this does it return the result. As you can see, this is a synchronous solution.
If the file processing needs a lot of CPU and takes a long time, you will need an asynchronous solution. Instead of processing the files directly when the REST API is called, create another application that reads the file from S3, processes it, and saves the result, all asynchronously. Your REST API then only fetches the processed file from S3 and returns it.
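To make the synchronous flow concrete, here is a minimal JAX-RS sketch of the "upload once, query three times" pattern. The in-process map stands in for the shared store (S3 or similar) described above, and the endpoint names and tab-splitting logic are illustrative:

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;

@Path("/tsv")
public class TsvResource {

    // Stand-in for a shared store such as an S3 bucket; a process-local map
    // like this will not survive restarts or scale across instances.
    private static final ConcurrentMap<String, List<String[]>> STORE = new ConcurrentHashMap<>();

    // One upload; the three read endpoints then reuse the same parsed data.
    @POST
    @Consumes("text/tab-separated-values")
    @Produces(MediaType.TEXT_PLAIN)
    public String upload(InputStream body) throws Exception {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line.split("\t", -1)); // keep empty trailing columns
            }
        }
        String fileId = UUID.randomUUID().toString();
        STORE.put(fileId, rows);
        return fileId; // the client passes this id to the other endpoints
    }

    @GET
    @Path("/{fileId}/rowcount")
    @Produces(MediaType.APPLICATION_JSON)
    public String rowCount(@PathParam("fileId") String fileId) {
        List<String[]> rows = STORE.get(fileId);
        if (rows == null) {
            throw new NotFoundException("unknown file id");
        }
        // One of the three hypothetical endpoints; the other two would
        // apply their own processing to the same stored rows.
        return "{\"rows\": " + rows.size() + "}";
    }
}
```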
I am currently architecting some integration services for a web application. External Java applications produce a data feed; the data is massaged as necessary and then inserted into a SQL Server database. The data is managed there and used as the basis for WCF and HTTP REST services, which are accessed by web applications, mobile devices, etc.
This is the current setup. I am at present modifying it, as we have some issues with the integration of the Java system and the SQL Server database. The main issue is the quality of the data supplied; it can be missing fields, etc. The current integration is a comma-separated file placed on an FTP server; the file is picked up and processed, the data is massaged, and the data is inserted into SQL Server. Where we are currently getting "burned" is that data is inserted into the SQL Server database and its quality is not up to the necessary standard.
So this process is being changed, and I am looking for options to both modernize it and make the integration services more robust.
I am looking for suggestions and recommendations to improve the above.
Some options that spring to mind are:
Expose a WCF service that the Java system calls; the data is passed to it via SOAP and validated in the service before being inserted into SQL Server
Move the supplied data format from a comma-separated file to an XML file, and validate the XML file against a schema before the data is massaged
Any other suggestions?
Neither of your solutions is going to solve your data quality problem at its source. I'd look more critically at the applications producing the data and put the validation there in addition to validating it before INSERT into the database. You want to validate prior to INSERT, because you should never trust clients. But clients ought to honor a contract when they send you data.
One advantage that the web service offers that the others don't is the possibility of real-time INSERTs into the database. Let the source applications send their requests to this broker service; it validates requests and inserts them in real time. No more batch.
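If you go the XML-plus-schema route, the contract check in the producing Java applications could be as small as the sketch below, using the standard javax.xml.validation API. The file names and the idea of an employee-feed schema are illustrative; the same validation can run again server-side before the INSERT, since you should never trust clients:

```java
import java.io.File;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

import org.xml.sax.SAXException;

public class FeedValidator {
    public static void main(String[] args) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // The .xsd encodes the data contract: required fields, types, ranges.
        Schema schema = factory.newSchema(new File("employee-feed.xsd"));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new File("employee-feed.xml")));
            System.out.println("Feed is valid; safe to massage and INSERT.");
        } catch (SAXException e) {
            // Reject the batch (or the offending records) before anything
            // reaches SQL Server.
            System.err.println("Feed rejected: " + e.getMessage());
        }
    }
}
```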
I have a webapp with an architecture I'm not thrilled with. In particular, I have a servlet that handles a very large file upload (via commons-fileupload), then processes the file, passing it to a service/repository layer.
What has been suggested to me is that I simply have my servlet upload the file and have a service on the backend do the processing. I like the idea, but I have no idea how to go about it. I do not know JMS.
Other details:
- App is a GWT app split into the recommended client/server/shared subpackages, using an MVP architecture.
- Currently, I am only running in GWT hosted mode, but am planning to move to Tomcat in the very near future.
I'm perfectly willing to learn whatever I need to in order to get this working (in fact, that's the point of writing the app). I'm not expecting anyone to write code for me, but can someone point me in the right direction to get started?
There are many options for this scenario, but the simplest may be just copying the uploaded file to a known location on the file system, and having a background daemon monitor that location and process the file when it finds it.
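A minimal sketch of such a daemon using the JDK's WatchService; the directory name is illustrative, and the servlet should write to a temporary name first and then move the file into place so the watcher never sees a half-written upload:

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class UploadDirWatcher {
    public static void main(String[] args) throws Exception {
        // Directory the servlet copies finished uploads into (illustrative).
        Path uploadDir = Paths.get("/var/uploads/incoming");

        WatchService watcher = FileSystems.getDefault().newWatchService();
        uploadDir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = watcher.take(); // blocks until something arrives
            for (WatchEvent<?> event : key.pollEvents()) {
                Path file = uploadDir.resolve((Path) event.context());
                System.out.println("Processing " + file);
                // ... hand off to the service/repository layer here ...
            }
            key.reset();
        }
    }
}
```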
@Jason, there are many ways to solve your problem.
i) Dump your file data into the database in a column of type BLOB, and have a DB polling thread (running at a set interval) poll the table for newly inserted files.
ii) Dump the file into the file system and have a file-monitoring process.
The benefit of i) over ii) is that the DB is a centralized and fast resource, whereas file systems are generally slow and non-centralized in nature.
So basically the servlet would dump either to the DB or to the file system. Now, about who will process that dumped file: a) it could be a monitoring process as discussed above, or b) you can use JMS, which is asynchronous in nature, meaning the servlet would put a trigger event in a queue, which would asynchronously trigger a new processing thread. A rough sketch of option i) follows below.
That said, don't introduce JMS into your system unnecessarily if you are OK with a monitoring process.
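For concreteness, here is a minimal JDBC sketch of option i). The table name, columns (uploads(id IDENTITY, content BLOB, processed BIT)), and the connection URL are all illustrative assumptions:

```java
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobQueue {

    private static final String URL = "jdbc:yourdb://localhost/uploads"; // illustrative

    // Called from the servlet: dump the uploaded stream into the table.
    public static void enqueue(InputStream upload) throws Exception {
        try (Connection c = DriverManager.getConnection(URL);
             PreparedStatement ps = c.prepareStatement(
                 "INSERT INTO uploads (content, processed) VALUES (?, 0)")) {
            ps.setBinaryStream(1, upload);
            ps.executeUpdate();
        }
    }

    // Polling thread: run every few seconds and claim unprocessed rows.
    public static void pollOnce() throws Exception {
        try (Connection c = DriverManager.getConnection(URL);
             PreparedStatement ps = c.prepareStatement(
                 "SELECT id, content FROM uploads WHERE processed = 0");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                long id = rs.getLong("id");
                try (InputStream content = rs.getBinaryStream("content")) {
                    // ... process the file content here ...
                }
                try (PreparedStatement done = c.prepareStatement(
                        "UPDATE uploads SET processed = 1 WHERE id = ?")) {
                    done.setLong(1, id);
                    done.executeUpdate();
                }
            }
        }
    }
}
```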
This sounds interesting and familiar to me :). We do it in a similar way.
We have four projects, and all four include file upload and file processing (Image/Video/PDF/Docs), etc. So we created a single project to handle all file processing; it works something like this:
All four projects and the File Processor use Amazon S3 / our file storage, so file storage is shared among all five projects.
We make a request to the File Processor providing details in XML via an HTTP request, including the file path on S3/storage, AWS authentication details, and file conversion/processing parameters. The File Processor does the processing, puts the processed files on S3/storage, constructs XML with the processed file details, and sends that XML back in the response.
We use the Spring Framework and Tomcat.
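The calling side of such a setup can be very plain. A sketch of the HTTP/XML request with HttpURLConnection; the endpoint URL and every XML field below are made up (in particular, a real system would pass a credentials reference rather than raw secrets):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class FileProcessorClient {
    public static void main(String[] args) throws IOException {
        // Hypothetical request body: where the file lives and what to do to it.
        String requestXml =
              "<process-request>"
            + "  <file-path>s3://my-bucket/incoming/video.mp4</file-path>"
            + "  <operation>thumbnail</operation>"
            + "</process-request>";

        HttpURLConnection conn = (HttpURLConnection)
                new URL("https://processor.example.com/process").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/xml");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(requestXml.getBytes(StandardCharsets.UTF_8));
        }

        // The processor replies with XML describing where the processed
        // files were stored.
        try (Scanner in = new Scanner(conn.getInputStream(), StandardCharsets.UTF_8.name())) {
            while (in.hasNextLine()) {
                System.out.println(in.nextLine());
            }
        }
    }
}
```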
Since this is foremost a learning exercise, you need to pick an easy-to-use JMS provider. This discussion suggested FFMQ just one year ago.
Since you are starting with a simple processor, you can keep it simple and use a JMS Queue.
In the simplest form, each message sent by the servlet corresponds to a single job. You can either put the entire payload of the upload in the message, or just send a filename as a reference to the content. These are details you can refactor later.
On the processor side, if you are using Java EE, you can use a MessageBean. If you are not, then I would suggest a three-JVM solution: one each for Tomcat, the JMS server, and the message processor. This article includes the basics of a message-consuming client.
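Both halves fit in a few lines against the plain JMS 1.1 API. How you obtain the ConnectionFactory is provider-specific (a JNDI lookup for FFMQ and most others), and the queue name here is made up:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Destination;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;

public class UploadQueue {

    // Producer side: called from the servlet after the upload is saved.
    // Only the filename travels on the queue, per the suggestion above.
    public static void notifyUpload(ConnectionFactory factory, String fileName)
            throws JMSException {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Destination queue = session.createQueue("upload.jobs");
            MessageProducer producer = session.createProducer(queue);
            producer.send(session.createTextMessage(fileName));
        } finally {
            connection.close();
        }
    }

    // Consumer side: runs in the processor JVM and blocks waiting for jobs.
    public static void consumeForever(ConnectionFactory factory) throws JMSException {
        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Destination queue = session.createQueue("upload.jobs");
        MessageConsumer consumer = session.createConsumer(queue);
        connection.start();
        while (true) {
            Message message = consumer.receive(); // blocks until a job arrives
            if (message instanceof TextMessage) {
                String fileName = ((TextMessage) message).getText();
                // ... load the file by name and process it here ...
            }
        }
    }
}
```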
I have created an app that holds a large set of data in the form of XML files inside the documents folder. The data is large and growing day by day, so I am planning to move it to a SQLite DB; I also want it moved to a SQLite DB for security purposes. I have around 1000 XML files currently, and the number may grow in the future. My primary question is: I want all the data inside the XML files to be moved into a SQLite DB by a backend system (.NET Framework or Java), and can I then push this complete database to the iPhone via a web service? That way no XML parsing happens on the iPhone, because I have heard XML parsing is more resource-intensive than reading from a SQLite DB on the iPhone. Is this a feasible solution, or is a better approach available?
Don't transport the entire data set each time. Have the iOS client request only the changes since it last synced, and have it update its local database. Processing multiple XML documents should be fine as long as the app can synchronize in the background while the user continues to use it.
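A minimal server-side sketch of such a delta endpoint, assuming a JAX-RS backend, the Xerial SQLite JDBC driver, and a hypothetical items(id, payload, updated_at) table where updated_at is maintained on every insert/update (JSON string escaping is omitted for brevity):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;

@Path("/sync")
public class SyncResource {

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public String changesSince(@QueryParam("since") long sinceEpochMillis) throws Exception {
        StringBuilder json = new StringBuilder("[");
        try (Connection c = DriverManager.getConnection("jdbc:sqlite:appdata.db");
             PreparedStatement ps = c.prepareStatement(
                 "SELECT id, payload FROM items WHERE updated_at > ?")) {
            ps.setLong(1, sinceEpochMillis);
            try (ResultSet rs = ps.executeQuery()) {
                boolean first = true;
                while (rs.next()) {
                    if (!first) json.append(",");
                    json.append("{\"id\":").append(rs.getLong("id"))
                        .append(",\"payload\":\"").append(rs.getString("payload"))
                        .append("\"}");
                    first = false;
                }
            }
        }
        return json.append("]").toString();
    }
}
```

The iOS client applies the returned rows to its local SQLite database and remembers the timestamp of its last successful sync for the next request.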