I am working on a Java-based web application. This application consumes a REST API for its data via a custom-made REST client, which is a subcomponent of the application. Now I am stuck in a situation where I have to write code for downloading a large file. The REST client calls the API, obtains the data stream from the response, and writes the file to a particular location on the server.
Should I read the file and write it to the output stream? Or is there any more efficient way to achieve this functionality?
Thanks in advance
In the end I am using the same approach of reading the large file and writing it to the output stream. The REST client subcomponent reads the file data from the REST API and writes it to a temporary location on the server.
After that I read that file and write it to the output stream.
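A minimal sketch of the read-and-write approach described above, using a fixed-size buffer so the whole file is never held in memory. The class and method names (StreamCopy, copy, sendFile) are made up for illustration, and the 8 KiB buffer size is an arbitrary assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of any framework.
final class StreamCopy {

    /** Copies everything from in to out in 8 KiB chunks and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // assumed buffer size
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    /** Streams the temporary file written by the REST client to the response output stream. */
    static long sendFile(Path tempFile, OutputStream responseOut) throws IOException {
        try (InputStream in = Files.newInputStream(tempFile)) {
            return copy(in, responseOut);
        }
    }
}
```

If the temporary file is not needed for anything else, the same copy method could probably be fed the REST response stream directly, which would skip the intermediate write to disk.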
Related
I have the most basic problem ever. The user wants to export some data which is around 20-70k records and can take from 20-40 seconds to execute and the file can be around 5-15MB.
Currently my code is as such:
User clicks a button which makes an API call to a Java Lambda
AWS Lambda Handler calls a method to get the data from DB and generate excel file using Apache POI
Set Response Headers and send the file as XLSX in the response body
I am now faced with two bottlenecks:
API Gateway times out after 29 seconds; if the file takes longer to generate, it will not work and the user gets a 504 in the browser
The response from Lambda can only be 6MB; if the file is bigger, the user will get a 413/502 in the browser
What should be my approach to just download A GENERATED RUNTIME file (not pre-built in s3) using AWS?
If you want to keep it simple (no additional queues or async processing) this is what I'd recommend to overcome the two limitations you describe:
Use the new AWS Lambda Endpoints. Since that option doesn't use the AWS API Gateway, you shouldn't be restricted to the 29-sec timeout (not 100% sure about this).
Write the file to S3, then get a temporary presigned URL to the file and return a redirect (HTTP 302) to the client. This way you won't be restricted to the 6MB response size.
Here are the possible options for you.
Use JavaScript to the rescue. Accept the request from the browser/client and immediately respond from the server that the file preparation is in progress. Meanwhile, continue preparing the file in the background (as a separate job). Using JavaScript, keep polling the status of the file with a separate request. Once the file is ready, return it.
Smarter front-end clients use web-sockets to solve such problems.
In case the DB query is the culprit, cache the data on the server side, if that is possible for you.
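The "respond immediately, then poll" option could look roughly like this on the server side. This is only a sketch under assumptions: ExportJobs and its methods are hypothetical names, the job registry lives in memory, and it presumes a long-running server process. On Lambda the execution environment is frozen once the response is sent, so the job state and the generated file would have to live somewhere else.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory job registry for the polling approach.
final class ExportJobs {
    private final Map<String, CompletableFuture<byte[]>> jobs = new ConcurrentHashMap<>();

    /** Starts file generation in the background and returns a job id straight away. */
    String start(Supplier<byte[]> generator) {
        String id = UUID.randomUUID().toString();
        jobs.put(id, CompletableFuture.supplyAsync(generator));
        return id;
    }

    /** Called by the client's polling request: true once the file has been generated. */
    boolean isReady(String id) {
        CompletableFuture<byte[]> job = jobs.get(id);
        return job != null && job.isDone() && !job.isCompletedExceptionally();
    }

    /** Returns the file and forgets the job if it is ready; empty otherwise. */
    Optional<byte[]> fetch(String id) {
        CompletableFuture<byte[]> job = jobs.get(id);
        if (job == null || !job.isDone() || job.isCompletedExceptionally()) {
            return Optional.empty();
        }
        jobs.remove(id);
        return Optional.of(job.join());
    }
}
```

The first request would call start and return the id; the polling request calls isReady, and the final request calls fetch and writes the bytes to the response.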
When your script takes more than 30s to run on your server, you should implement queues. This tutorial shows how to implement queues using SQS; any other queue service works as well.
https://mikecroft.io/2018/04/09/use-aws-lambda-to-send-to-sqs.html
Once you implement queues your timeout issue will be solved because now you are fetching your big data records in the background thread on your server.
Once the excel file is ready in the background then you have to save it in your s3 bucket or hard disk on your server and create a downloadable link for your user.
Once the download link is created, send it to your user via email. For this you need to have the user's email address.
So in summary: apply a queue -> send a mail with the downloadable file.
Instead of some sophisticated solution (though that would be interesting):
Take inventory. Split the Excel export into portions of, say, 10k rows, and calculate the number of documents.
Every Excel generation call then has a reduced workload.
Whether you deliver by e-mail, a page with links, or a queue is up to you.
The advantage is staying below e-mail limits and response time-outs, and avoiding denial of service.
(In Excel one could also create a master document, but I have no experience.)
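The arithmetic behind splitting the export into portions can be sketched as follows. ExportChunks, documentCount, and ranges are hypothetical names, and the 10k portion size is just the example figure from the answer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for planning the split; it does not generate any Excel files itself.
final class ExportChunks {

    /** Number of documents needed for totalRows at rowsPerDoc rows each (ceiling division). */
    static int documentCount(int totalRows, int rowsPerDoc) {
        return (totalRows + rowsPerDoc - 1) / rowsPerDoc;
    }

    /** Half-open [from, to) row ranges, one per document. */
    static List<int[]> ranges(int totalRows, int rowsPerDoc) {
        List<int[]> result = new ArrayList<>();
        for (int from = 0; from < totalRows; from += rowsPerDoc) {
            result.add(new int[] {from, Math.min(from + rowsPerDoc, totalRows)});
        }
        return result;
    }
}
```

With 70k records and 10k rows per document this gives 7 documents; each range could then be handed to one generation call.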
I am writing a web server with Play framework 2.6 in Java. I want to upload a file to the web server through a multipart form, do some validations, and then upload the file to S3. The default implementation in Play saves the file to a temporary file in the file system, but I do not want to do that; I want to upload the file straight to AWS S3.
I looked into this tutorial, which explains how to save the file permanently in the file system instead of using a temporary file. To my knowledge I have to make a custom Accumulator or a Sink that saves the incoming ByteString(s) to a byte array, but I cannot find how to do so. Can someone point me in the right direction?
thanks
I am trying to use rest APIs provided by mule management console to retrieve server log files.
http://www.mulesoft.org/documentation/display/current/Servers
My intention is to use this List File API
http://localhost:8080/mmc-console-3.4.0/api/servers/{serverId}/files/{relativePathToFile}[?metadata=true]
provided and display the logs in UI.
1) What should be the return type of the method I write, given that the above API call returns a file? Would it be 'File'?
2) Since the mule_ee.log file could be large, I want to send the entire file on the first call, and on each subsequent call send only the lines appended since then, so that the UI can append them and show them in the console. Is this feasible? Is there a better approach?
As per the documentation, you will get the file itself. However, there is no incremental mechanism.
For this you should probably use rsync or an advanced log distribution system.
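For illustration only, this is roughly what an incremental read would look like if you had your own service with direct access to the log file; it is not something the MMC API offers. The client remembers a byte offset and sends it with each call. LogTail, Chunk, and the 1 MiB cap per call are assumptions made for this sketch.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical offset-based reader; assumes the log is UTF-8 text.
final class LogTail {
    private static final int MAX_CHUNK = 1024 * 1024; // assumed cap per call

    /** Text read from the log plus the offset the client should send on its next call. */
    static final class Chunk {
        final String text;
        final long nextOffset;

        Chunk(String text, long nextOffset) {
            this.text = text;
            this.nextOffset = nextOffset;
        }
    }

    /** Reads at most MAX_CHUNK bytes starting at offset. */
    static Chunk readFrom(Path log, long offset) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(log.toFile(), "r")) {
            long length = file.length();
            // If the file is shorter than the offset (e.g. after log rotation), start over.
            long start = offset > length ? 0 : offset;
            int size = (int) Math.min(length - start, MAX_CHUNK);
            byte[] bytes = new byte[size];
            file.seek(start);
            file.readFully(bytes);
            // Note: a chunk boundary may split a multi-byte UTF-8 character.
            return new Chunk(new String(bytes, StandardCharsets.UTF_8), start + size);
        }
    }
}
```

The UI would call with offset 0 first, append the returned text, and pass nextOffset on every later call.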
I have been messing around with GWT uploads lately. I want to have the ability to upload an XML file from the client, and get the contents of that file (display the contents in a TextArea).
From what I have found on the net, it seems I would have to upload the file to the server and then get the contents of the file. I do not particularly like the idea of allowing file uploads to the server (even if they are only XML). Is there any way to pull the contents of a file that the client specifies without sending it to the server?
Thanks
Recent (decent?) browsers implement the "HTML5" File API that's quite easy to use in GWT using JSNI.
See also: https://developer.mozilla.org/en/Using_files_from_web_applications
Because of security restrictions you cannot access the file on the client side alone. It has to be sent to the server for processing.
Users of my web application have an option to start a process that generates a CSV file (populated by some data from a database) and uploads it to an FTP server (and another department will read the file from there). I'm just trying to figure out how to best implement this. I use commons net ftp functionality. It offers two ways to upload data to the FTP server:
storeFile(String remote, InputStream local)
storeFileStream(String remote)
It can take a while to generate all the CSV data so I think keeping a connection open the whole time (storeFileStream) would not be the best way. That's why I want to generate a temporary file, populate it and only then transfer it.
What is the best way to generate a temporary file in a webapp? Is it safe and recommended to use File.createTempFile?
As long as you don't create thousands of CSV files concurrently the upload-time doesn't matter from my point of view. Databases usually output the data row by row and if this is already the format you need for the CSV file I strongly recommend not to use temporary files at all - just do the conversion on-the-fly:
Create an InputStream implementation that reads the database data row by row, converts it to CSV, and publishes the data via its read() methods.
BTW: You mentioned that the conversion is done by a web application and that it can take a long time. This can be problematic, as the default web client has a timeout. The long-running process would therefore be better done by a background thread that is only triggered by the webapp interface.
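A minimal sketch of such an on-the-fly InputStream. It assumes the database rows are available as an Iterator of string lists with non-null fields; CsvRowInputStream is a hypothetical name, and the quoting rules are a simple take on common CSV conventions rather than anything the answer prescribes.

```java
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stream that converts one row at a time, so only a single CSV line is buffered.
final class CsvRowInputStream extends InputStream {
    private final Iterator<List<String>> rows;
    private byte[] current = new byte[0];
    private int pos = 0;

    CsvRowInputStream(Iterator<List<String>> rows) {
        this.rows = rows;
    }

    /** Quotes a field if it contains a comma, quote, or line break. */
    private static String escape(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    /** Makes sure unread bytes are buffered; returns false when no rows remain. */
    private boolean fill() {
        while (pos >= current.length) {
            if (!rows.hasNext()) {
                return false;
            }
            String line = rows.next().stream()
                    .map(CsvRowInputStream::escape)
                    .collect(Collectors.joining(",")) + "\r\n";
            current = line.getBytes(StandardCharsets.UTF_8);
            pos = 0;
        }
        return true;
    }

    @Override
    public int read() {
        return fill() ? (current[pos++] & 0xFF) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (len == 0) {
            return 0;
        }
        if (!fill()) {
            return -1;
        }
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, b, off, n);
        pos += n;
        return n;
    }
}
```

An instance of this stream could be passed as the InputStream argument of storeFile(String remote, InputStream local), with the iterator wrapping the database result set.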
It is OK to use createTempFile; new File(tmpDir, UUID.randomUUID().toString()) works as well. Just do not use deleteOnExit(); it is a notorious source of leaks. Make sure you delete the file yourself.
Edit: since you WILL have the data in memory, do not store it anywhere; wrap it in a java.io.ByteArrayInputStream and use the method that takes the InputStream. That is a much neater and better solution.
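If a temporary file is used after all, deleting it explicitly in a finally block avoids relying on deleteOnExit(). A sketch under assumptions: TempCsv and the Uploader interface are hypothetical names, and the upload step stands in for whatever actually sends the stream (for example the storeFile call).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper showing explicit temp-file cleanup.
final class TempCsv {

    /** Receives the finished file as a stream, e.g. to hand it to an FTP client. */
    interface Uploader {
        void upload(InputStream data) throws IOException;
    }

    /** Writes the lines to a temp file, uploads it, and always deletes the file afterwards. */
    static void writeAndUpload(List<String> csvLines, Uploader uploader) throws IOException {
        Path temp = Files.createTempFile("export-", ".csv");
        try {
            Files.write(temp, csvLines, StandardCharsets.UTF_8);
            try (InputStream in = Files.newInputStream(temp)) {
                uploader.upload(in);
            }
        } finally {
            // Explicit cleanup instead of deleteOnExit().
            Files.deleteIfExists(temp);
        }
    }
}
```

When the data already fits in memory, the in-memory variant from the edit above is simpler: pass new ByteArrayInputStream(bytes) to the uploader and skip the file entirely.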