We have a Java Vert.x project. We have implemented a download button in our web service for our users. When a user clicks the download button, we convert the (potentially huge) data the user asked for from our database into an Excel file, upload it to AWS S3, and then send the S3 URL in the response to the user's download request. This whole flow takes time, especially generating the Excel file, and everything is done in the backend. Please suggest better approaches for implementing the download option on the page (the user can download either filtered data or the complete data they have access to).
I have the most basic problem ever. The user wants to export some data, which is around 20-70k records; the export can take 20-40 seconds to run and the file can be around 5-15 MB.
Currently my flow is as follows:
The user clicks a button which makes an API call to a Java Lambda
The AWS Lambda handler calls a method to get the data from the DB and generate an Excel file using Apache POI
The Lambda sets response headers and sends the file as XLSX in the response body
I am now faced with two bottlenecks:
API Gateway times out after 29 seconds; if the file takes longer to generate, the request fails and the user gets a 504 in the browser
The Lambda response can only be 6 MB; if the file is bigger, the user gets a 413/502 in the browser
What should my approach be to download a file that is generated at runtime (not pre-built in S3) using AWS?
If you want to keep it simple (no additional queues or async processing) this is what I'd recommend to overcome the two limitations you describe:
Use the new AWS Lambda function URLs. Since that option doesn't go through API Gateway, you shouldn't be restricted to the 29-second timeout (not 100% sure about this).
Write the file to S3, then get a temporary presigned URL to the file and return a redirect (HTTP 302) to the client. This way you won't be restricted to the 6 MB response size.
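For the second point, a minimal sketch with the AWS SDK for Java v2 (the bucket/key parameters and the 15-minute expiry are just placeholders):

```java
import java.net.URL;
import java.time.Duration;

import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.GetObjectPresignRequest;

// Sketch: after the Lambda has written the generated file to S3, presign a
// GET for it and return the URL so the handler can answer with a 302 redirect.
public class PresignedDownload {

    public static URL presign(String bucket, String key) {
        try (S3Presigner presigner = S3Presigner.create()) {
            GetObjectRequest getObject = GetObjectRequest.builder()
                    .bucket(bucket)
                    .key(key)
                    .build();
            GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
                    .signatureDuration(Duration.ofMinutes(15)) // link expires after 15 minutes
                    .getObjectRequest(getObject)
                    .build();
            return presigner.presignGetObject(presignRequest).url();
        }
    }
}
```

The Lambda would then return a 302 response with the Location header set to that URL, so the browser downloads straight from S3 and the 6 MB payload limit never comes into play.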
Here are some possible options for you.
Let JavaScript come to the rescue. Accept the request from the browser/client and immediately respond from the server that file preparation is in progress. Meanwhile, continue preparing the file in the background (as a separate job). Using JavaScript, keep polling the file's status with a separate request. Once the file is ready, return it.
Smarter front-end clients use WebSockets to solve such problems.
If the DB query is the culprit, cache the data on the server side if possible.
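A rough sketch of that polling pattern, assuming Vert.x 4.x (since the original question uses Vert.x); the routes, the in-memory job store, and the generateExcel() helper are made up for illustration:

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.json.JsonObject;
import io.vertx.ext.web.Router;

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the "respond immediately, let the client poll" pattern.
// generateExcel() stands in for whatever builds the workbook and uploads it to S3.
public class ExportVerticle extends AbstractVerticle {

    private final Map<String, JsonObject> jobs = new ConcurrentHashMap<>();

    @Override
    public void start() {
        Router router = Router.router(vertx);

        // Start the export and answer right away with a job id (202 Accepted).
        router.post("/exports").handler(ctx -> {
            String jobId = UUID.randomUUID().toString();
            jobs.put(jobId, new JsonObject().put("status", "IN_PROGRESS"));
            vertx.<String>executeBlocking(promise -> promise.complete(generateExcel(jobId)))
                 .onSuccess(url -> jobs.put(jobId,
                         new JsonObject().put("status", "READY").put("url", url)))
                 .onFailure(err -> jobs.put(jobId,
                         new JsonObject().put("status", "FAILED")));
            ctx.response().setStatusCode(202)
               .putHeader("Content-Type", "application/json")
               .end(new JsonObject().put("jobId", jobId).encode());
        });

        // The browser polls this until status is READY, then follows "url".
        router.get("/exports/:id").handler(ctx -> ctx.response()
               .putHeader("Content-Type", "application/json")
               .end(jobs.getOrDefault(ctx.pathParam("id"),
                       new JsonObject().put("status", "UNKNOWN")).encode()));

        vertx.createHttpServer().requestHandler(router).listen(8080);
    }

    private String generateExcel(String jobId) {
        // Placeholder: build the Excel file, upload it to S3, return the download URL.
        return "https://example.com/exports/" + jobId + ".xlsx";
    }
}
```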
When your script takes more than 30 s to run on your server, implement queues; you can get help from this tutorial on how to implement them using SQS or any other service:
https://mikecroft.io/2018/04/09/use-aws-lambda-to-send-to-sqs.html
Once you implement queues, your timeout issue is solved because you are now fetching the big data set in a background worker on your server.
Once the Excel file is ready in the background, save it in your S3 bucket (or on your server's disk) and create a download link for the user.
Once the download link is created, send it to the user via email. In this case, you need the user's email address.
So the summary is: queue the job -> send an email with the download link.
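As a rough illustration of the "queue the job" step, here is what enqueuing an export could look like with the AWS SDK for Java v2; the queue URL and message format are hypothetical:

```java
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

// Sketch: the web request handler only enqueues the export job and returns
// immediately; a background worker or Lambda consumes the queue, builds the
// Excel file, uploads it to S3 and e-mails the link to the user.
public class ExportQueue {

    // Hypothetical queue URL.
    private static final String QUEUE_URL =
            "https://sqs.eu-west-1.amazonaws.com/123456789012/excel-export-jobs";

    public static void enqueue(String userId, String filterJson) {
        try (SqsClient sqs = SqsClient.create()) {
            String body = "{\"userId\":\"" + userId + "\",\"filter\":" + filterJson + "}";
            sqs.sendMessage(SendMessageRequest.builder()
                    .queueUrl(QUEUE_URL)
                    .messageBody(body)
                    .build());
        }
    }
}
```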
Instead of a more sophisticated solution (though that would be interesting), consider splitting the work.
Split the Excel export into portions of, say, 10k rows, and calculate the number of documents.
Each Excel generation call then has a reduced workload.
Whether you deliver them by e-mail, a page with links, or via a queue is up to you.
The advantage is staying below e-mail size limits, response timeouts, and denial-of-service concerns.
(In Excel one could also create a master document that links the parts, but I have no experience with that.)
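A sketch of that chunking idea with Apache POI's streaming SXSSFWorkbook (so each portion also stays cheap on memory); the fetchRows() helper and the chunk size are placeholders:

```java
import java.io.FileOutputStream;
import java.util.List;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

// Sketch: write the export as several smaller workbooks of CHUNK_SIZE rows each.
// fetchRows(offset, limit) stands in for the real paged database query.
public class ChunkedExport {

    private static final int CHUNK_SIZE = 10_000;

    public static void export(long totalRows) throws Exception {
        int parts = (int) Math.ceil((double) totalRows / CHUNK_SIZE);
        for (int part = 0; part < parts; part++) {
            try (SXSSFWorkbook wb = new SXSSFWorkbook(100); // keep only 100 rows in memory
                 FileOutputStream out = new FileOutputStream("export-part-" + (part + 1) + ".xlsx")) {
                Sheet sheet = wb.createSheet("data");
                List<String[]> rows = fetchRows(part * CHUNK_SIZE, CHUNK_SIZE);
                for (int i = 0; i < rows.size(); i++) {
                    Row row = sheet.createRow(i);
                    for (int c = 0; c < rows.get(i).length; c++) {
                        row.createCell(c).setCellValue(rows.get(i)[c]);
                    }
                }
                wb.write(out);
                wb.dispose(); // delete the SXSSF temp files
            }
        }
    }

    private static List<String[]> fetchRows(int offset, int limit) {
        return List.of(); // placeholder for the real paged query
    }
}
```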
In my application, after clicking the export icon, the Excel report is generated and downloaded into the local Downloads folder on the system. I have tried to validate this action on the service side (API) using Postman. When I hit the export API (application-specific), the response does not contain any path/file name for the file that was downloaded to the Downloads folder. Is there any way to access the Downloads folder through an API, or any Java programming way to get the downloaded Excel file?
I just want to read the Excel data and compare it with my input data using Postman. My first priority is to do it via the API.
Select "Send and download" action in Postman. According to manual here:
If your API endpoint returns an image, Postman will detect and render it automatically. For binary response types, you should select "Send and download", which will let you save the response to your hard disk. You can then view it using the appropriate viewer. This gives you the flexibility to test audio files, PDFs, zip files, or anything that the API throws at you.
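Comparing the binary .xlsx contents inside Postman itself is awkward, but since you mention Java: here is a small sketch that opens the file from the Downloads folder with Apache POI (the file name is an assumption) and prints the cells so they can be compared against your input data:

```java
import java.io.File;
import java.nio.file.Paths;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DataFormatter;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

// Sketch: open the report saved in the local Downloads folder and print it
// cell by cell, so it can be compared against the expected input data.
public class ReadDownloadedReport {

    public static void main(String[] args) throws Exception {
        // Hypothetical file name; adjust to whatever the export actually produces.
        File file = Paths.get(System.getProperty("user.home"), "Downloads", "report.xlsx").toFile();
        DataFormatter fmt = new DataFormatter();
        try (Workbook wb = WorkbookFactory.create(file)) {
            Sheet sheet = wb.getSheetAt(0);
            for (Row row : sheet) {
                for (Cell cell : row) {
                    System.out.print(fmt.formatCellValue(cell) + "\t");
                }
                System.out.println();
            }
        }
    }
}
```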
I need to implement an AWS backend API that allows the users of my mobile app to upload a file (an image) to Amazon S3.
Creating an API that interfaces directly with Amazon S3 is not an option, because I would not be able to correlate the uploaded file with the user's record in DynamoDB.
I've thought of creating a Lambda function (Java), triggered by an API, that performs the following steps:
1) call the Amazon S3 API to upload the file
2) write a record into my DynamoDB table with a reference to the file
Is there a way to provide a binary file as input to my Lambda function exposed as an API?
Please let me know. Thank you!
Davide
The best way to do this is with presigned URLs. You can generate a URL that will let the user upload a file directly to S3 with a specific name and type. This way you don't have to worry about big files slowing down your server, Lambda limits, or double charges for bandwidth. It's also faster for the user in most cases and supports S3 Transfer Acceleration.
The process can look something like:
User requests an upload link from your server
Your server writes an entry in DynamoDB and returns a presigned URL
User uploads the file directly to S3 using the presigned URL (with the exact object key your server chose)
Once the upload is done, you either get a notification via a Lambda (S3 event), or simply have the client tell your server the upload is finished
Your server performs any required post-processing and marks the file as ready
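A rough server-side sketch of the second and third steps using the AWS SDK for Java v2; the table name, bucket name, key pattern, and content type are all assumptions:

```java
import java.net.URL;
import java.time.Duration;
import java.util.Map;
import java.util.UUID;

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import software.amazon.awssdk.services.s3.presigner.S3Presigner;
import software.amazon.awssdk.services.s3.presigner.model.PutObjectPresignRequest;

// Sketch: record the pending upload in DynamoDB, then hand the client a
// presigned PUT URL so it uploads straight to S3 under a key the server chose.
public class UploadLinkService {

    public static URL createUploadLink(String userId) {
        String key = "uploads/" + userId + "/" + UUID.randomUUID() + ".jpg"; // server-chosen object key

        try (DynamoDbClient ddb = DynamoDbClient.create();
             S3Presigner presigner = S3Presigner.create()) {

            // Hypothetical table and attribute names.
            ddb.putItem(PutItemRequest.builder()
                    .tableName("UserUploads")
                    .item(Map.of(
                            "userId", AttributeValue.builder().s(userId).build(),
                            "s3Key", AttributeValue.builder().s(key).build(),
                            "status", AttributeValue.builder().s("PENDING").build()))
                    .build());

            PutObjectRequest putObject = PutObjectRequest.builder()
                    .bucket("my-app-uploads") // hypothetical bucket
                    .key(key)
                    .contentType("image/jpeg")
                    .build();
            return presigner.presignPutObject(PutObjectPresignRequest.builder()
                    .signatureDuration(Duration.ofMinutes(10))
                    .putObjectRequest(putObject)
                    .build()).url();
        }
    }
}
```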
And to answer your actual question: yes, there is a way to pass binary data to Lambda functions. The link is a step-by-step tutorial, but basically in API Gateway you have to set "Request body passthrough" to "When there are no templates defined (recommended)" and fill in your expected content types. Your mapping should include "base64data": "$input.body", and you need to set up your types under "Binary Support". In your actual Lambda function, you should then have access to the data as "base64data".
I have an Android application which requires the user to log in each time they run the app. The login procedure is simple, using an SQLite database file: I've copied the file into the assets folder and made the necessary modifications. But the database file is of no use unless it is on a server. I don't have any server, so I'm thinking of keeping the database file on Dropbox, Google Drive, etc. and then reading or updating that file as per the user's commands. The question is: how do I do that? I searched the web and found that the only way is downloading the DB file, modifying it, and then uploading it back. Can anyone give me an example?
Doing that isn't really possible unless you have a server.
If you are using Dropbox, you'll first have to make your file public in order to download it (not recommended at all; it compromises security). Then you can use the URL to download the file, but you won't be able to upload it back (unless you can log in to Dropbox from your Android code).
Instead, if you have a web server with MySQL and PHP, you can easily send POST requests to your server.
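For completeness, a minimal sketch of such a POST from the Java side using HttpURLConnection; the endpoint and form fields are made up, and on Android this must run off the main thread (e.g. in a background thread):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch: post the login form to a PHP script that checks the credentials
// against MySQL and returns a simple answer such as "ok" or "fail".
public class LoginClient {

    public static String login(String username, String password) throws Exception {
        URL url = new URL("https://example.com/login.php"); // hypothetical endpoint
        String body = "username=" + URLEncoder.encode(username, "UTF-8")
                    + "&password=" + URLEncoder.encode(password, "UTF-8");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            return in.readLine(); // the server's answer
        } finally {
            conn.disconnect();
        }
    }
}
```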
I have a small Linux VPS. I have written a Java client application which needs to connect and submit large string data and images. The string data will be stored as regular text files on the server and will be parsed by another Java application that runs on the server and uses these uploaded files and images.
The next part of the problem: because this Java client will be run by several users, I need some way to uniquely tie each uploaded file to the currently logged-in user session on the website (the user needs to log in on the website to be able to run the tasks). Any suggestions or more efficient patterns?
Don't write the stuff to files. Punch the uploaded data into "raw" database tables keyed by user ID. The batch job can pull the data out, parse/format/fold/spindle/mutilate it, stuff the results into the real tables, and then delete the raw data.
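A minimal sketch of that raw-table approach with plain JDBC; the table, columns, and connection details are made up:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Sketch: the upload endpoint just stores the raw payload keyed by user id;
// a batch job later reads raw_uploads, processes it into the real tables and
// deletes the raw rows.
public class RawUploadDao {

    public static void storeRaw(long userId, String payload, byte[] image) throws Exception {
        String sql = "INSERT INTO raw_uploads (user_id, payload, image, uploaded_at) "
                   + "VALUES (?, ?, ?, NOW())";
        // Hypothetical connection details.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost/mydb", "appuser", "secret");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, userId);
            ps.setString(2, payload);
            ps.setBytes(3, image);
            ps.executeUpdate();
        }
    }
}
```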