Row-by-row writing into a file on Amazon S3 - Java

I need to write into a file on amazon s3.
When event A occurs, I need to get the object and write it to the end of the CSV file on S3. When event B occurs, I need to get the object and write it to the end of the same CSV file on S3.
How can I do it?
P.S. Java
I tried to use an OutputStream, but there isn't one in the S3 library "software.amazon.awssdk.services.s3".
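S3 does not support appending to an existing object; an object can only be replaced as a whole. A minimal sketch of the usual workaround with the v2 SDK (software.amazon.awssdk.services.s3) is to read the current object, concatenate the new row, and upload the full content again. The bucket name, key and class name below are placeholders.

    import software.amazon.awssdk.core.sync.RequestBody;
    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.GetObjectRequest;
    import software.amazon.awssdk.services.s3.model.NoSuchKeyException;
    import software.amazon.awssdk.services.s3.model.PutObjectRequest;

    public class S3CsvAppender {

        private final S3Client s3 = S3Client.create();
        private final String bucket = "my-bucket";  // placeholder
        private final String key = "events.csv";    // placeholder

        // Appends one CSV row by downloading the current object, concatenating
        // the new line, and re-uploading the full content.
        public void appendRow(String csvRow) {
            String existing;
            try {
                existing = s3.getObjectAsBytes(
                        GetObjectRequest.builder().bucket(bucket).key(key).build())
                        .asUtf8String();
            } catch (NoSuchKeyException e) {
                existing = "";  // first write: the object does not exist yet
            }
            s3.putObject(
                    PutObjectRequest.builder().bucket(bucket).key(key).build(),
                    RequestBody.fromString(existing + csvRow + "\n"));
        }
    }

Note that this read-modify-write cycle is not atomic, so if events A and B can fire concurrently you would need some external locking, or a design that writes one small object per event and merges them later.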

Related

Reading from S3 location with dynamic addition of files through java client

I have an S3 bucket, and files are being added to the bucket by an ever-running process. Now I want to write a consumer of these files using the Java aws-s3-sdk. How can I make sure every file is read exactly once?

AWS File Upload to S3 with Java

I am trying to upload files into my S3 bucket using AWS Lambda in Java and I'm having some issues.
I am using APIGatewayProxyRequestEvent in my AWS Lambda function to get my file upload from Postman.
The request.getBody() method of this event gives me a String representation of the image file, whereas S3.putObject takes an InputStream of the file to be uploaded as input.
How can I feed in request.getBody() to the S3.putObject() method in my Lambda code to make the File Upload work?
1) You may create a File and write request.getBody() into it using a FileWriter.
2) You can then build a PutObjectRequest and put the file created in step 1 into it.
3) s3Client.putObject(PutObjectRequest) will put the object to S3 (a rough sketch follows below).
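Here is a minimal sketch of those three steps, assuming the v1 SDK (com.amazonaws.services.s3), a text body, and placeholder bucket, key and file names:

    import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.model.PutObjectRequest;

    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;

    public class UploadHandler {

        private final AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();

        public void handle(APIGatewayProxyRequestEvent request) throws IOException {
            // Step 1: write request.getBody() into a file (Lambda can only write under /tmp).
            File file = new File("/tmp/upload.bin");  // placeholder file name
            try (FileWriter writer = new FileWriter(file)) {
                writer.write(request.getBody());
            }
            // Steps 2 and 3: wrap the file in a PutObjectRequest and put it to S3.
            s3Client.putObject(new PutObjectRequest("my-bucket", "uploads/upload.bin", file));
        }
    }

If Postman sends binary content, the body may arrive Base64-encoded (check request.getIsBase64Encoded()); in that case decode it to bytes before writing the file.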

Play Framework file upload in memory to S3

I am writing a web server with Play Framework 2.6 in Java. I want to upload a file to the web server through a multipart form, do some validations, and then upload the file to S3. The default implementation in Play saves the upload to a temporary file in the file system, but I do not want to do that; I want to upload the file straight to AWS S3.
I looked into this tutorial, which explains how to save the file permanently in the file system instead of using a temporary file. To my knowledge I have to make a custom Accumulator or Sink that saves the incoming ByteString(s) to a byte array, but I cannot find out how to do so. Can someone point me in the right direction?
thanks
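A minimal sketch of the piece being asked about, assuming Play 2.6's Java API with Akka Streams: an Accumulator that folds the incoming ByteString chunks into an in-memory byte array. The class and method names are made up, and it would still need to be wired into a custom body parser or multipart file-part handler as in the linked tutorial.

    import akka.stream.javadsl.Sink;
    import akka.util.ByteString;
    import play.libs.streams.Accumulator;

    import java.util.concurrent.CompletionStage;
    import java.util.concurrent.Executor;

    public class InMemoryAccumulators {

        // Concatenates every incoming chunk into one ByteString, then exposes it as byte[].
        public static Accumulator<ByteString, byte[]> toByteArray(Executor executor) {
            Sink<ByteString, CompletionStage<ByteString>> concatAll =
                    Sink.fold(ByteString.emptyByteString(), ByteString::concat);
            return Accumulator.fromSink(concatAll).map(ByteString::toArray, executor);
        }
    }

Once the bytes are in memory you can run your validations and upload them to S3 with a putObject variant that accepts an InputStream or byte content, without ever touching the local file system.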

Can I use one S3 bucket to upload different Java Lambda functions?

Currently I am using a different S3 bucket for every function.
For example, I have 3 Java Lambda functions created in the Eclipse IDE:
RegisterUser
LoginUser
ResetPassword
I am uploading the Lambda functions through the Eclipse IDE,
and I have to upload each function through an Amazon S3 bucket.
I created 3 Amazon S3 buckets to upload all 3 functions.
My question is: can I upload all 3 Lambda functions using one Amazon S3 bucket,
or do I have to create a separate Amazon S3 bucket for each function?
You don't need to upload to a bucket. You can upload the function code via the command line as well. AWS only recommends not using the web interface for large Lambda function packages; all other methods are fine, and the command line is a very good option.
However, if you really want to upload to a bucket first, just give each zip file that contains the function code a different filename (key) and you're good; a sketch of that approach follows below.
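For illustration, a rough sketch with the v1 Lambda SDK that points all three functions from the question at different zip keys in the same, hypothetical, deployment bucket (it assumes the zips have already been uploaded to those keys):

    import com.amazonaws.services.lambda.AWSLambda;
    import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
    import com.amazonaws.services.lambda.model.UpdateFunctionCodeRequest;

    public class DeployFromSingleBucket {

        public static void main(String[] args) {
            AWSLambda lambda = AWSLambdaClientBuilder.defaultClient();
            String bucket = "my-deploy-bucket";  // one shared bucket (hypothetical name)

            // Each function just points at a different key (zip file) in the same bucket.
            for (String function : new String[] {"RegisterUser", "LoginUser", "ResetPassword"}) {
                lambda.updateFunctionCode(new UpdateFunctionCodeRequest()
                        .withFunctionName(function)
                        .withS3Bucket(bucket)
                        .withS3Key(function + ".zip"));  // assumes the zip was uploaded to this key
            }
        }
    }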

How can I access a file's content from mappers in Amazon Elastic MapReduce?

If I am running an EMR job (in Java) on Amazon Web Services to process large amounts of data, is it possible to have every single mapper access a small file stored on S3? Note that the small file I am talking about is NOT the input to the mappers. Rather, the mappers need to process the input according to some rules in the small file. Maybe the large input file is a billion lines of text, for example, and I want to filter out words that are in a blacklist or something by reading a small file of blacklisted words stored in an S3 bucket. In this case, each mapper would process different parts of the input data, but they would all need to access the restricted words file on S3. How can I make the mappers do this in Java?
EDIT: I am not using the Hadoop framework, so there are no setup() or map() method calls. I am simply using the streaming EMR service and reading the input line by line from stdin.
You can access any S3 object from within a mapper using the S3 protocol directly, e.g. s3://mybucket/path/to/file.txt
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-file-systems.html
You can actually use S3 to access your mapper's input files as well as any ad hoc lookup file like the one you are planning to use. Previously these were differentiated by using the s3n:// protocol for S3 object access and s3bfs:// for block storage. Now you don't have to differentiate; just use s3://.
Alternatively, you can add an s3distcp step to the EMR cluster to copy the file and make it available in HDFS (this is not what you asked about, but it is an option): http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html
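Since the job is a plain streaming step reading stdin rather than a Hadoop Mapper, another option is to load the small blacklist file once at startup with the S3 SDK and then filter the input as it arrives. A rough sketch, assuming the v1 SDK and placeholder bucket and key names:

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public class BlacklistFilterMapper {

        public static void main(String[] args) throws Exception {
            // Load the small blacklist file from S3 once, before processing any input.
            AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
            String blacklistText = s3.getObjectAsString("my-bucket", "config/blacklist.txt");  // placeholders
            Set<String> blacklist = new HashSet<>(Arrays.asList(blacklistText.split("\\s+")));

            // Stream the mapper input from stdin and drop blacklisted words.
            try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
                String line;
                while ((line = in.readLine()) != null) {
                    StringBuilder kept = new StringBuilder();
                    for (String word : line.split("\\s+")) {
                        if (!blacklist.contains(word)) {
                            if (kept.length() > 0) {
                                kept.append(' ');
                            }
                            kept.append(word);
                        }
                    }
                    System.out.println(kept);
                }
            }
        }
    }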
