Upload multipart file to AWS without saving it locally - java

I wrote a Rest API that accepts MultipartFile. I want to upload the files that come in to Amazon S3. The problem is that I don't know a way other than first saving it to the local system before uploading it to S3. Is there any way to do so?
Right now, there is an issue with saving the file locally and I'm looking for a workaround: Multipart transferTo looks for a wrong file address when using createTempFile

Yes, you can do this. Use the putObject overload that takes an InputStream as a parameter.
Here is some sample code:
public void saveFile(MultipartFile multipartFile) throws AmazonServiceException, SdkClientException, IOException {
    // Content type and length come straight from the MultipartFile, so nothing touches the local disk.
    ObjectMetadata data = new ObjectMetadata();
    data.setContentType(multipartFile.getContentType());
    data.setContentLength(multipartFile.getSize());

    BasicAWSCredentials creds = new BasicAWSCredentials("accessKey", "secretKey");
    AmazonS3 s3client = AmazonS3ClientBuilder.standard()
            .withRegion(Regions.US_EAST_2)
            .withCredentials(new AWSStaticCredentialsProvider(creds))
            .build();

    // putObject accepts the InputStream directly
    PutObjectResult objectResult = s3client.putObject("myBucket", multipartFile.getOriginalFilename(), multipartFile.getInputStream(), data);
    System.out.println(objectResult.getContentMd5()); // you can verify the MD5
}
You can find the Javadoc here.
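For completeness, here is a minimal sketch of how that method might be wired into a Spring controller; the controller class, the endpoint path and the S3FileService wrapper are hypothetical names, not anything from the question:
import java.io.IOException;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class S3UploadController {

    // Hypothetical service that wraps the saveFile(...) method shown above.
    private final S3FileService s3FileService;

    public S3UploadController(S3FileService s3FileService) {
        this.s3FileService = s3FileService;
    }

    // The incoming MultipartFile is passed straight through to S3; it is never written to local disk.
    @PostMapping("/upload")
    public ResponseEntity<String> upload(@RequestParam("file") MultipartFile file) throws IOException {
        s3FileService.saveFile(file);
        return ResponseEntity.ok("Uploaded " + file.getOriginalFilename());
    }
}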

Related

AWS S3 file uploading in Java - Why does directly uploading an InputStream consume more memory than writing it to a temporary file first, then uploading?

I want to upload a file to an S3 bucket using the Streaming API of the commons-fileupload library. This code parses the request:
FileItemIterator iterStream = upload.getItemIterator(request);
while (iterStream.hasNext()) {
    FileItemStream item = iterStream.next();
    String name = item.getFieldName();
    InputStream stream = item.openStream();
    if (!item.isFormField()) {
        // Process the InputStream stream (*)
    } else {
        String formFieldValue = Streams.asString(stream);
    }
}
This one initializes the S3 client and the transfer manager:
s3Client = AmazonS3ClientBuilder.standard()
.withRegion(Regions.DEFAULT_REGION)
.withCredentials(new ProfileCredentialsProvider())
.build();
transferManager = TransferManagerBuilder.standard()
.withS3Client(s3Client)
.build();
I used a 100MB file to test. At the beginning, my Spring Boot app started with about 95MB of RAM usage. Uploading that stream (*) directly to the S3 bucket with
Upload upload = transferManager.upload(bucketName, key, inputStream, metadata);
consumes significantly more memory (from 90MB to 370MB) than first copying the stream (*) to an OutputStream and then uploading a file created from that output stream (from 90MB to 100MB):
try (OutputStream out = new FileOutputStream(fileName)) {
    IOUtils.copy(inputStream, out);
}
PutObjectRequest request = new PutObjectRequest(
        existingBucketName, fileName, new File(fileName));
Upload upload = transferManager.upload(request);
I wonder why that is. What happens to the inputStream that makes the direct upload consume more memory?
Thank you very much.
If you don't know the size of the file, use the File-based upload; the InputStream upload should be avoided. From the TransferManager documentation:
When uploading options from a stream, callers must supply the size of options in the stream through the content length field in the ObjectMetadata parameter. If no content length is specified for the input stream, then TransferManager will attempt to buffer all the stream contents in memory and upload the options as a traditional, single part upload. Because the entire stream contents must be buffered in memory, this can be very expensive, and should be avoided whenever possible.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/transfer/TransferManager.html#upload-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
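In other words, if you do know the exact size of the stream up front, supplying it in the ObjectMetadata lets TransferManager stream the parts instead of buffering everything in memory. A minimal sketch, assuming knownSizeInBytes is available from somewhere (for multipart form uploads it usually is not, which is the catch):
// Supplying the content length up front avoids TransferManager buffering the whole stream in memory.
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(knownSizeInBytes); // assumed to be known exactly; an estimate is not enough
Upload upload = transferManager.upload(bucketName, key, inputStream, metadata);
upload.waitForCompletion(); // blocks until the upload finishes; throws InterruptedException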

AWS S3: Using TransferManager to make a multipart upload by sending parts

I have recently learned about the TransferManager class in AWS S3.
Behind the scenes this creates a multipart upload; however, it seems like I need to pass in the whole file for it to work.
I receive my file in parts, so I need to create the multipart upload manually. Is something like that possible using TransferManager? For example, instead of
Upload upload = tm.upload(bucketName, keyName, new File(filePath));
to use for example something like
Upload upload = tm.upload(bucketName, keyName, partOfFile1);
Upload upload = tm.upload(bucketName, keyName, partOfFile2);
Upload upload = tm.upload(bucketName, keyName, partOfFile3);
Or am I stuck with the AmazonS3 class when I need to upload a file in parts manually?
Thanks for the help!
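As far as I know, TransferManager does not expose a way to feed it pre-split parts, so for this case you end up with the low-level multipart API on the AmazonS3 client. A rough sketch, where s3Client is an AmazonS3 instance and bucketName, keyName, partStreams and partSizeInBytes are hypothetical placeholders for however the parts arrive:
// Manual multipart upload: each part is sent as it becomes available.
InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(bucketName, keyName);
String uploadId = s3Client.initiateMultipartUpload(initRequest).getUploadId();

List<PartETag> partETags = new ArrayList<>();
int partNumber = 1;
for (InputStream partStream : partStreams) {            // hypothetical source of the incoming parts
    UploadPartRequest partRequest = new UploadPartRequest()
            .withBucketName(bucketName)
            .withKey(keyName)
            .withUploadId(uploadId)
            .withPartNumber(partNumber++)
            .withInputStream(partStream)
            .withPartSize(partSizeInBytes);              // every part except the last must be at least 5 MB
    partETags.add(s3Client.uploadPart(partRequest).getPartETag());
}

s3Client.completeMultipartUpload(new CompleteMultipartUploadRequest(bucketName, keyName, uploadId, partETags));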

How to get an Excel file from an AWS S3 bucket into a MultipartFile in Java

I've been trying to extract an .xlsx file from an AWS bucket I created and store it in a MultipartFile variable. I've tried many different approaches, but at best I get weird characters. I'm not finding much documentation on how to do this.
Thanks!
// you may need to initialize this differently to get the correct authorization
final AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();
final S3Object object = s3Client.getObject("myBucket", "fileToDownload.xlsx");

// with Java 7 NIO
final Path filePath = Paths.get("localFile.xlsx");
Files.copy(object.getObjectContent(), filePath);
final File localFile = filePath.toFile();

// or, alternatively, with Apache Commons IO
final File localFile = new File("localFile.xlsx");
FileUtils.copyToFile(object.getObjectContent(), localFile);
I'm not 100% sure what you mean by "MultipartFile" - that's usually in the context of a file that's been sent to your HTTP web service via a multipart POST or PUT. The file you're getting from S3 is technically part of the response to an HTTP GET request, but the Amazon Java Library abstracts this away for you, and just gives you the results as an InputStream.
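If you really do need a MultipartFile instance (for example, to reuse code that already expects one), one option is Spring's MockMultipartFile from the spring-test module, which can wrap the S3 object's stream; treat this as a sketch under that assumption, since it is technically a test helper and it reads the whole content into memory:
// Wrap the S3 object's stream from the snippet above in a MultipartFile
// (this MockMultipartFile constructor reads the stream fully and throws IOException).
final MultipartFile multipartFile = new MockMultipartFile(
        "file",                                   // form field name
        "fileToDownload.xlsx",                    // original filename
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        object.getObjectContent());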

S3 responds with 403 right after upload

I use aws-java-sdk-bom in order to upload a file:
final PutObjectRequest putRequest = new PutObjectRequest(bucketName, blobKey.toString(), input, metadata);
putRequest.setCannedAcl(CannedAccessControlList.PublicRead);
final ProgressTracker progress = new ProgressTracker();
transferManager.upload(putRequest, new S3ProgressListenerChain(progress));
and I noticed that sometimes, if I try to access the URL right after the request has successfully completed (mostly for big >20MB files), it responds with a 403. After a second, everything is OK. Is there any timeout or something?
You should refer to the AWS S3 FAQ; I believe it takes a little time to propagate...
http://docs.aws.amazon.com/AmazonS3/latest/dev/Introduction.html#ConsistencyModel
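If the delay matters, one defensive option is to poll for the object before handing its URL out. A minimal sketch (the retry count and the 1-second delay are arbitrary choices, not anything from the answer above):
// Poll until the freshly uploaded object is readable, with a small bounded retry.
int attempts = 0;
while (!s3Client.doesObjectExist(bucketName, key) && attempts++ < 5) {
    Thread.sleep(1000); // throws InterruptedException; back off for a second between checks
}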

Image or PDF file downloaded from Gcloud Storage is corrupted

I am using a file downloaded from Gcloud Storage as an attachment to the Mandrill API for sending as an email attachment. The problem is that it only works for text files; for images or PDFs, the attachment is corrupted.
The following code downloads the file and converts it to a Base64-encoded String.
Storage.Objects.Get getObject = getService().objects().get(bucket, object);
ByteArrayOutputStream out = new ByteArrayOutputStream();
// If you're not in AppEngine, download the whole thing in one request, if possible.
getObject.getMediaHttpDownloader().setDirectDownloadEnabled(true);
getObject.executeMediaAndDownloadTo(out);
//log.info("Output: {}", out.toString("UTF-8"));
return Base64.encodeBase64URLSafeString(out.toString("UTF-8")
.getBytes(StandardCharsets.UTF_8));
I am setting this String as the content of the MessageContent in the Mandrill API.
Got it working. I only needed to store the OutputStream in a temp file before using it as an attachment in the email. Posting the code below for reference.
Storage.Objects.Get getObject = storage.objects().get("bucket", "object");
OutputStream out = new FileOutputStream("/tmp/object");
// If you're not in AppEngine, download the whole thing in one request, if possible.
getObject.getMediaHttpDownloader().setDirectDownloadEnabled(true);
getObject.executeMediaAndDownloadTo(out);
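For what it's worth, the corruption in the original snippet most likely comes from round-tripping the binary content through out.toString("UTF-8"): arbitrary bytes do not survive being decoded and re-encoded as UTF-8 text. Encoding the raw bytes directly should also work without the temp file; a sketch of that variant, still using Commons Codec:
Storage.Objects.Get getObject = getService().objects().get(bucket, object);
ByteArrayOutputStream out = new ByteArrayOutputStream();
getObject.getMediaHttpDownloader().setDirectDownloadEnabled(true);
getObject.executeMediaAndDownloadTo(out);
// Encode the downloaded bytes directly instead of converting them to a UTF-8 String first.
return Base64.encodeBase64String(out.toByteArray());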
