Rate Limit S3 Upload using Java API

I am using the Java AWS SDK to transfer large files to S3. Currently I am using the upload method of the TransferManager class to enable multi-part uploads. I am looking for a way to throttle the rate at which these files are transferred, to ensure I don't disrupt other services running on this CentOS server. Is there something I am missing in the API, or some other way to achieve this?

Without support in the API for this, one approach is to wrap the upload process with trickle, a userspace bandwidth shaper.
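For example, trickle -s -u 512 java -jar uploader.jar would cap the wrapped process at roughly 512 KB/s of upload bandwidth.
If you would rather stay inside the JVM, one alternative (not something the SDK provides) is to throttle the InputStream you hand to TransferManager. Below is a minimal sketch using Guava's RateLimiter as an extra dependency; ThrottledInputStream, the bucket, and the file names are all illustrative, not part of the SDK.

import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.google.common.util.concurrent.RateLimiter;

import java.io.File;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Caps read throughput of the wrapped stream; one permit == one byte.
class ThrottledInputStream extends FilterInputStream {
    private final RateLimiter limiter;

    ThrottledInputStream(InputStream in, double bytesPerSecond) {
        super(in);
        this.limiter = RateLimiter.create(bytesPerSecond);
    }

    @Override
    public int read() throws IOException {
        limiter.acquire(1);
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            limiter.acquire(n); // pay for the bytes actually read
        }
        return n;
    }
}

public class ThrottledUpload {
    public static void main(String[] args) throws Exception {
        File file = new File("big-file.bin"); // illustrative path
        ObjectMetadata meta = new ObjectMetadata();
        meta.setContentLength(file.length()); // known length keeps TransferManager from buffering

        TransferManager tm = TransferManagerBuilder.defaultTransferManager();
        try (InputStream in = new ThrottledInputStream(new FileInputStream(file), 512 * 1024)) {
            // ~512 KB/s cap; TransferManager still splits the stream into parts
            tm.upload("my-bucket", "big-file.bin", in, meta).waitForCompletion();
        } finally {
            tm.shutdownNow();
        }
    }
}

Because the limiter throttles reads from the single source stream, the aggregate upload rate cannot exceed the cap regardless of how TransferManager schedules the parts.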

Related

Amazon Kinesis Firehose to S3

I was asked to write code that sends a .csv file to S3 using Amazon Kinesis Firehose. But as someone who has never used Kinesis, I have no idea how I should do this. Can you help with this, or if you have code that does this job, that would also help (Java or Scala)?
The csv data should be sent to Kinesis Firehose to be written to an S3 bucket in gzip format using a Firehose client application.
Thanks in advance.
Firstly, Firehose is for streaming: it sends a record (or records) to a destination. It is not a file-transfer tool for copying a csv file to S3; you can use the S3 CLI commands if you just need to copy files from somewhere to S3.
So please first make sure whether what you need is streaming or a file copy. If it is not streaming, then I wonder why Firehose.
There are multiple input sources you can use, so it is better to decide on one first.
If you use Java and the AWS SDK, then the PutRecord API call is probably the way to go (a minimal sketch follows the links below):
Writing to Kinesis Data Firehose Using the AWS SDK
aws-sdk-java/src/samples/AmazonKinesisFirehose/
Put data to Amazon Kinesis Firehose delivery stream using Spring Boot
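Under those assumptions, here is a minimal Java sketch of the PutRecord path; the delivery-stream name and file path are placeholders, and the gzip conversion is configured on the delivery stream itself (CompressionFormat = GZIP), not in the client code.

import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehose;
import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehoseClientBuilder;
import com.amazonaws.services.kinesisfirehose.model.PutRecordRequest;
import com.amazonaws.services.kinesisfirehose.model.Record;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvToFirehose {
    public static void main(String[] args) throws Exception {
        AmazonKinesisFirehose firehose = AmazonKinesisFirehoseClientBuilder.defaultClient();

        // Send each CSV line as one Firehose record.
        for (String line : Files.readAllLines(Paths.get("data.csv"), StandardCharsets.UTF_8)) {
            firehose.putRecord(new PutRecordRequest()
                    .withDeliveryStreamName("my-delivery-stream") // placeholder name
                    .withRecord(new Record()
                            .withData(ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8)))));
        }
    }
}

For anything beyond a toy file, PutRecordBatch (up to 500 records per call) is the more efficient variant of the same idea.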
If you can use AWS Amazon Linux to send the data to Firehose, the Firehose Agent is easier. It just monitors a file and sends the deltas on to S3.

How to compress files on azure data lake store

I'm using Azure Data Lake Store as a storage service for my Java app. Sometimes I need to compress multiple files; what I do for now is copy all the files to my server, compress them locally, and then send the zip to Azure. Even though this works, it takes a lot of time, so I'm wondering: is there a way to compress files directly on Azure? I checked the data-lake-store SDK, but there's no such functionality.
Unfortunately, at the moment there is no option to do that sort of compression.
There is an open feature request, HTTP compression support for Azure Storage Services (via Accept-Encoding/Content-Encoding fields), that discusses uploading compressed files to Azure Storage, but there is no estimate of when this feature might be released.
The only option for you is to implement such a mechanism on your own (using an Azure Function, for example); one client-side variant is sketched below.
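If the main pain is staging everything on a local disk first, here is a hedged sketch of the roll-your-own route, assuming the data-lake-store SDK's ADLStoreClient; the account, credentials, and paths are placeholders. The bytes still flow through your process, but the zip is written straight back to the store instead of to a local temp file.

import com.microsoft.azure.datalake.store.ADLStoreClient;
import com.microsoft.azure.datalake.store.IfExists;
import com.microsoft.azure.datalake.store.oauth2.ClientCredsTokenProvider;

import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipOnTheFly {
    public static void main(String[] args) throws Exception {
        ADLStoreClient client = ADLStoreClient.createClient(
                "myaccount.azuredatalakestore.net",                 // placeholder account
                new ClientCredsTokenProvider(
                        "https://login.microsoftonline.com/<tenant>/oauth2/token",
                        "<client-id>", "<client-secret>"));         // placeholder credentials

        String[] sources = { "/data/a.csv", "/data/b.csv" };        // files already in the store

        // Zip entries are streamed store-to-store; nothing touches the local disk.
        try (OutputStream raw = client.createFile("/archives/bundle.zip", IfExists.OVERWRITE);
             ZipOutputStream zip = new ZipOutputStream(raw)) {
            for (String path : sources) {
                zip.putNextEntry(new ZipEntry(path.substring(path.lastIndexOf('/') + 1)));
                try (InputStream in = client.getReadStream(path)) {
                    in.transferTo(zip); // Java 9+; copies the file bytes into the entry
                }
                zip.closeEntry();
            }
        }
    }
}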
Hope it helps!

How to handle Blobs in Google App Engine as an alternative to Cloud Storage and the BlobStore API

The BlobStore API is marked as 'superseded' and is also limited to 32 MB.
Google Cloud Storage is a vendor lock-in.
Is there a way to upload blobs with a 3rd-party lib
in Google App Engine (not flexible / managed VMs), for example JClouds?
And how would one bypass the 60-second request limit that causes a DeadlineExceededException?
To enhance the question:
Security is an issue; it would be preferable to run every request through the application, including blob uploads, which makes the 60 seconds an issue.
The separate uploadUrl is an option, but I do not wish to use BlobStore or Cloud Storage; is there a generic way to handle things like this in GAE?
32 MB is not a limitation of the BlobStore, but rather of request payloads that go to your GAE app. You can upload larger files to both Cloud Storage and the BlobStore by creating a temporary URL for the user to submit the file to; such an upload does not go through your app, but rather goes directly to the storage service. You can find documentation about that for the BlobStore here; a minimal sketch of the flow follows. I don't personally use Cloud Storage, so I don't have a documentation link handy.
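For the record, here is a minimal sketch of that temporary-URL flow with the Blobstore API; the servlet paths and the form field name are illustrative.

import com.google.appengine.api.blobstore.BlobKey;
import com.google.appengine.api.blobstore.BlobstoreService;
import com.google.appengine.api.blobstore.BlobstoreServiceFactory;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.List;
import java.util.Map;

// Serves a form whose action is a one-time upload URL; the browser POSTs the
// file straight to the Blobstore, so your app never sees the payload.
public class UploadFormServlet extends HttpServlet {
    private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String uploadUrl = blobstore.createUploadUrl("/upload-done"); // illustrative callback path
        resp.setContentType("text/html");
        resp.getWriter().println(
                "<form action='" + uploadUrl + "' method='post' enctype='multipart/form-data'>"
                + "<input type='file' name='file'><input type='submit'></form>");
    }
}

// Invoked by the Blobstore after the upload completes; only metadata arrives here.
class UploadDoneServlet extends HttpServlet {
    private final BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Map<String, List<BlobKey>> uploads = blobstore.getUploads(req);
        BlobKey key = uploads.get("file").get(0); // "file" matches the form field name
        resp.sendRedirect("/serve?blob-key=" + key.getKeyString());
    }
}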
You can certainly use any other service in a similar way, but I'm afraid I can't explain how other than to say "consult their documentation". I know that's not a great answer to your question, but maybe insight into how it works with Google's products will help you understand how to use a 3rd party as well.
As for the 60 second request limit: since your upload requests cannot go through your server anyway, this is a non-issue. The 60 second limit only applies to requests made directly to your app.

Azure storage blob upload from URL

Is there a way to do this?
I have plenty of files across a few servers and Amazon S3 storage, and I need to upload them to Azure from an app (Java / Ruby).
I prefer not to download these files to my app server and then upload them to Azure Blob Storage.
I've checked the Java and Ruby SDKs; it seems there's no direct way to do this based on the examples (meaning I would have to download the files to my app server first and then upload them to Azure).
Update:
Just found out about CloudBlockBlob.startCopy() in the Java SDK.
Tried it, and it's basically what I want, without using third-party tools like AzCopy.
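For reference, here is a minimal sketch of that startCopy() route with the Azure Storage Java SDK; the connection string, container, blob name, and source URL are placeholders, and the source must be reachable by Azure (public, or carrying auth such as an S3 pre-signed URL).

import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;

import java.net.URI;

public class CopyFromUrl {
    public static void main(String[] args) throws Exception {
        CloudStorageAccount account =
                CloudStorageAccount.parse(System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
        CloudBlobContainer container =
                account.createCloudBlobClient().getContainerReference("mycontainer");
        CloudBlockBlob blob = container.getBlockBlobReference("copied-file.bin");

        // Azure pulls the source server-side; nothing passes through this app.
        String copyId = blob.startCopy(new URI("https://mybucket.s3.amazonaws.com/file.bin"));
        System.out.println("Copy started, id=" + copyId);
        // The copy is asynchronous: call blob.downloadAttributes() and then
        // blob.getCopyState() to poll for completion.
    }
}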
You have a few options, mostly licensed, but I think AzureCopy is your best free alternative. You can find a step-by-step walkthrough on the MSDN Blogs.
All you need is your Access Keys for both services and with a simple command:
azurecopy -i https://mybucket.s3-us-west-2.amazonaws.com/ -o https://mystorage.blob.core.windows.net/mycontainer -azurekey %AzureAccountKey% -s3k %AWSAccessKeyID% -s3sk %AWSSecretAccessKeyID% -blobcopy -destblobtype block
You can pass blobs from one container to the other.
As @EmilyGerner said, AzCopy is the official Microsoft tool, and AzureCopy, which @MatiasQuaranta mentioned, is a third-party tool on GitHub: https://github.com/kpfaulkner/azurecopy.
The simple way is to use the AWS Command Line Interface and AzCopy to copy all files from S3 to a local directory and then on to Azure Blob Storage. You can refer to my answer in the other thread, Migrating from Amazon S3 to Azure Storage (Django web app). But it is only suitable for a bucket with a small amount of data.
The other effective way is programming against the SDKs of Amazon S3 and Azure Blob Storage for Java. In my experience, the Azure SDK APIs for Java are similar to C#'s, so you can refer to the Azure Blob Storage getting-started doc for Java and the AWS SDK for Java, and follow @GauravMantri's sample code, rewriting it in Java.

How to put object to S3 via CloudFront

I'd like to upload an image to S3 via CloudFront.
If you look at the documentation about CloudFront, you can find that CloudFront offers the PUT method for uploading through CloudFront.
Someone may ask why I would use CloudFront for uploading to S3; if you search around, you can find the reasons.
What I want to ask is whether there is a method in the SDK for uploading via CloudFront or not.
As you know, there is the putObject method for uploading directly to S3, but I can't find one for uploading via CloudFront.
Please help me.
Data can be sent through Amazon CloudFront to the back-end "origin". This is typically used when a web form does a POST, to send information back to the web server; it can also be used to POST data to Amazon S3.
If you would rather use an SDK to upload data to Amazon S3, there is no benefit in sending it "via CloudFront". Instead, use the Amazon S3 APIs to upload the data directly to S3 (a minimal sketch follows the summary below).
So, bottom line:
If you're uploading from a web page that was initially served via CloudFront, send it through CloudFront to S3
If you're calling an API, call S3 directly
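Here is a minimal sketch of the direct route with the AWS SDK for Java; the bucket, key, and file are placeholders.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.io.File;

public class DirectPut {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // Uploads straight to the bucket; no CloudFront involved.
        s3.putObject("my-bucket", "images/photo.jpg", new File("photo.jpg"));
    }
}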
If the bucket's region is far away from the uploading computer, you can upload faster by enabling S3 Transfer Acceleration, which uploads through the Amazon edge server located closest to you and then sends the file on to the bucket's actual region over an optimized route.
Have a look here.
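If you go that route, here is a hedged sketch of an accelerate-enabled client; Transfer Acceleration must first be turned on for the bucket, and the putObject call is the same as above.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.io.File;

public class AcceleratedPut {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withAccelerateModeEnabled(true) // routes uploads via the nearest edge
                .build();
        s3.putObject("my-bucket", "images/photo.jpg", new File("photo.jpg"));
    }
}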
