Is there a way to do this?
I have plenty of files spread across a few servers and Amazon S3 storage, and I need to upload them to Azure from an app (Java / Ruby).
I would prefer not to download these files to my app server and then upload them to Azure Blob Storage.
I've checked the Java and Ruby SDKs, and based on the examples there seems to be no straightforward way to do this (meaning I would have to download the files to my app server first and then upload them to Azure).
Update:
Just found out about CloudBlockBlob.startCopy() in the Java SDK.
Tried it, and it's basically what I want, without using third-party tools like AzCopy.
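In case it helps others, roughly what I ended up with (the connection string, container, blob name, and source URL below are placeholders; the S3 object has to be publicly readable or the URL presigned so Azure can fetch it):

import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobClient;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;
import com.microsoft.azure.storage.blob.CopyStatus;
import java.net.URI;

public class S3ToAzureCopy {
    public static void main(String[] args) throws Exception {
        CloudStorageAccount account = CloudStorageAccount.parse(
                System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
        CloudBlobClient client = account.createCloudBlobClient();
        CloudBlobContainer container = client.getContainerReference("mycontainer");
        container.createIfNotExists();

        CloudBlockBlob target = container.getBlockBlobReference("myfile.dat");

        // The source URL must be reachable by the Azure Storage service itself,
        // e.g. a public S3 object or a presigned S3 URL.
        target.startCopy(new URI("https://mybucket.s3.amazonaws.com/myfile.dat"));

        // The copy runs server-side; poll the copy state until it finishes.
        do {
            Thread.sleep(1000);
            target.downloadAttributes();
        } while (target.getCopyState().getStatus() == CopyStatus.PENDING);
    }
}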
You have a few options, mostly licensed, but I think AzureCopy is your best free alternative. You can find a step-by-step walkthrough on the MSDN Blogs.
All you need are the access keys for both services. Then, with a simple command:
azurecopy -i https://mybucket.s3-us-west-2.amazonaws.com/ -o https://mystorage.blob.core.windows.net/mycontainer -azurekey %AzureAccountKey% -s3k %AWSAccessKeyID% -s3sk %AWSSecretAccessKeyID% -blobcopy -destblobtype block
you can copy the blobs from the S3 bucket straight into the Azure container.
As @EmilyGerner said, AzCopy is the official Microsoft tool, and AzureCopy, which @MatiasQuaranta mentioned, is a third-party tool on GitHub: https://github.com/kpfaulkner/azurecopy.
The simple way is to use the AWS Command Line Interface and AzCopy to copy all files from S3 to a local directory and then on to Azure Blob Storage. You can refer to my answer on the other thread, Migrating from Amazon S3 to Azure Storage (Django web app). But it is only suitable for buckets with a small amount of data.
The other effective way is to program against the Amazon S3 and Azure Blob Storage SDKs for Java. In my experience, the Azure SDK APIs for Java are similar to C#'s, so you can refer to the Azure Blob Storage Get Started doc for Java and the AWS SDK for Java, and follow @GauravMantri's sample code, rewriting it in Java.
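For example, a rough sketch combining the two SDKs (AWS SDK for Java v1 and the legacy Azure Storage SDK; the bucket, key, container, and blob names are placeholders): generate a presigned URL for the private S3 object and hand it to startCopy, so the data is copied server-side and never touches your app server.

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;
import java.net.URL;
import java.util.Date;

public class PrivateS3ToAzure {
    public static void main(String[] args) throws Exception {
        // Presign a GET URL for the private S3 object (valid for 1 hour here).
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        Date expiration = new Date(System.currentTimeMillis() + 3600_000L);
        URL presigned = s3.generatePresignedUrl("mybucket", "path/myfile.dat", expiration);

        // Ask Azure Storage to pull the object from that URL on the server side.
        CloudStorageAccount account = CloudStorageAccount.parse(
                System.getenv("AZURE_STORAGE_CONNECTION_STRING"));
        CloudBlobContainer container = account.createCloudBlobClient()
                .getContainerReference("mycontainer");
        CloudBlockBlob target = container.getBlockBlobReference("myfile.dat");
        target.startCopy(presigned.toURI());
    }
}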
Related
I need to upload a file via a web form to AWS and then trigger a function that imports it into a Postgres DB. I have the file import into the DB working locally in Java, but I need it to work in the cloud.
The file upload, along with some settings (such as which table to import into), needs to be passed to a Java function that imports it into the Postgres DB.
I can upload files to an EC2 instance with PHP, but I then need to trigger a Lambda function on that file. My research suggests S3 buckets are perhaps a better solution. I'm looking for pointers to which services would be best suited.
There are two main steps in your scenario:
Step 1: Upload a file to Amazon S3
It is simple to create an HTML form that uploads data directly to an Amazon S3 bucket.
However, it is typically unwise to allow anyone on the Internet to use the form, since they might upload any number and type of files. Typically, you will want your back-end to confirm that the user is entitled to upload the file. Your back-end can then generate a presigned URL (see Upload objects using presigned URLs - Amazon Simple Storage Service), which authorizes the user to perform the upload.
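For illustration only, a minimal back-end sketch in Java with the AWS SDK v1 (the bucket name, key prefix, and expiry are assumptions); the browser then performs an HTTP PUT of the file to the returned URL:

import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;
import java.net.URL;
import java.util.Date;

public class PresignedUploadUrl {
    // Returns a URL the browser can PUT the file to, valid for 15 minutes.
    public static URL forUpload(String fileName) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        Date expiration = new Date(System.currentTimeMillis() + 15 * 60 * 1000L);
        GeneratePresignedUrlRequest request =
                new GeneratePresignedUrlRequest("my-upload-bucket", "uploads/" + fileName)
                        .withMethod(HttpMethod.PUT)
                        .withExpiration(expiration);
        return s3.generatePresignedUrl(request);
    }
}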
For some examples in various coding languages, see:
Direct uploads to AWS S3 from the browser (crazy performance boost)
File Uploads Directly to S3 From the Browser
Amazon S3 direct file upload from client browser - private key disclosure
Uploading to Amazon S3 directly from a web or mobile application | AWS Compute Blog
Step 2: Load the data into the database
When the object is created in the Amazon S3 bucket, you can configure S3 to trigger an AWS Lambda function, which can be written in the programming language of your choice.
The Bucket and Filename (Key) of the object will be passed into the Lambda function via the event parameter. The Lambda function can then:
Read the object from S3
Connect to the database
Insert the data into the desired table
It is your job to code this functionality, but you will find many examples on the Internet.
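As a rough sketch only (the table layout, environment variables, and line-per-row insert below are all assumptions, not a definitive implementation), a Java handler could look like this:

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;
import com.amazonaws.services.s3.model.S3Object;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class S3ToPostgresHandler implements RequestHandler<S3Event, String> {
    @Override
    public String handleRequest(S3Event event, Context context) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        for (S3EventNotificationRecord record : event.getRecords()) {
            String bucket = record.getS3().getBucket().getName();
            String key = record.getS3().getObject().getKey(); // may need URL-decoding

            try (S3Object object = s3.getObject(bucket, key);
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(object.getObjectContent()));
                 Connection conn = DriverManager.getConnection(
                         System.getenv("JDBC_URL"),   // e.g. jdbc:postgresql://host:5432/db
                         System.getenv("DB_USER"),
                         System.getenv("DB_PASSWORD"));
                 PreparedStatement insert = conn.prepareStatement(
                         "INSERT INTO my_table (line) VALUES (?)")) {
                String line;
                while ((line = reader.readLine()) != null) {
                    insert.setString(1, line);
                    insert.addBatch();
                }
                insert.executeBatch();
            } catch (Exception e) {
                throw new RuntimeException("Failed to import " + key, e);
            }
        }
        return "done";
    }
}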
You can use the AWS SDK in your language of choice to invoke Lambda.
Please refer to this documentation.
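For instance, a minimal Java sketch with the AWS SDK v1 (the function name and JSON payload are hypothetical placeholders):

import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.InvokeRequest;
import com.amazonaws.services.lambda.model.InvokeResult;
import java.nio.charset.StandardCharsets;

public class InvokeImportFunction {
    public static void main(String[] args) {
        AWSLambda lambda = AWSLambdaClientBuilder.defaultClient();

        // Function name and payload fields are made up for illustration.
        InvokeRequest request = new InvokeRequest()
                .withFunctionName("import-csv-to-postgres")
                .withPayload("{\"table\": \"my_table\", \"key\": \"uploads/data.csv\"}");

        InvokeResult result = lambda.invoke(request);
        System.out.println(new String(result.getPayload().array(), StandardCharsets.UTF_8));
    }
}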
I'm using Azure Data Lake Store as a storage service for my Java app. Sometimes I need to compress multiple files; what I do for now is copy all the files to the server, compress them locally, and then send the zip to Azure. Even though this works, it takes a lot of time, so I'm wondering whether there is a way to compress files directly on Azure. I checked the Data Lake Store SDK, but there's no such functionality.
Unfortunately, at the moment there is no option to do that sort of compression.
There is an open feature request HTTP compression support for Azure Storage Services (via Accept-Encoding/Content-Encoding fields) that discusses uploading compressed files to Azure Storage, but there is no estimation on when this feature might be released.
The only option for you is to implement such a mechanism on your own (using an Azure Function for example).
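If it helps, here is a rough sketch of such a mechanism in Java, streaming the source files into a zip that is written straight back to Data Lake Store so nothing is staged on local disk (the ADLStoreClient calls and the paths are assumptions to be checked against your SDK version):

import com.microsoft.azure.datalake.store.ADLStoreClient;
import com.microsoft.azure.datalake.store.IfExists;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class AdlsZipper {
    // Streams several Data Lake Store files into a single zip written back
    // to the store, without staging anything on local disk.
    public static void zip(ADLStoreClient client, List<String> sourcePaths, String zipPath)
            throws Exception {
        try (OutputStream rawOut = client.createFile(zipPath, IfExists.OVERWRITE);
             ZipOutputStream zipOut = new ZipOutputStream(rawOut)) {
            byte[] buffer = new byte[64 * 1024];
            for (String path : sourcePaths) {
                // Use the file name (last path segment) as the zip entry name.
                zipOut.putNextEntry(new ZipEntry(path.substring(path.lastIndexOf('/') + 1)));
                try (InputStream in = client.getReadStream(path)) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        zipOut.write(buffer, 0, read);
                    }
                }
                zipOut.closeEntry();
            }
        }
    }
}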
Hope it helps!
After two days of Googling, I managed to upload a file to Google Cloud Storage using Java. Now I am having trouble downloading the same file from Google Cloud Storage using Java.
I used the BlobstoreService to upload the file. Can anybody give me suggestions on how to download from GCS?
If you want to read a file on Google Cloud Storage from an App Engine application, you need to use the Google Cloud Storage Java Client Library, or you can read it using the Blobstore API after you get a blob key for the object with the createGsBlobKey function.
Using the Google Cloud Storage Java Client Library to read/write files is fairly simple. Check out this page for more info:
https://developers.google.com/appengine/docs/java/googlecloudstorageclient/getstarted
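A minimal read sketch with that library might look like the following (the bucket and object names are placeholders):

import com.google.appengine.tools.cloudstorage.GcsFilename;
import com.google.appengine.tools.cloudstorage.GcsInputChannel;
import com.google.appengine.tools.cloudstorage.GcsService;
import com.google.appengine.tools.cloudstorage.GcsServiceFactory;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;

public class GcsReadExample {
    public static InputStream openGcsFile() throws IOException {
        GcsService gcsService = GcsServiceFactory.createGcsService();
        GcsFilename fileName = new GcsFilename("my-bucket", "path/to/file.txt");
        // The prefetching read channel buffers ahead, which suits sequential reads.
        GcsInputChannel readChannel =
                gcsService.openPrefetchingReadChannel(fileName, 0, 1024 * 1024);
        return Channels.newInputStream(readChannel);
    }
}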
Are there any open source cloud storage implementations built with GAE/J?
I've found lots of open source cloud storage projects:
Ceph S3 http://ceph.com/docs/master/radosgw/s3/
Riak CS http://basho.com/riak-cloud-storage/
However, I can't find anything for GAE. Are there any, even ones that don't easily show up in a Google search?
EDIT: I was looking for a server implementation of a cloud storage service, not a client.
Anyway, a new project that does cloud storage is starting: https://code.google.com/p/basket-stack/
I am using the Java AWS SDK to transfer large files to S3. Currently I am using the upload method of the TransferManager class to enable multi-part uploads. I am looking for a way to throttle the rate at which these files are transferred, to ensure I don't disrupt other services running on this CentOS server. Is there something I am missing in the API, or some other way to achieve this?
Without support in the API for this, one approach is to wrap the s3 command with trickle.
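For example (the jar name and rate below are made up; -u caps the upload rate in KB/s and -s runs trickle in standalone mode), you could wrap the Java process that performs the TransferManager upload:

trickle -s -u 1024 java -jar my-s3-uploader.jar

Trickle shapes the bandwidth of the wrapped process via LD_PRELOAD, so it should apply to the SDK's sockets without any code changes.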