I'm implementing a helper class to handle transfers to and from AWS S3 storage in my web application.
In the first version of my class I used an AmazonS3Client directly to handle uploads and downloads, but I have since discovered TransferManager and I'd like to refactor my code to use it.
The problem is that my download method returns the stored file as a byte[]. TransferManager, however, only has methods that use a File as the download destination (for example download(GetObjectRequest getObjectRequest, File file)).
My previous code was like this:
GetObjectRequest getObjectRequest = new GetObjectRequest(bucket, key);
S3Object s3Object = amazonS3Client.getObject(getObjectRequest);
S3ObjectInputStream objectInputStream = s3Object.getObjectContent();
byte[] bytes = IOUtils.toByteArray(objectInputStream);
Is there a way to use TransferManager in the same way, or should I simply keep using an AmazonS3Client instance?
The TransferManager uses File objects to support things like file locking when downloading pieces in parallel. It's not possible to use an OutputStream directly. If your requirements are simple, like downloading small files from S3 one at a time, stick with getObject.
Otherwise, you can create a temporary file with File.createTempFile and read the contents into a byte array when the download is done.
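A minimal sketch of that approach, assuming an existing TransferManager instance named transferManager and the same bucket and key variables as in the question (java.nio.file.Files does the final read):
File tempFile = File.createTempFile("s3-download-", ".tmp");
try {
    Download download = transferManager.download(new GetObjectRequest(bucket, key), tempFile);
    download.waitForCompletion(); // blocks until the transfer finishes (throws InterruptedException)
    byte[] bytes = Files.readAllBytes(tempFile.toPath());
    // use bytes exactly as before
} finally {
    tempFile.delete();
}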
I am using Minio's Java SDK. I have managed to copy objects within the same Minio server. Is there a way to copy objects from one Minio server to another?
I have tried using the below code:
InputStream inputStream = minioClientServer1.getObject(getBucket(), fileName);
minioClientServer2.putObject(getBucket(), fileName, inputStream, (long) inputStream.available(), null, null, contentType);
That is, I get the object from one server and then upload it to the other. The problem I'm facing is that the contentType is unknown.
Is there a way to do this without hard-coding the content type?
Or is downloading the object to a file and then uploading it a better way?
I'm not sure this applies to you, because you don't give enough information for me to be certain we're talking about the same SDK. But from what you show, you should be able to call statObject in the same way you call getObject and get an ObjectStat instance (rather than an InputStream) for a particular S3 object. Once you have the ObjectStat, call its contentType method to get the content type of the S3 object.
This should work to do what you're asking:
ObjectStat objectStat = minioClientServer1.statObject(getBucket(), fileName);
InputStream inputStream = minioClientServer1.getObject(getBucket(), fileName);
// use the real object size from statObject rather than inputStream.available()
minioClientServer2.putObject(getBucket(), fileName, inputStream, objectStat.length(), null, null, objectStat.contentType());
In my app I'm generating large PDF/CSV files. I'm wondering if there is any way to stream large files in Micronaut without keeping them fully in memory before sending them to a client.
You can use StreamedFile, e.g.:
@Get
public StreamedFile download() {
    InputStream inputStream = ... // obtain the data as a stream, e.g. from a file or a piped stream
    return new StreamedFile(inputStream, "large.csv");
}
Be sure to check the official documentation about file transfers.
I have a file to store in MongoDB. What I want is to avoid loading the whole file (which could be several MB in size) into memory; instead I want to open a stream and direct it to MongoDB to keep the write operation performant. I don't mind storing the content as a base64-encoded byte[].
Afterwards I want to do the same when reading the file, i.e. not load the whole file into memory but read it as a stream.
I am currently using hibernate-ogm with a Vert.x server, but I am open to switching to a different API if it serves the purpose efficiently.
I actually want to store a document with several fields and several attachments.
You can use GridFS. Especially when you need to store larger files (>16 MB), this is the recommended approach:
// "db" is the com.mongodb.DB instance you are already connected to; "zips" is the GridFS bucket name
File f = new File("sample.zip");
GridFS gfs = new GridFS(db, "zips");
GridFSInputFile gfsFile = gfs.createFile(f);
gfsFile.setFilename(f.getName());
gfsFile.setId(id); // any unique id you want to look the file up by later
gfsFile.save();
Or in case you have an InputStream in:
GridFS gfs = new GridFS(db, "zips");
GridFSInputFile gfsFile = gfs.createFile(in);
gfsFile.setFilename("sample.zip");
gfsFile.setId(id);
gfsFile.save(); // the stream is read and stored in chunks, so the whole file is never held in memory
You can load a file using one of the GridFS.find methods:
GridFSDBFile gfsFile = gfs.findOne(id);
InputStream in = gfsFile.getInputStream();
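If you also want to avoid building a byte[] on the read side, GridFSDBFile can copy the stored chunks straight to an OutputStream (for example an HTTP response). A small sketch, with outputStream standing in for whatever stream you are writing to:
GridFSDBFile gfsFile = gfs.findOne(id);
gfsFile.writeTo(outputStream); // streams the chunks to the output without loading the whole file into memory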
I am a total newbie to Amazon and Java, trying two things:
I am trying to create a folder in my Amazon S3 bucket, which I have already created and have the credentials for.
I am trying to upload a file to this bucket.
As per my understanding, I can use the PutObjectRequest constructor to achieve both of my tasks.
PutObjectRequest(bucketName, keyName, file)
for uploading a file.
I am not sure if I should use this overload
PutObjectRequest(String bucketName, String key, InputStream input,
ObjectMetadata metadata)
for just creating a folder. I am struggling with InputStream and ObjectMetadata; I don't know exactly what they are for or how I should use them.
You do not need to create a folder in Amazon S3. In fact, folders do not exist!
Rather, the Key (filename) contains the full path and the object name.
For example, if a file called cat.jpg is in the animals folder, then the Key (filename) is: animals/cat.jpg
Simply Put an object with that Key and the folder is automatically created. (Actually, this isn't true because there are no folders, but it's a nice simple way to imagine the concept.)
As to which function to use... always use the simplest one that meets your needs. Therefore, just use PutObjectRequest(bucketName, keyName, file).
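For example (a sketch assuming an AmazonS3 client named s3Client and a placeholder bucket named my-bucket):
// the key contains the "folder" path; the folder shows up in the console automatically
PutObjectRequest request = new PutObjectRequest("my-bucket", "animals/cat.jpg", new File("cat.jpg"));
s3Client.putObject(request);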
Yes, you can use PutObjectRequest to achieve both tasks.
1. Creating an S3 folder
With the AWS S3 Java SDK, just add "/" at the end of the key name and it will create an empty folder:
String folderKey = key + "/"; // end the key name with "/"
Sample code:
final InputStream im = new InputStream() {
    @Override
    public int read() throws IOException {
        return -1; // an empty stream: the folder marker is just a zero-byte object
    }
};
final ObjectMetadata om = new ObjectMetadata();
om.setContentLength(0L);
PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, folderKey, im, om);
s3.putObject(putObjectRequest);
2. Uploading a file
Similarly, you can get an input stream from your local file; see the sketch below.
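A rough sketch, reusing the s3 client and bucketName from the folder example above (the local file name and key are placeholders), to be called from a method that declares IOException:
File localFile = new File("cat.jpg");
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(localFile.length()); // required when uploading from a stream
try (InputStream fileStream = new FileInputStream(localFile)) {
    s3.putObject(new PutObjectRequest(bucketName, "animals/cat.jpg", fileStream, metadata));
}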
Alternatively, you can use the Minio client Java library.
You can follow the MakeBucket.java example to create a bucket and the PutObject.java example to add an object.
Hope it helps.
I'm trying to generate a PDF document from an uploaded ".docx" file using JODConverter.
The call to the method that generates the PDF looks something like this:
File inputFile = new File("document.doc");
File outputFile = new File("document.pdf");
// connect to an OpenOffice.org instance running on port 8100
OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
connection.connect();
// convert
DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
converter.convert(inputFile, outputFile);
// close the connection
connection.disconnect();
I'm using Apache Commons FileUpload to handle uploading the .docx file, from which I can get an InputStream object. I'm aware that java.io.File is just an abstract reference to a file in the file system.
I want to avoid the disk write (saving the InputStream to disk) and the disk read (reading the saved file back in JODConverter).
Is there any way I can get a File object referring to an input stream? Any other way to avoid disk I/O will also do!
EDIT: I don't care if this ends up using a lot of system memory. The application is going to be hosted on a LAN with very few (or zero) parallel users.
File-based conversions are faster than stream-based ones (provided by StreamOpenOfficeDocumentConverter), but they require the OpenOffice.org service to be running locally and to have the correct permissions on the files.
To avoid writing to disk, try the stream-based method:
convert(java.io.InputStream inputStream, DocumentFormat inputFormat, java.io.OutputStream outputStream, DocumentFormat outputFormat)
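A sketch of what that could look like with JODConverter 2.x, assuming DefaultDocumentFormatRegistry knows the "docx" and "pdf" extensions in your version (older registries may not list docx), with inputStream coming from Commons FileUpload and outputStream being, for instance, the servlet response stream:
OpenOfficeConnection connection = new SocketOpenOfficeConnection(8100);
connection.connect();
DocumentFormatRegistry registry = new DefaultDocumentFormatRegistry();
DocumentFormat docxFormat = registry.getFormatByFileExtension("docx");
DocumentFormat pdfFormat = registry.getFormatByFileExtension("pdf");
DocumentConverter converter = new StreamOpenOfficeDocumentConverter(connection);
converter.convert(inputStream, docxFormat, outputStream, pdfFormat);
connection.disconnect();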
There is no way to do it and keep the code solid. For one, the convert() method you are calling only takes two Files as arguments.
So this would mean you'd have to extend File, which is possible in theory but very fragile, as you are required to delve into the library code, which can change at any time and break your extended class.
(Well, there is a way to avoid disk writes if you use a RAM-backed filesystem and read/write from that filesystem, of course.)
Chances are that Commons FileUpload has written the upload to the filesystem anyway.
Check whether your FileItem is an instance of DiskFileItem. If it is, the write implementation of DiskFileItem will try to move the temporary file to the File you pass, so you are not causing any extra disk I/O; the write has already happened.
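A small sketch of that check, assuming fileItem is the FileItem you got from Commons FileUpload (write declares a checked Exception, so handle or rethrow it):
if (fileItem instanceof DiskFileItem) {
    // DiskFileItem.write normally just renames its temp file, so no extra copy is made
    File inputFile = File.createTempFile("upload-", ".docx");
    fileItem.write(inputFile);
    // pass inputFile to JODConverter exactly as in the original code
}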