How to load and save an entire file from the blobstore, using only a BlobKey? - java

I want to let the user edit a file that was uploaded to the server. After saving it to the blobstore, I want to load it into memory for editing. After storing the file I got a BlobKey in return. As I understand it, I should use the following method to load it into memory:
byte[] BlobstoreService.fetchData(BlobKey blobKey, long startIndex, long endIndex)
The problem is that I do not know how big the file is, so I do not know what to pass as the endIndex argument. How, having only a BlobKey, do I load a file from the blobstore, change it, save the new version, and receive the new BlobKey of the changed file?

From the javadoc reference, it seems you can load a BlobInfo object that contains the size. You just need to call:
BlobInfoFactory bif = new BlobInfoFactory();
BlobInfo bi = bif.loadBlobInfo(blobKey);
long size = bi.getSize();
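For completeness, here is a minimal sketch of reading the whole blob once the size is known. It assumes the App Engine Blobstore API; note that fetchData is capped at BlobstoreService.MAX_BLOB_FETCH_SIZE bytes per call, so larger blobs have to be read in chunks:
BlobstoreService blobstore = BlobstoreServiceFactory.getBlobstoreService();
long size = new BlobInfoFactory().loadBlobInfo(blobKey).getSize();
ByteArrayOutputStream out = new ByteArrayOutputStream((int) size);
long start = 0;
while (start < size) {
    // endIndex is inclusive, and each call returns at most MAX_BLOB_FETCH_SIZE bytes
    long end = Math.min(start + BlobstoreService.MAX_BLOB_FETCH_SIZE - 1, size - 1);
    byte[] chunk = blobstore.fetchData(blobKey, start, end);
    out.write(chunk, 0, chunk.length);
    start = end + 1;
}
byte[] wholeFile = out.toByteArray();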

Related

Most practical way to read an Azure Blob (PDF) in the Cloud?

I'm somewhat of a beginner and have never dealt with cloud-based solutions before.
My program uses the PDFBox library to extract data from PDFs and rename the file based on the data. It's all local currently, but eventually will need to be deployed as an Azure Function. The PDFs will be stored in an Azure Blob Container - the Azure Blob Storage trigger for Azure Functions is an important reason for this choice.
Of course I can download the blob locally and read it, but the program should run solely in the cloud. I've tried reading the blobs directly using Java, but this resulted in gibberish data that wasn't compatible with PDFBox. My plan for now is to temporarily store the files elsewhere in the cloud (e.g. OneDrive, Azure File Storage) and try opening them from there. However, this seems like it could quickly turn into an overly messy solution. My questions:
(1) Is there any way a blob can be opened as a File, rather than a CloudBlockBlob so this additional step isn't needed?
(2) If not, what would be a recommended temporary storage in this case?
(3) Are there any alternative ways to approach this issue?
Since you are planning an Azure Function, you can use the blob trigger/binding to get the bytes directly. Then you can use PDFBox's PDDocument.load(content) method to build the PDDocument object directly. You won't need any temporary storage to load the file.
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import com.microsoft.azure.functions.ExecutionContext;
import com.microsoft.azure.functions.annotation.*;

@FunctionName("blobprocessor")
public void run(
    @BlobTrigger(name = "file",
        dataType = "binary",
        path = "myblob/{name}",
        connection = "MyStorageAccountAppSetting") byte[] content,
    @BindingName("name") String filename,
    final ExecutionContext context
) throws IOException {
    context.getLogger().info("Name: " + filename + " Size: " + content.length + " bytes");
    // PDFBox builds the document straight from the bytes bound by the trigger
    PDDocument doc = PDDocument.load(content);
    // do your work here
}

How can I read a Base64 file that comes as a string?

I am currently developing a REST service which receives a field in its request containing a file in Base64 format (a string of "n" characters). Within the service logic I convert that character string to a File and save it to a predetermined path.
The problem is that when the file is large (3MB) the service becomes slow and takes a long time to respond.
This is the code I am using:
String filename = "TEXT.DOCX";
BufferedOutputStream stream = null;
// THE FIELD base64file IS THE BASE64 STRING THAT COMES FROM THE REQUEST
byte[] fileByteArray = java.util.Base64.getDecoder().decode(base64file);
// VALIDATE FILE SIZE
if (1 * 1024 * 1024 < fileByteArray.length) {
    logger.info("The file [" + filename + "] is too large");
} else {
    stream = new BufferedOutputStream(new FileOutputStream(new File("C:\\" + filename)));
    stream.write(fileByteArray);
}
How can I avoid this problem, so that my service does not take so long to convert the string to a File?
Buffering does not improve your performance here, as all you are trying to do is write the file as fast as possible. Generally it looks fine; change your code to use the FileOutputStream directly and see if it improves things:
try (FileOutputStream stream = new FileOutputStream(path)) {
    stream.write(bytes);
}
Alternatively you could try using something like Apache Commons IO to do the task for you:
FileUtils.writeByteArrayToFile(new File(path), bytes);
Try the following, also for large files.
Path outPath = Paths.get(filename);
try (InputStream in = Base64.getDecoder().wrap(base64file)) {
    Files.copy(in, outPath);
}
This keeps only a small buffer in memory. Your code might be slow because it holds the entire decoded file in memory.
Note that wrap takes an InputStream, which you should provide, rather than the entire String.
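If you only have the Base64 String, a sketch of the streaming variant could look like this (it still holds the encoded String in memory, but avoids materializing the decoded byte[] yourself):
Path outPath = Paths.get(filename);
InputStream encoded = new ByteArrayInputStream(base64file.getBytes(StandardCharsets.US_ASCII));
try (InputStream in = Base64.getDecoder().wrap(encoded)) {
    Files.copy(in, outPath);
}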
From a network point of view:
Both JSON and XML can support large amounts of data, and 3MB is not really huge. But there is a limit to how much a browser can handle (if this call comes from a user interface).
Also, a web server like Tomcat only accepts POST bodies up to 2MB by default (check maxPostSize: http://tomcat.apache.org/tomcat-6.0-doc/config/http.html#Common_Attributes)
You can also try chunking the request payload (although that shouldn't be required for a 3MB file).
From an implementation point of view:
Write operations on your disk could be slow; this also depends on your OS.
If your file size is really large, you can use Java's FileChannel class with a ByteBuffer, as sketched below.
To find the cause of the slowness (network delay or code), compare the performance of a simple local Java program against the web service call.
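For reference, a minimal sketch of the FileChannel approach mentioned above, reusing the decoded fileByteArray from the question:
try (FileChannel channel = FileChannel.open(Paths.get("C:\\" + filename),
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
    // wrap() reuses the decoded array instead of copying it
    ByteBuffer buffer = ByteBuffer.wrap(fileByteArray);
    while (buffer.hasRemaining()) {
        channel.write(buffer);
    }
}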

Java - when there is no need of writing to file - should I use input or output stream?

I'm getting an InputStream and file metadata from the client, and saving them in my SQL table. This table also holds the full file path and a unique uid.
I want to be able to pass a uid and get a "handle" to the file, but can't seem to understand whether I need to return an OutputStream, an InputStream, or a File.
Which one should be returned?
I want this handler for the client for the following reasons:
The user will pass it to another function
The user will decide to convert stream to a file and copy it to some local path
Also, when returning an OutputStream, is it enough to do the following:
OutputStream out = new FileOutputStream(PATH_TO_MY_FILE);
return out;
Am I returning an empty stream? Does out contain all the file data?
I thought maybe the best way would be to return a File:
File f = new File(PATH_TO_MY_FILE);
return f;
Edit:
My metadata holds the file name and file type. When I get the InputStream I save it in my folder and set the path in the SQL table to: folderPath + "/" + filename + "." + fileType
When the user runs the function get(fileUid), I want to retrieve the full path (using an SQL query) and return the file handle.
Can you please advise?
Thanks
The user will decide to convert stream to a file and copy it to some local path
This tells us that what you need to give them is an InputStream (or Reader), since they'll be reading from it.
Your code will be reading from your database or whatever, presumably via the InputStream you get back from ResultSet#getBinaryStream or similar. You might give that directly to the caller, or you may prefer to have your code in the middle, perhaps working through a memory buffer.
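If the bytes really were stored in the database, a minimal sketch could look like the following (the files table and its data BLOB column are hypothetical):
public InputStream getFromDb(Connection conn, String fileUid) throws SQLException, IOException {
    try (PreparedStatement ps = conn.prepareStatement("SELECT data FROM files WHERE uid = ?")) {
        ps.setString(1, fileUid);
        try (ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                throw new IOException("No file for uid " + fileUid);
            }
            // buffer the bytes so the returned stream outlives the closed connection
            try (InputStream in = rs.getBinaryStream("data")) {
                return new ByteArrayInputStream(in.readAllBytes());
            }
        }
    }
}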
Re your comment below:
I'm saving the file at some DB folder...
Databases don't have folders; file systems have folders. It sounds like the file isn't stored in your database table, just the path to it. If so, use FileInputStream with the path to get an InputStream for it, which you can return to the caller.
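In your case that could be as small as this sketch (lookupPathByUid is a hypothetical helper that runs your SQL query):
public InputStream get(String fileUid) throws IOException, SQLException {
    String fullPath = lookupPathByUid(fileUid); // folderPath + "/" + filename + "." + fileType
    return new FileInputStream(fullPath);       // the caller reads from it and closes it
}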

Uploading a image to picasa, using a byte array

I have a byte[] of an image and I need to upload it as an image to picasa.
According to the documentation, an image is uploaded as follows.
MediaFileSource myMedia = new MediaFileSource(new File("lights.jpg"), "image/jpeg");
which means I need to create a File out of the byte[].
The catch is that I have to do this without using FileOutputStream, as it is not supported by Google App Engine (which is the environment I am using).
Is there any way to do this?
You don't have to use MediaFileSource to upload a photo, you can use MediaByteArraySource and pass it to photo.setMediaSource(...).
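A minimal sketch, assuming the gdata Picasa client (imageBytes, picasawebService, and albumPostUrl stand in for your own variables):
PhotoEntry photo = new PhotoEntry();
photo.setTitle(new PlainTextConstruct("lights.jpg"));
// MediaByteArraySource wraps the byte[] directly, so no File is needed
photo.setMediaSource(new MediaByteArraySource(imageBytes, "image/jpeg"));
PhotoEntry inserted = picasawebService.insert(albumPostUrl, photo);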

JSP download file size

We are running Tomcat, and we are generating PDF files on the fly. I do not have the file size beforehand, so I cannot directly link to a file on the server. So I send the output directly.
response.setContentType("application/force-download");
OutputStream o = response.getOutputStream();
And then I directly output to this OutputStream.
The only problem is that the receiver does not get the file size, so they do not know how long the download will take. Is there a way to tell the response how large the file is?
EDIT
I do know the file size, I just can't tell the STREAM how big the file is.
The response object should have a setContentLength method:
// Assumes response is a ServletResponse
response.setContentLength(sizeHere);
Serialize the PDF byte stream to a file or a byte array, calculate its size, set the size, and write it to the output stream.
I believe you're answering the question yourself:
quote:
I do not have the file size beforehand, so I directly send the output.
If you don't have the size, you can't send it.
Why not generate the PDF into a temp file, a RAM-based file system, or a memory-mapped file on the fly? Then you can get the file size.
response.setContentType("application/force-download");
response.setContentLength(sizeHere);
OutputStream o = response.getOutputStream();
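Putting it together, a minimal sketch of the buffer-then-send approach (generatePdf stands in for however you produce the PDF on the fly):
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
generatePdf(buffer); // hypothetical: render the PDF into memory first

response.setContentType("application/force-download");
response.setContentLength(buffer.size()); // now the size is known
buffer.writeTo(response.getOutputStream());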
