File from InputStream - java

Yes, this question has been asked a million times, and I believe I've looked at them all. They are very "sometimesy", slow, or not what I need.
On one project, I use the following code to use the InputStream received from a GET to turn that into a PDF. This works PERFECTLY, every time, on my physical device and my emulator (Genymotion 2.1.1, Emulator API 18 4.3). Note that some things are edited out, and the PDFs are generally small, less than 1 MB.
public abstract class MyPDFFile extends File implements ApiModel{
public MyPDFFile(InputStream inputStream){
super(context.getExternalFilesDir(
Environment.DIRECTORY_DOWNLOADS), "my_pdf.pdf");
if (externalStorageIsWritable()) {
try {
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
FileOutputStream fileInputStream = new FileOutputStream(this);
BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fileInputStream));
int readLine;
char[] cbuf = new char[1];
do {
readLine = bufferedReader.read(cbuf);
bufferedWriter.write(cbuf);
} while (readLine != -1);
bufferedWriter.close();
}
catch(IOException e){
// Didn't work
}
}
else{
// Cant write
}
}
I figured that on this new project I could use the same code to download an APK from the internet to the device. Nope, definitely not the case. I eventually tried this for InputStream to File:
FileOutputStream fileOutputStream = new FileOutputStream(file);
byte[] buffer = new byte[1];
while ( (read(buffer)) > 0 ) {
fileOutputStream.write(buffer);
}
fileOutputStream.close();
close();
That works on my emulator, and works fine. I moved to testing on my device... not so much, which is weird, because the working PDF code works on both my emulator and device. I've tried adjusting the size of my buffer to various multiples of 512 (which results in the file being EXTREMELY small, like a few KB, to being EXTREMELY large, about double the apk size, which is about 5.6 MB).
Also, another weird thing: I can NEVER get it to successfully save outside of the constructor. When I do the saving there, the InputStream is fine, my file gets created, yadayada, and when I use successful code, I just rename the file afterwards since all I have access to in the constructor is the InputStream. If I decide "No, I want to name it when I have the proper things" and simply save the InputStream to my object, it NEVER works properly. Can never get above 4KB for the downloaded file. I've tried extends InputStream and extends BufferedInputStream to no avail.
I can post more code if needed. All I would have access to is the InputStream from my GET request; I'm using the browep Android HTTP library and that's all I can get without trying to mess with the library itself (or overriding methods in it).

The problem is that you're reading the file byte by byte, which can take a ton of time. Instead, read the file in bigger chunks, like 4 or 8 KB, and write only the number of bytes that each read actually returned:
int file_chunk_size = 1024 * 4; //4KBs, written like this to easily change it to 8
byte[] buffer = new byte[file_chunk_size];
int bytesRead = 0;
while ( (bytesRead = read(buffer)) > 0 ) {
fileOutputStream.write(buffer, 0, bytesRead);
}
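The wildly varying file sizes described in the question are consistent with writing the whole buffer after every read, even when read() filled only part of it. Below is a minimal sketch of the difference, using in-memory streams as stand-ins for the network stream and the file. The class name, the method names, and the shortReads helper are made up for illustration; the helper simulates a stream that returns fewer bytes than requested, as a socket may do.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyDemo {
    // Hypothetical stand-in for a network stream: never returns more than
    // maxChunk bytes per read call.
    static InputStream shortReads(byte[] data, final int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }

    // Buggy: writes the full buffer regardless of how many bytes were read.
    static void copyIgnoringCount(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        while (in.read(buffer) > 0) {
            out.write(buffer);
        }
    }

    // Correct: writes only the bytes that read() reported.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            out.write(buffer, 0, bytesRead);
        }
    }
}
```

With 10000 bytes of input delivered 1000 bytes at a time, copy produces exactly 10000 bytes, while copyIgnoringCount produces 40960, because each of the ten reads is followed by a 4096-byte write.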

Related

Copied DocumentFile has different size and hash to original

I'm attempting to copy / duplicate a DocumentFile in an Android application, but upon inspecting the created duplicate, it does not appear to be exactly the same as the original (which is causing a problem, because I need to do an MD5 check on both files the next time a copy is called, so as to avoid overwriting the same files).
The process is as follows:
User selects a file from an ACTION_OPEN_DOCUMENT_TREE
Source file's type is obtained
New DocumentFile in target location is initialised
Contents of first file are duplicated into second file
The initial stages are done with the following code:
// Get the source file's type
String sourceFileType = MimeTypeMap.getSingleton().getExtensionFromMimeType(contextRef.getContentResolver().getType(file.getUri()));
// Create the new (empty) file
DocumentFile newFile = targetLocation.createFile(sourceFileType, file.getName());
// Copy the file
CopyBufferedFile(new BufferedInputStream(contextRef.getContentResolver().openInputStream(file.getUri())), new BufferedOutputStream(contextRef.getContentResolver().openOutputStream(newFile.getUri())));
The main copy process is done using the following snippet:
void CopyBufferedFile(BufferedInputStream bufferedInputStream, BufferedOutputStream bufferedOutputStream)
{
// Duplicate the contents of the temporary local File to the DocumentFile
try
{
byte[] buf = new byte[1024];
bufferedInputStream.read(buf);
do
{
bufferedOutputStream.write(buf);
}
while(bufferedInputStream.read(buf) != -1);
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
if (bufferedInputStream != null) bufferedInputStream.close();
if (bufferedOutputStream != null) bufferedOutputStream.close();
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
The problem that I'm facing is that although the file copies successfully and is usable (it's a picture of a cat, and it's still a picture of a cat in the destination), it is slightly different.
The file size has changed from 2261840 to 2262016 (+176)
The MD5 hash has changed completely
Is there something wrong with my copying code that is causing the file to change slightly?
Thanks in advance.
Your copying code is incorrect. It is assuming (incorrectly) that each call to read will either return buffer.length bytes or return -1.
What you should do is capture the number of bytes read in a variable each time, and then write exactly that number of bytes. Your code for closing the streams is verbose and (in theory [1]) buggy as well.
Here is a rewrite that addresses both of those issues, and some others as well.
void copyBufferedFile(BufferedInputStream bufferedInputStream,
BufferedOutputStream bufferedOutputStream)
throws IOException
{
try (BufferedInputStream in = bufferedInputStream;
BufferedOutputStream out = bufferedOutputStream)
{
byte[] buf = new byte[1024];
int nosRead;
while ((nosRead = in.read(buf)) != -1) // read this carefully ...
{
out.write(buf, 0, nosRead);
}
}
}
As you can see, I have gotten rid of the bogus "catch and squash exception" handlers, and fixed the resource leak using Java 7+ try with resources.
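The size change reported in the question fits this explanation. Assuming every read except the last filled the 1024-byte buffer, the original loop also wrote a full 1024 bytes for the final, partial read, which rounds the output up to the next multiple of 1024. A small worked check (the class and method names are only for illustration):

```java
class PaddedSize {
    // Size of the output when every write emits a full buffer, assuming each
    // read except possibly the last one fills the buffer completely.
    static long sizeWhenWritingFullBuffers(long actualSize, int bufferSize) {
        long writes = (actualSize + bufferSize - 1) / bufferSize; // round up
        return writes * bufferSize;
    }

    public static void main(String[] args) {
        long original = 2261840L;
        long copied = sizeWhenWritingFullBuffers(original, 1024);
        // prints 2262016 (+176)
        System.out.println(copied + " (+" + (copied - original) + ")");
    }
}
```

The last read returned 848 bytes (2261840 = 2208 * 1024 + 848), so the extra 176 bytes are stale data left in the buffer from the previous read.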
There are still a couple of issues:
It is better for the copy function to take file name strings (or File or Path objects) as parameters and be responsible for opening the streams.
Given that you are doing block reads and writes, there is little value in using buffered streams. (Indeed, it might conceivably be making the I/O slower.) It would be better to use plain streams and make the buffer the same size as the default buffer size used by the Buffered* classes .... or larger.
If you are really concerned about performance, try using transferFrom as described here:
https://www.journaldev.com/861/java-copy-file
1 - In theory, if bufferedInputStream.close() throws an exception, the bufferedOutputStream.close() call will be skipped. In practice, it is unlikely that closing an input stream will throw an exception. But either way, the try-with-resources approach deals with this correctly, and far more concisely.
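As a rough sketch of the last two suggestions, here is a copy function that takes Path arguments, opens the channels itself, and uses FileChannel.transferFrom. It assumes both ends are ordinary files reachable through a Path, which is not the case for the content URIs behind a DocumentFile; the class name is hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelCopy {
    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferFrom may move fewer bytes than requested, so loop
            // until the whole source has been transferred.
            while (position < size) {
                long moved = out.transferFrom(in, position, size - position);
                if (moved <= 0) {
                    throw new IOException("Source ended after " + position + " bytes");
                }
                position += moved;
            }
        }
    }
}
```

For plain files, Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING) is a shorter alternative when no control over the transfer loop is needed.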

Java 7 Deflating Files

I have a piece of code which uses the deflate algorithm to compress a file:
public static File compressOld(File rawFile) throws IOException
{
File compressed = new File(rawFile.getCanonicalPath().split("\\.")[0]
+ "_compressed." + rawFile.getName().split("\\.")[1]);
InputStream inputStream = new FileInputStream(rawFile);
OutputStream compressedWriter = new DeflaterOutputStream(new FileOutputStream(compressed));
byte[] buffer = new byte[1000];
int length;
while ((length = inputStream.read(buffer)) > 0)
{
compressedWriter.write(buffer, 0, length);
}
inputStream.close();
compressedWriter.close();
return compressed;
}
However, I'm not happy with the OutputStream copying loop since it's the "outdated" way of writing to streams. Instead, I want to use a Java 7 API method such as Files.copy:
public static File compressNew(File rawFile) throws IOException
{
File compressed = new File(rawFile.getCanonicalPath().split("\\.")[0]
+ "_compressed." + rawFile.getName().split("\\.")[1]);
OutputStream compressedWriter = new DeflaterOutputStream(new FileOutputStream(compressed));
Files.copy(compressed.toPath(), compressedWriter);
compressedWriter.close();
return compressed;
}
The latter method, however, does not work correctly: the compressed file is messed up and only a few bytes are copied. How come?
I see mainly two problems.
You copy from the target instead of the source. I think the copying has to be changed to Files.copy(rawFile.toPath(), compressedWriter);.
The Javadoc of copy says: "Note that if the given output stream is Flushable then its flush method may need to invoked after this method completes so as to flush any buffered output." So, you have to call the flush-method of the OutputStream after copy.
Additionally there is one more point. The Javadoc of copy says:
It is strongly recommended that the output stream be promptly closed if an I/O error occurs.
You can close the OutputStream in a finally-block to make sure it happens in case of an error. Another possibility is to use try with resources that was introduced in Java 7.
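Putting these points together, a minimal sketch might look like the following. It copies from the source file, and try-with-resources closes the DeflaterOutputStream (which also finishes the compressed data) even if the copy fails. Taking Path arguments instead of building the target name from the File is a simplification for this example, and the class name is made up.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.DeflaterOutputStream;

class DeflateCopy {
    static void compress(Path rawFile, Path compressed) throws IOException {
        try (OutputStream out =
                 new DeflaterOutputStream(Files.newOutputStream(compressed))) {
            // Copy from the SOURCE file into the compressing stream.
            Files.copy(rawFile, out);
        } // close() flushes and finishes the deflate stream
    }
}
```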

Which is the better approach for using wrapper class read() / write() methods with the android.content.res.Resources.openRawResource() method?

//Reading an image file from the drawable res folder and writing it to a file on the external SD card
//the code below works, no doubt, but I want to improve it:
OutputStream os = new FileOutputStream(file); //File file.........
InputStream is =getResources().openRawResource(R.drawable.an_image);
byte[] b = new byte[is.available()];
is.read(b);
os.write(b);
is.close();
os.close();
In the above code I am using the basic IO classes to read and write. My question is: what can I do in order to be able to use wrapper classes such as DataInputStream / BufferedReader or PrintStream / BufferedWriter / PrintWriter?
As openRawResource(int id) returns an InputStream,
to read a file from res I either need to typecast like this:
DataInputStream is = (DataInputStream) getResources().openRawResource(R.drawable.an_image);
or I can link the stream directly like this:
DataInputStream is = new DataInputStream(getResources().openRawResource(R.drawable.greenball));
and then I may do this to write it to a file on sd card:
PrintStream ps =new PrintStream (new FileOutputStream(file));
while(s=is.readLine()!=null){
ps.print(s);
}
So is that the correct approach? Which one is better? Is there a better way, a better practice or convention?
Thanks!!!
If openRawResource() is documented to return an InputStream then you cannot rely on that result to be any more specific kind of InputStream, and in particular, you cannot rely on it to be a DataInputStream. Casting does not change that; it just gives you the chance to experience interesting and exciting exceptions. If you want a DataInputStream wrapping the result of openRawResource() then you must obtain it via the DataInputStream constructor. Similarly for any other wrapper stream.
HOWEVER, do note that DataInputStream likely is not the class you want. It is appropriate for reading back data that were originally written via a DataOutputStream, but it is inappropriate (or at least offers no advantages over any other InputStream) for reading general data.
Furthermore, your use of InputStream.available() is incorrect. That method returns the number of bytes that can currently be read from the stream without blocking, which has only a weak relationship with the total number of bytes that could be read from the stream before it is exhausted (if indeed it ever is).
Moreover, your code is also on shaky ground where it assumes that InputStream.read(byte[]) will read enough bytes to fill the array. It probably will, since that many bytes were reported available, but that's not guaranteed. To copy from one stream to another, you should instead use code along these lines:
private final static int BUFFER_SIZE = 2048;
void copyStream(InputStream in, OutputStream out) throws IOException {
byte[] buffer = new byte[BUFFER_SIZE];
int nread;
while ((nread = in.read(buffer)) != -1) {
out.write(buffer, 0, nread);
}
}
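To illustrate both points outside Android, here is a small sketch in which a ByteArrayInputStream stands in for the stream returned by openRawResource(). The class and helper names are hypothetical. It shows that the cast fails at run time, while wrapping through a constructor works with any InputStream.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class WrapDemo {
    static void copyStream(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[2048];
        int nread;
        while ((nread = in.read(buffer)) != -1) {
            out.write(buffer, 0, nread);
        }
    }

    // Returns true if casting the stream to DataInputStream fails.
    static boolean castFails(InputStream raw) {
        try {
            DataInputStream unused = (DataInputStream) raw;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Wraps the raw stream via constructors and copies it into a byte array.
    static byte[] copyViaWrappers(InputStream raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream in = new BufferedInputStream(raw);
             OutputStream out = new BufferedOutputStream(sink)) {
            copyStream(in, out);
        }
        return sink.toByteArray();
    }
}
```

Since copyStream already reads and writes in blocks, the buffered wrappers add little here; they are shown only to demonstrate the constructor-based wrapping.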

Out of memory when encoding file to base64

Using Base64 from Apache commons
public byte[] encode(File file) throws FileNotFoundException, IOException {
byte[] encoded;
try (FileInputStream fin = new FileInputStream(file)) {
byte fileContent[] = new byte[(int) file.length()];
fin.read(fileContent);
encoded = Base64.encodeBase64(fileContent);
}
return encoded;
}
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
at org.apache.commons.codec.binary.BaseNCodec.encode(BaseNCodec.java:342)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:657)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:622)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:604)
I'm making a small app for a mobile device.
You cannot just load the whole file into memory, like here:
byte fileContent[] = new byte[(int) file.length()];
fin.read(fileContent);
Instead, load the file chunk by chunk and encode it in parts. Base64 is a simple encoding: it is enough to load 3 bytes and encode them at a time (this will produce 4 bytes after encoding). For performance reasons, consider loading multiples of 3 bytes, e.g. 3000 bytes, which should be just fine. Also consider buffering the input file.
An example:
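byte fileContent[] = new byte[3000];
try (FileInputStream fin = new FileInputStream(file)) {
    int count;
    while ((count = fin.read(fileContent)) >= 0) {
        // Encode only the bytes actually read (needs java.util.Arrays).
        // A short read before the end of the file would put padding mid-stream.
        Base64.encodeBase64(Arrays.copyOf(fileContent, count));
    }
}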
Note that you cannot simply append the results of Base64.encodeBase64() to the encoded byte array. Actually, it is not loading the file but encoding it to Base64 that causes the out-of-memory problem. This is understandable because the Base64 version is bigger (and you already have the file occupying a lot of memory).
Consider changing your method to:
public void encode(File file, OutputStream base64OutputStream)
and sending Base64-encoded data directly to the base64OutputStream rather than returning it.
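A minimal sketch of that idea follows. To keep it self-contained it uses java.util.Base64 (Java 8+) instead of Apache Commons Codec and takes an InputStream rather than a File; both are substitutions made for this example, and the class and helper names are hypothetical. Each chunk is filled completely before encoding, so only the final chunk can be shorter than a multiple of 3 and padding appears only at the very end.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Base64;

class ChunkedBase64 {
    // A multiple of 3, so no padding is produced between chunks.
    private static final int CHUNK = 3000;

    // Fills buf as far as possible; returns the byte count, 0 at end of stream.
    static int readChunk(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break;
            }
            total += n;
        }
        return total;
    }

    static void encode(InputStream in, OutputStream base64OutputStream) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        byte[] buf = new byte[CHUNK];
        int n;
        while ((n = readChunk(in, buf)) > 0) {
            // Only the last chunk can be shorter than CHUNK.
            byte[] chunk = (n == buf.length) ? buf : Arrays.copyOf(buf, n);
            base64OutputStream.write(encoder.encode(chunk));
        }
    }
}
```

At no point does this hold more than one 3000-byte chunk and its 4000-byte encoding in memory, regardless of the file size.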
UPDATE: Thanks to @StephenC I developed a much simpler version:
public void encode(File file, OutputStream base64OutputStream) throws IOException {
    InputStream is = new FileInputStream(file);
    OutputStream out = new Base64OutputStream(base64OutputStream);
    IOUtils.copy(is, out);
    is.close();
    out.close();
}
It uses Base64OutputStream that translates input to Base64 on-the-fly and IOUtils class from Apache Commons IO.
Note: you must close the FileInputStream and the Base64OutputStream explicitly (closing the latter writes the trailing = padding if required), but buffering is handled by IOUtils.copy().
Either the file is too big, or your heap is too small, or you've got a memory leak.
If this only happens with really big files, put something into your code to check the file size and reject files that are unreasonably big.
If this happens with small files, increase your heap size by using the -Xmx command line option when you launch the JVM. (If this is in a web container or some other framework, check the documentation on how to do it.)
If the problem recurs, especially with small files, the chances are that you've got a memory leak.
The other point that should be made is that your current approach entails holding two complete copies of the file in memory. You should be able to reduce the memory usage, though you'll typically need a stream-based Base64 encoder to do this. (It depends on which flavor of the base64 encoding you are using ...)
This page describes a stream-based Base64 encoder / decoder library, and includes links to some alternatives.
Well, do not do it for the whole file at once.
Base64 works on 3 bytes at a time, so you can read your file in batches of "multiple of 3" bytes, encode them and repeat until you finish the file:
// the base64 encoding - acceptable estimation of encoded size
StringBuilder sb = new StringBuilder((int) (file.length() / 3 * 4));
FileInputStream fin = null;
try {
    fin = new FileInputStream("some.file");
    // Max size of buffer
    int bSize = 3 * 512;
    // Buffer
    byte[] buf = new byte[bSize];
    // Actual size of buffer
    int len = 0;
    while ((len = fin.read(buf)) != -1) {
        // Encode only the bytes actually read (needs java.util.Arrays)
        byte[] encoded = Base64.encodeBase64(Arrays.copyOf(buf, len));
        // Although you might want to write the encoded bytes to another
        // stream, otherwise you'll run into the same problem again.
        sb.append(new String(encoded, StandardCharsets.US_ASCII));
    }
} finally {
    // Close the stream whether or not reading succeeded
    if (null != fin) {
        fin.close();
    }
}
String base64EncodedFile = sb.toString();
You are not reading the whole file, just the first few KB. The read method returns how many bytes were actually read. You should call read in a loop until it returns -1 to be sure that you have read everything.
The file is too big for both it and its base64 encoding to fit in memory. Either
process the file in smaller pieces or
increase the memory available to the JVM with the -Xmx switch, e.g.
java -Xmx1024M YourProgram
This is the best code for uploading a larger image:
bitmap=Bitmap.createScaledBitmap(bitmap, 100, 100, true);
ByteArrayOutputStream stream = new ByteArrayOutputStream();
bitmap.compress(Bitmap.CompressFormat.PNG, 100, stream); //compress to which format you want.
byte [] byte_arr = stream.toByteArray();
String image_str = Base64.encodeBytes(byte_arr);
Well, looks like your file is too large to keep the multiple copies necessary for an in-memory Base64 encoding in the available heap memory at the same time. Given that this is for a mobile device, it's probably not possible to increase the heap, so you have two options:
make the file smaller (much smaller)
Do it in a stream-based way so that you're reading from an InputStream one small part of the file at a time, encoding it and writing it to an OutputStream, without ever keeping the entire file in memory.
In the manifest, add the following to the application tag:
android:largeHeap="true"
It worked for me
Java 8 added Base64 methods, so Apache Commons is no longer needed to encode large files.
public static void encodeFileToBase64(String inputFile, String outputFile) {
try (OutputStream out = Base64.getEncoder().wrap(new FileOutputStream(outputFile))) {
Files.copy(Paths.get(inputFile), out);
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}

Reading and writing binary file in Java (seeing half of the file being corrupted)

I have some working code in python that I need to convert to Java.
I have read quite a few threads on this forum but could not find an answer. I am reading in a JPG image and converting it into a byte array. I then write this buffer to a different file. When I compare the written files from both the Java and Python code, the bytes at the end do not match. Please let me know if you have a suggestion. I need to use the byte array to pack the image into a message that needs to be sent over to a remote server.
Java code (Running on Android)
Reading the file:
File queryImg = new File(ImagePath);
int imageLen = (int)queryImg.length();
byte [] imgData = new byte[imageLen];
FileInputStream fis = new FileInputStream(queryImg);
fis.read(imgData);
Writing the file:
FileOutputStream f = new FileOutputStream(new File("/sdcard/output.raw"));
f.write(imgData);
f.flush();
f.close();
Thanks!
InputStream.read is not guaranteed to read any particular number of bytes and may read less than you asked it to. It returns the actual number read so you can have a loop that keeps track of progress:
public void pump(InputStream in, OutputStream out, int size) throws IOException {
byte[] buffer = new byte[4096]; // Or whatever constant you feel like using
int done = 0;
while (done < size) {
int read = in.read(buffer);
if (read == -1) {
throw new IOException("Something went horribly wrong");
}
out.write(buffer, 0, read);
done += read;
}
// Maybe put cleanup code in here if you like, e.g. in.close, out.flush, out.close
}
I believe Apache Commons IO has classes for doing this kind of stuff so you don't need to write it yourself.
Your file length might be more than an int can hold, and then you end up with the wrong array length, hence not reading the entire file into the buffer.
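Since the question needs the whole image as a byte array, both points can be combined in one helper: check that the length fits in an int, then keep reading until the array is full. This is only a sketch with a made-up class and method name, not code from either answer.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadWholeFile {
    static byte[] readAll(File file) throws IOException {
        long length = file.length();
        // Guard against lengths that do not fit into a Java array.
        if (length > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large for a single byte array: " + length);
        }
        byte[] data = new byte[(int) length];
        try (InputStream in = new FileInputStream(file)) {
            int done = 0;
            // read() may return fewer bytes than requested, so loop.
            while (done < data.length) {
                int read = in.read(data, done, data.length - done);
                if (read == -1) {
                    throw new IOException("Unexpected end of file after " + done + " bytes");
                }
                done += read;
            }
        }
        return data;
    }
}
```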
