Reading InputStream bytes and writing to ByteArrayOutputStream - java

I have a code block that reads a given number of bytes from an InputStream and returns them as a byte[] using a ByteArrayOutputStream. When I write that byte[] to a file, the resulting file on the filesystem seems to be broken. Can anyone help me find the problem in the code block below?
public byte[] readWrite(long bytes, InputStream in) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int maxReadBufferSize = 8 * 1024; //8KB
long numReads = bytes/maxReadBufferSize;
long numRemainingRead = bytes % maxReadBufferSize;
for(int i=0; i<numReads; i++) {
byte bufr[] = new byte[maxReadBufferSize];
int val = in.read(bufr, 0, bufr.length);
if(val != -1) {
bos.write(bufr);
}
}
if(numRemainingRead > 0) {
byte bufr[] = new byte[(int)numRemainingRead];
int val = in.read(bufr, 0, bufr.length);
if(val != -1) {
bos.write(bufr);
}
}
return bos.toByteArray();
}

My understanding of the problem statement
Read the requested number of bytes (the bytes parameter) from the given InputStream into a ByteArrayOutputStream.
Finally, return a byte array.
Key observations
A lot of work is done to make sure bytes are read in chunks of 8KB.
Also, the last remaining chunk of odd size is read separately.
A lot of work is also done to make sure we are reading from the correct offset.
My views
Unless we are reading a very large file (>10MB) I don't see a valid reason for reading in chunks of 8KB.
Let Java libraries do all the hard work of maintaining offset and making sure we don't read outside limits.
E.g. we don't have to pass a stream offset: each call to inputStream.read(b) continues where the previous one stopped and returns how many bytes it actually placed in b (at most b.length, possibly fewer). Similarly, we can simply write to the outputStream, passing that count.
Code
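public byte[] readWrite(long bytes, InputStream in) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] buffer = new byte[(int)bytes];
    int total = 0;
    int n;
    // read() may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends (-1)
    while (total < buffer.length
            && (n = in.read(buffer, total, buffer.length - total)) != -1) {
        total += n;
    }
    bos.write(buffer, 0, total); // write only what was actually read
    return bos.toByteArray();
}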
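If the 8 KB chunking from the question is kept, the fix is the same idea: use the count that read() returns instead of assuming the buffer was filled. The following is a minimal sketch, not the original poster's code; the class name BoundedRead and the trickle helper (which simulates a stream that delivers only a few bytes per call) are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedRead {

    /** Reads up to {@code bytes} bytes from {@code in}, stopping early at end of stream. */
    static byte[] readUpTo(long bytes, InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buffer = new byte[8 * 1024];
        long remaining = bytes;
        while (remaining > 0) {
            int want = (int) Math.min(buffer.length, remaining);
            int got = in.read(buffer, 0, want);
            if (got == -1) {
                break; // end of stream before the requested count was reached
            }
            bos.write(buffer, 0, got); // write only the bytes actually read
            remaining -= got;
        }
        return bos.toByteArray();
    }

    /** Hypothetical helper: hands out at most 3 bytes per read call, like a slow socket might. */
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }
}
```

With a stream like trickle, the code from the question would write a full 8 KB buffer for every 3 bytes received, which is one way the output file ends up corrupted.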

Related

Inflate output to file performance improvement

I am using code similar to the snippet below. Looking at different implementations, it seems that most people do the same thing, copying byte by byte. Is there a faster way to inflate data from a file and write it back out to a file?
public static String unzipString(InputStream in) {
try {
int length = (int) in.readUBits( 16 );
// Add extra byte to array when Inflater is set to true
byte[] data = in.read( length );
ByteArrayInputStream bin = new ByteArrayInputStream(data);
InflaterInputStream in = new InflaterInputStream(bin);
FileOutputStream bout = new FileOutputStream(this.file);
int b;
while ((b = in.read()) != -1) {
bout.write(b);
}
bout.close();
} catch (IOException io) {
return null;
}
}
Copying one byte at a time is always going to be a very slow way to process a file. I suggest you use a buffer of, say, 8 KB instead.
try (FileOutputStream fout = new FileOutputStream(this.file)) {
byte[] bytes = new byte[8192];
for (int len; (len = in.read(bytes)) != -1;)
fout.write(bytes, 0, len);
}
BTW, to make it faster you could avoid copying the data into a byte[] in the first place: wrap the original input in an InputStream that reads exactly length bytes, and hand that wrapper to the InflaterInputStream.
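One way the length-limited wrapper could look is sketched below, assuming the compressed data is in zlib format (the default for InflaterInputStream) and that its length is already known. BoundedInflate and LimitedInputStream are hypothetical names, not classes from the JDK or from the question; the sketch only limits read(), not skip() or available().

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

final class BoundedInflate {

    /** Exposes at most {@code limit} bytes of the wrapped stream, then reports end of stream. */
    static final class LimitedInputStream extends FilterInputStream {
        private long remaining;

        LimitedInputStream(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) return -1;
            int b = super.read();
            if (b != -1) remaining--;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (remaining <= 0) return -1;
            int n = super.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) remaining -= n;
            return n;
        }
    }

    /** Inflates {@code length} compressed bytes from {@code in} into {@code out} using an 8 KB buffer. */
    static void inflate(InputStream in, long length, OutputStream out) throws IOException {
        Inflater inflater = new Inflater();
        try {
            InputStream inflated =
                    new InflaterInputStream(new LimitedInputStream(in, length), inflater);
            byte[] buffer = new byte[8192];
            for (int len; (len = inflated.read(buffer)) != -1; ) {
                out.write(buffer, 0, len);
            }
        } finally {
            inflater.end(); // free native memory without closing the caller's stream
        }
    }
}
```

Because the wrapper stops after length bytes, whatever follows the compressed block in the source stream is left unread for the caller.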

IllegalArgumentException using Java8 Base64 decoder

I wanted to use Base64.java to encode and decode files. Encoder.wrap(OutputStream) and Decoder.wrap(InputStream) worked but ran slowly, so I used the following code.
public static void decodeFile(String inputFileName,
String outputFileName)
throws FileNotFoundException, IOException {
Base64.Decoder decoder = Base64.getDecoder();
InputStream in = new FileInputStream(inputFileName);
OutputStream out = new FileOutputStream(outputFileName);
byte[] inBuff = new byte[BUFF_SIZE]; //final int BUFF_SIZE = 1024;
byte[] outBuff = null;
while (in.read(inBuff) > 0) {
outBuff = decoder.decode(inBuff);
out.write(outBuff);
}
out.flush();
out.close();
in.close();
}
However, it always throws
Exception in thread "AWT-EventQueue-0" java.lang.IllegalArgumentException: Input byte array has wrong 4-byte ending unit
at java.util.Base64$Decoder.decode0(Base64.java:704)
at java.util.Base64$Decoder.decode(Base64.java:526)
at Base64Coder.JavaBase64FileCoder.decodeFile(JavaBase64FileCoder.java:69)
...
After I changed final int BUFF_SIZE = 1024; to final int BUFF_SIZE = 3*1024;, the code worked. Since BUFF_SIZE is also used to encode the file, I believe something was wrong with the encoded file (1024 % 3 = 1, which means padding was added in the middle of the file).
Also, as @Jon Skeet and @Tagir Valeev mentioned, I should not ignore the return value of InputStream.read(). So I modified the code as below.
(However, I have to mention that this code does run much faster than using wrap(). I noticed the speed difference because I had written and intensively used Base64.encodeFile()/decodeFile() long before JDK 8 was released. Now my buffered JDK 8 code runs as fast as my original code, so I do not know what is going on with wrap()...)
public static void decodeFile(String inputFileName,
String outputFileName)
throws FileNotFoundException, IOException
{
Base64.Decoder decoder = Base64.getDecoder();
InputStream in = new FileInputStream(inputFileName);
OutputStream out = new FileOutputStream(outputFileName);
byte[] inBuff = new byte[BUFF_SIZE];
byte[] outBuff = null;
int bytesRead = 0;
while (true)
{
bytesRead = in.read(inBuff);
if (bytesRead == BUFF_SIZE)
{
outBuff = decoder.decode(inBuff);
}
else if (bytesRead > 0)
{
byte[] tempBuff = new byte[bytesRead];
System.arraycopy(inBuff, 0, tempBuff, 0, bytesRead);
outBuff = decoder.decode(tempBuff);
}
else
{
out.flush();
out.close();
in.close();
return;
}
out.write(outBuff);
}
}
Special thanks to @Jon Skeet and @Tagir Valeev.
I strongly suspect that the problem is that you're ignoring the return value from InputStream.read, other than to check for the end of the stream. So this:
while (in.read(inBuff) > 0) {
// This always decodes the *complete* buffer
outBuff = decoder.decode(inBuff);
out.write(outBuff);
}
should be
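int bytesRead;
while ((bytesRead = in.read(inBuff)) > 0) {
    // Base64.Decoder has no decode(byte[], int, int) overload, so copy the
    // bytes actually read (java.util.Arrays) and decode only those
    outBuff = decoder.decode(Arrays.copyOf(inBuff, bytesRead));
    out.write(outBuff);
}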
I wouldn't expect this to be any faster than using wrap though.
Try Base64.getDecoder().wrap(new BufferedInputStream(new FileInputStream(inputFileName))). With buffering it should be at least as fast as your hand-written version.
As for why your code doesn't work: the last chunk is likely to be shorter than 1024 bytes, but you decode the whole byte[] array anyway. See @JonSkeet's answer for details.
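The buffered wrap() approach could look roughly like the sketch below. It assumes the input file holds plain Base64 without line breaks; the class name Base64StreamDecode and the use of java.nio.file.Path parameters are choices made for this example, not part of the original code.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class Base64StreamDecode {

    /** Decodes a Base64 file to a binary file through a buffered decoding stream. */
    static void decodeFile(Path input, Path output) throws IOException {
        try (InputStream in = Base64.getDecoder()
                     .wrap(new BufferedInputStream(Files.newInputStream(input)));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(output))) {
            byte[] buffer = new byte[8192];
            int n;
            // The decoding stream handles 4-character units itself, so the
            // buffer size here does not need to be a multiple of anything.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }
}
```

Since the decoder stream keeps track of partial units across reads, this version avoids the chunk-boundary problem entirely.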
Well, I changed
"final int BUFF_SIZE = 1024;"
into
"final int BUFF_SIZE = 1024 * 3;"
It worked!
So I guess there is probably something wrong with the padding: when encoding the file, 1024 % 3 = 1, so every 1024-byte chunk gets padded, and that padding in the middle of the file may cause problems when decoding.
You should record the number of bytes you have read. Besides this, make sure your buffer size is divisible by 3 when encoding, because in Base64 every 3 input bytes produce 4 output characters (64 is 2^6, and 3*8 equals 4*6). That way you avoid padding problems: no chunk except possibly the last one ends in "=".
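The alignment rule can be sketched as follows: encode in chunks whose size is a multiple of 3, decode in chunks whose size is a multiple of 4, and fill each chunk completely before converting it. This is an illustration under the assumption that the Base64 text contains no line breaks; Base64Chunks and its fill helper are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Base64;

final class Base64Chunks {

    /** Fills buf as far as the stream allows; returns the byte count (0 at end of stream). */
    static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) break;
            total += n;
        }
        return total;
    }

    static void encode(InputStream in, OutputStream out) throws IOException {
        Base64.Encoder encoder = Base64.getEncoder();
        byte[] buf = new byte[3 * 1024]; // multiple of 3: only the last chunk can be padded
        int n;
        while ((n = fill(in, buf)) > 0) {
            out.write(encoder.encode(Arrays.copyOf(buf, n)));
        }
    }

    static void decode(InputStream in, OutputStream out) throws IOException {
        Base64.Decoder decoder = Base64.getDecoder();
        byte[] buf = new byte[4 * 1024]; // multiple of 4: each chunk holds whole 4-character units
        int n;
        while ((n = fill(in, buf)) > 0) {
            out.write(decoder.decode(Arrays.copyOf(buf, n)));
        }
    }
}
```

Filling each chunk completely matters as much as the size: a short read in the middle of the stream would otherwise split a 4-character unit across two decode calls.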

Trim Padding From ByteArrayOutputStream

I'm working with Amazon S3 and would like to upload an InputStream (which requires counting the number of bytes I'm sending).
public static boolean uploadDataTo(String bucketName, String key, String fileName, InputStream stream) {
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[1];
try {
while (stream.read(buffer) != -1) { // copy from stream to buffer
out.write(buffer); // copy from buffer to byte array
}
} catch (Exception e) {
UtilityFunctionsObject.writeLogException(null, e);
}
byte[] result = out.toByteArray(); // we needed all that just for length
int bytes = result.length;
IO.close(out);
InputStream uploadStream = new ByteArrayInputStream(result);
....
}
I was told that copying one byte at a time is highly inefficient (obviously so for large files). I can't enlarge the buffer, because then the unused part of the buffer ends up as padding in the ByteArrayOutputStream, and I can't strip that out. I could strip it from result, but how can I do that safely? If I use an 8 KB buffer, can I just strip the trailing bytes where buffer[i] == 0? Or is there a better way to do this? Thanks!
Using Java 7 on Windows 7 x64.
You can do something like this:
int read = 0;
while ((read = stream.read(buffer)) != -1) {
out.write(buffer, 0, read);
}
stream.read(buffer) returns the number of bytes that were read into buffer. You can pass that value as the len parameter of out.write(), which ensures that you write only the bytes you actually read from the stream.
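Put together as a helper, this might look like the sketch below (CountingCopy is a hypothetical name). Because only the bytes reported by read() are written, there is no padding to strip afterwards, and data that legitimately ends in zero bytes stays intact, which stripping trailing zeros would not guarantee.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class CountingCopy {

    /** Copies the whole stream into memory using an 8 KB buffer, without adding padding. */
    static byte[] toByteArray(InputStream stream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8 * 1024];
        int read;
        while ((read = stream.read(buffer)) != -1) {
            out.write(buffer, 0, read); // only the bytes that were actually read
        }
        return out.toByteArray();
    }
}
```

The length needed for the upload is then simply the length of the returned array, and a ByteArrayInputStream over that array can serve as the upload stream.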
Use IOUtils from Apache Commons IO (formerly Jakarta Commons) to copy from the input stream to the byte array stream in a single step. It uses an efficient buffer and does not write any excess bytes.
If you want efficiency you could process the file as you read it. I would replace uploadStream with stream and remove the rest of the code.
If you need some buffering you can do this
InputStream uploadStream = new BufferedInputStream(stream);
the default buffer size is 8 KB.
If the data comes from a file and you only need its length, use File.length():
long length = new File(fileName).length();

Java InputStream reading problem

I have a Java class, where I'm reading data in via an InputStream
byte[] b = null;
try {
b = new byte[in.available()];
in.read(b);
} catch (IOException e) {
e.printStackTrace();
}
It works perfectly when I run my app from the IDE (Eclipse).
But when I export my project and it's packed in a JAR, the read command doesn't read all the data. How could I fix it?
This problem mostly occurs when the InputStream is a File (~10kb).
Thanks!
Usually I prefer using a fixed size buffer when reading from input stream. As evilone pointed out, using available() as buffer size might not be a good idea because, say, if you are reading a remote resource, then you might not know the available bytes in advance. You can read the javadoc of InputStream to get more insight.
Here is the code snippet I usually use for reading input stream:
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = 0;
while ((bytesRead = in.read(buffer)) >= 0){
for (int i = 0; i < bytesRead; i++){
//Do whatever you need with the bytes here
}
}
The version of read() I'm using here fills the given buffer as far as it can and returns the number of bytes actually read. This means the buffer may contain trailing garbage, so it is very important to use only the first bytesRead bytes.
Note the condition (bytesRead = in.read(buffer)) >= 0. According to the InputStream documentation, read(byte[]) blocks until at least one byte is available and returns 0 only when the buffer has length zero, but not every implementation honors that. For local files I have never seen a zero-byte read; when reading remote resources, however, I have seen read() return 0 repeatedly, which turns the code above into an infinite loop. I solved that by counting consecutive zero-byte reads and throwing an exception once the counter exceeds a threshold. You may not encounter this problem, but keep it in mind :)
I would avoid creating a new byte array for each read, for performance reasons.
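The zero-byte-read guard described above could be sketched like this. The threshold of 100 and the names GuardedRead and MAX_EMPTY_READS are assumptions made for the example; the answer does not say which values were actually used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class GuardedRead {

    /** Hypothetical threshold: how many consecutive zero-byte reads are tolerated. */
    static final int MAX_EMPTY_READS = 100;

    /** Reads the whole stream, giving up if read() keeps returning 0 without progress. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // one buffer, reused for every read
        int emptyReads = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) >= 0) {
            if (bytesRead == 0) {
                if (++emptyReads > MAX_EMPTY_READS) {
                    throw new IOException("stream returned 0 bytes "
                            + emptyReads + " times in a row");
                }
                continue;
            }
            emptyReads = 0; // progress was made, so reset the counter
            out.write(buffer, 0, bytesRead);
        }
        return out.toByteArray();
    }
}
```

A production version might also pause briefly between empty reads rather than spinning; that is left out here to keep the sketch short.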
read() will return -1 when the InputStream is depleted. There is also a version of read which takes an array, this allows you to do chunked reads. It returns the number of bytes actually read or -1 when at the end of the InputStream. Combine this with a dynamic buffer such as ByteArrayOutputStream to get the following:
InputStream in = ...
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
while ( -1 != ( read = in.read( input ) ) ) {
buffer.write( input, 0, read );
}
input = buffer.toByteArray();
This cuts down a lot on the number of methods you have to invoke and allows the ByteArrayOutputStream to grow its internal buffer faster.
File file = new File("/path/to/file");
// IOUtils is from Apache Commons IO; try-with-resources closes the stream
try (InputStream is = new FileInputStream(file)) {
    byte[] bytes = IOUtils.toByteArray(is);
    System.out.println("Byte array size: " + bytes.length);
} catch (IOException e) {
    e.printStackTrace();
}
Below is a code snippet that reads an image file (*.png, *.jpeg, *.gif, ...) and writes it to a BufferedOutputStream wrapping the output stream of the HttpServletResponse.
BufferedInputStream inputStream = bo.getBufferedInputStream(imageFile);
try {
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int bytesRead = 0;
byte[] input = new byte[DefaultBufferSizeIndicator.getDefaultBufferSize()];
while (-1 != (bytesRead = inputStream.read(input))) {
buffer.write(input, 0, bytesRead);
}
input = buffer.toByteArray();
response.reset();
response.setBufferSize(DefaultBufferSizeIndicator.getDefaultBufferSize());
response.setContentType(mimeType);
// Here's the secret. Content-Length should equal the number of bytes read.
response.setHeader("Content-Length", String.valueOf(buffer.size()));
response.setHeader("Content-Disposition", "inline; filename=\"" + imageFile.getName() + "\"");
BufferedOutputStream outputStream = new BufferedOutputStream(response.getOutputStream(), DefaultBufferSizeIndicator.getDefaultBufferSize());
try {
outputStream.write(input, 0, buffer.size());
} finally {
ImageBO.close(outputStream);
}
} finally {
ImageBO.close(inputStream);
}
Hope this helps.

read a file byte by byte then perform some operation every n bytes

I would like to know how can I read a file byte by byte then perform some operation every n bytes.
for example:
Say I have a file of size = 50 bytes, I want to divide it into blocks each of n bytes. Then each block is sent to a function for some operations to be done on those bytes. The blocks are to be created during the read process and sent to the function when the block reaches n bytes so that I don`t use much memory for storing all blocks.
I want the output of the function to be written/appended on a new file.
This is what I have so far for reading, but I don't know if it is right:
fc = new JFileChooser();
File f = fc.getSelectedFile();
FileInputStream in = new FileInputStream(f);
byte[] b = new byte[16];
in.read(b);
I haven't done anything yet for the write process.
You're on the right lines. Consider wrapping your FileInputStream in a BufferedInputStream, which improves I/O efficiency by reading the file in chunks.
The next step is to check the number of bytes read (returned by your call to read) and to hand-off the array to the processing function. Obviously you'll need to pass the number of bytes read to this method too in case the array was only partially populated.
So far your code looks OK. For reading binary files (as opposed to text files) you should indeed use FileInputStream (for reading text files, you should use a Reader, such as FileReader).
Note that you should check the return value of in.read(b), because it may read fewer than 16 bytes, for example when fewer than 16 bytes are left at the end of the file.
Of course, you should add a loop to the program that keeps reading blocks of bytes until you reach the end of the file.
To write data to a binary file, use FileOutputStream. That class has a constructor that you can pass a flag to indicate that you want to append to an existing file:
FileOutputStream out = new FileOutputStream("output.bin", true);
Also, don't forget to call close() on the FileInputStream and FileOutputStream when you are done.
See the Java API documentation, especially the classes in the java.io package.
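The pieces described above (buffered reading, checking the count returned by read, looping to the end of the file, and appending to an output file) could be combined as in this sketch. BlockProcessor, process and processFile are hypothetical names, and the byte-inverting process method only stands in for whatever operation is really needed.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class BlockProcessor {

    /** Placeholder operation on one block: inverts each of the first {@code length} bytes. */
    static byte[] process(byte[] block, int length) {
        byte[] result = new byte[length];
        for (int i = 0; i < length; i++) {
            result[i] = (byte) ~block[i];
        }
        return result;
    }

    /** Reads source in blocks of n bytes, appends each processed block to target, returns the block count. */
    static int processFile(Path source, Path target, int n) throws IOException {
        int blocks = 0;
        byte[] block = new byte[n]; // reused for every block, so memory use stays at n bytes
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source));
             OutputStream out = new BufferedOutputStream(
                     new FileOutputStream(target.toFile(), true))) { // true = append
            while (true) {
                int filled = 0;
                while (filled < n) { // a single read may return less than a full block
                    int r = in.read(block, filled, n - filled);
                    if (r == -1) break;
                    filled += r;
                }
                if (filled == 0) break; // nothing left to process
                out.write(process(block, filled));
                blocks++;
                if (filled < n) break; // short block means end of file
            }
        }
        return blocks;
    }
}
```

For a 50-byte file and n = 16, this processes three full blocks and one final block of 2 bytes.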
I believe that this will work:
final int blockSize = // some calculation
byte[] block = new byte[blockSize];
InputStream is = new FileInputStream(f);
try {
int ret = -1;
do {
int bytesRead = 0;
while (bytesRead < blockSize) {
ret = is.read(block, bytesRead, blockSize - bytesRead);
if (ret < 0)
break; // no more data
bytesRead += ret;
}
myFunction(block, bytesRead);
} while (0 <= ret);
}
finally {
is.close();
}
This code calls myFunction with blockSize bytes on every invocation except the last, which receives whatever is left at the end of the file (possibly zero bytes).
It's a start.
You should check what read() returns. It can read fewer bytes than the size of the array, and also indicate that the end of the file is reached.
Obviously, you need to read() in a loop...
It might be a good idea to reuse the array, but that requires that the part that reads the array copies what it needs, rather than just keeping a reference to the array.
I think this is what you might need:
void readFile(String path, int n) {
    // try-with-resources closes the stream even if an exception is thrown
    try (FileInputStream fis = new FileInputStream(new File(path))) {
        byte[] array = new byte[n];
        int ret;
        // test the result before using it, so doSomething never sees -1
        while ((ret = fis.read(array)) != -1) {
            doSomething(array, ret);
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
