Java - Read file by chunks?

I know how to read a file byte by byte, but I cannot find an example of how to read it in chunks of bytes. I have a byte array, and I want to read the file 512 bytes at a time and send each chunk over a socket.
I tried taking the total size of the file and subtracting 512 bytes repeatedly until I got a chunk smaller than 512 bytes, which signals EOF and the end of the transfer.
I am trying to implement TFTP, where data is sent in 512-byte chunks.
Anyhow, I would be thankful for an example.

You ... read 512 bytes at a time.
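// Note: a Reader decodes bytes into characters, so this suits text files only.
// For binary data (such as TFTP payloads) use an InputStream and a byte[] instead.
char[] myBuffer = new char[512];
int charsRead = 0;
BufferedReader in = new BufferedReader(new FileReader("foo.txt"));
while ((charsRead = in.read(myBuffer, 0, 512)) != -1)
{
...
}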

You can use the appropriate read() method of the input stream; for example, FileInputStream supports read(byte[]) to read a chunk of bytes. Note that read(byte[]) may return fewer bytes than the buffer holds, so always use its return value. Wrapping the input stream in a BufferedInputStream (its constructor takes a buffer size argument) makes short reads less likely but does not guarantee full 512-byte blocks.
Something like:
byte[] buffer = new byte[512];
FileInputStream in = new FileInputStream("some_file");
int rc = in.read(buffer);
while(rc != -1)
{
// rc should contain the number of bytes read in this operation.
// do stuff...
// next read
rc = in.read(buffer);
}
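For the TFTP case in the question, here is a minimal sketch of a reader that keeps filling a 512-byte buffer until it is full or the stream ends, so that only the final block is short. The class and method names (ChunkReader, readBlock, readBlocks) are made up for illustration, and collecting the blocks in a list stands in for sending them over the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkReader {
    static final int BLOCK_SIZE = 512; // block size from the question

    /** Fills buf as far as possible; returns fewer than buf.length bytes only at end of stream. */
    static int readBlock(InputStream in, byte[] buf) throws IOException {
        int filled = 0;
        while (filled < buf.length) {
            int n = in.read(buf, filled, buf.length - filled);
            if (n == -1) {
                break; // end of stream
            }
            filled += n;
        }
        return filled;
    }

    /** Splits the stream into blocks; the last block is always shorter than BLOCK_SIZE (possibly empty). */
    static List<byte[]> readBlocks(InputStream in) throws IOException {
        List<byte[]> blocks = new ArrayList<>();
        byte[] buf = new byte[BLOCK_SIZE];
        int n;
        do {
            n = readBlock(in, buf);
            // Assumption: a real sender would write buf[0..n) to the socket here instead.
            blocks.add(Arrays.copyOf(buf, n));
        } while (n == BLOCK_SIZE);
        return blocks;
    }

    public static void main(String[] args) throws IOException {
        // In-memory data stands in for a FileInputStream.
        for (byte[] block : readBlocks(new ByteArrayInputStream(new byte[1300]))) {
            System.out.println(block.length);
        }
    }
}
```

With 1300 bytes of input this yields blocks of 512, 512 and 276 bytes. When the input length is an exact multiple of 512, the sketch appends an empty final block, which is how the receiver learns the transfer is over.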

Using InputStream you can read in an array of given size and limit the reading to this size.
Read here: http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html

Related

Reading InputStream bytes and writing to ByteArrayOutputStream

I have a code block that reads a specified number of bytes from an InputStream and returns a byte[] using ByteArrayOutputStream. When I write that byte[] to a file, the resulting file on the filesystem seems broken. Can anyone help me find the problem in the code block below?
public byte[] readWrite(long bytes, InputStream in) throws IOException {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int maxReadBufferSize = 8 * 1024; //8KB
long numReads = bytes/maxReadBufferSize;
long numRemainingRead = bytes % maxReadBufferSize;
for(int i=0; i<numReads; i++) {
byte bufr[] = new byte[maxReadBufferSize];
int val = in.read(bufr, 0, bufr.length);
if(val != -1) {
bos.write(bufr);
}
}
if(numRemainingRead > 0) {
byte bufr[] = new byte[(int)numRemainingRead];
int val = in.read(bufr, 0, bufr.length);
if(val != -1) {
bos.write(bufr);
}
}
return bos.toByteArray();
}
My understanding of the problem statement
Read the requested number of bytes (the bytes argument) from the given InputStream into a ByteArrayOutputStream.
Finally, return a byte array.
Key observations
A lot of work is done to make sure bytes are read in chunks of 8KB.
Also, the last remaining chunk of odd size is read separately.
A lot of work is also done to make sure we are reading from the correct offset.
My views
Unless we are reading a very large file (>10MB) I don't see a valid reason for reading in chunks of 8KB.
Let Java libraries do all the hard work of maintaining offset and making sure we don't read outside limits.
E.g., we don't have to supply an offset: simply call inputStream.read(b) over and over, and each call reads the next bytes of the stream into b (up to b.length; the return value says how many were actually read). Similarly, we can simply write to the outputStream.
Code
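public byte[] readWrite(long bytes, InputStream in) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] buffer = new byte[(int) bytes];
    // A single in.read(buffer) may fill only part of the buffer; readFully
    // (java.io.DataInputStream) keeps reading until it is full and throws
    // EOFException if the stream ends early.
    new DataInputStream(in).readFully(buffer);
    bos.write(buffer);
    return bos.toByteArray();
}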
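One likely cause of the broken file in the question is that bos.write(bufr) writes the whole buffer even when read() returned fewer bytes than requested. Here is a minimal sketch that keeps the 8KB buffer but tracks how many bytes remain and writes only what was actually read. The names BoundedRead and readUpTo are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedRead {
    /** Reads up to 'bytes' bytes from in; returns fewer only if the stream ends first. */
    static byte[] readUpTo(long bytes, InputStream in) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[8 * 1024]; // one reusable 8KB buffer
        long remaining = bytes;
        while (remaining > 0) {
            int want = (int) Math.min(buf.length, remaining);
            int val = in.read(buf, 0, want);
            if (val == -1) {
                break; // stream ended before 'bytes' bytes were available
            }
            bos.write(buf, 0, val); // write only the bytes actually read
            remaining -= val;
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] result = readUpTo(10_000, new ByteArrayInputStream(new byte[20_000]));
        System.out.println(result.length);
    }
}
```

Because the loop is driven by the remaining count rather than a precomputed number of reads, short reads simply cause more iterations instead of leaving zero-filled gaps in the output.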

Trim Padding From ByteArrayOutputStream

I'm working with Amazon S3 and would like to upload an InputStream (which requires counting the number of bytes I'm sending).
public static boolean uploadDataTo(String bucketName, String key, String fileName, InputStream stream) {
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[1];
try {
while (stream.read(buffer) != -1) { // copy from stream to buffer
out.write(buffer); // copy from buffer to byte array
}
} catch (Exception e) {
UtilityFunctionsObject.writeLogException(null, e);
}
byte[] result = out.toByteArray(); // we needed all that just for length
int bytes = result.length;
IO.close(out);
InputStream uploadStream = new ByteArrayInputStream(result);
....
}
I was told that copying a byte at a time is highly inefficient (obviously so for large files). I can't make the buffer larger because that adds padding to the ByteArrayOutputStream, which I can't strip out. I can strip it out of result, but how can I do that safely? If I use an 8KB buffer, can I just strip the trailing bytes where buffer[i] == 0? Or is there a better way to do this? Thanks!
Using Java 7 on Windows 7 x64.
You can do something like this:
int read = 0;
while ((read = stream.read(buffer)) != -1) {
out.write(buffer, 0, read);
}
stream.read(buffer) returns the number of bytes that have been written into buffer. You can pass this value as the len parameter of out.write(). This way you write only the bytes you have actually read from the stream, so no padding is added.
Use IOUtils from Apache Commons IO (formerly Jakarta Commons) to copy from the input stream to the byte array stream in a single step. It uses an efficient buffer and does not write any excess bytes.
If you want efficiency you could process the file as you read it. I would replace uploadStream with stream and remove the rest of the code.
If you need some buffering, you can do this:
InputStream uploadStream = new BufferedInputStream(stream);
The default buffer size is 8 KB.
If you want the length, use File.length():
long length = new File(fileName).length();

Read Socket stream data and read first 2 bytes, find rest of message length

How can I read the first 2 bytes from an input stream, convert them into the actual int length value, and then read the rest of the message into a byte array?
The array for the rest of the data should be allocated after reading the first 2 bytes from the stream. Does anyone know an efficient way to do this?
Use a DataInputStream. Use the readUnsignedShort() method to return the length word, then the readFully() method to read the following data.
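A minimal sketch of that approach, assuming the 2-byte length is sent big-endian (which is what DataInputStream expects). The class name ShortFramed and the method readMessage are made up for illustration, and an in-memory stream stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ShortFramed {
    /** Reads one message: a 2-byte unsigned length followed by that many payload bytes. */
    static byte[] readMessage(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int len = dis.readUnsignedShort(); // 0..65535, assumed big-endian
        byte[] data = new byte[len];       // allocated only once the length is known
        dis.readFully(data);               // loops until full; EOFException if the stream ends early
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Build a frame in memory in place of a real socket.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        byte[] payload = "hello".getBytes(StandardCharsets.US_ASCII);
        out.writeShort(payload.length);
        out.write(payload);

        byte[] received = readMessage(new ByteArrayInputStream(bytes.toByteArray()));
        System.out.println(new String(received, StandardCharsets.US_ASCII));
    }
}
```

If the sender uses little-endian lengths, the two bytes would have to be read individually and combined by hand.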
This creates a string from a byte array. Adapt as needed.
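InputStream in;
try {
    in = socket.getInputStream();
    DataInputStream dis = new DataInputStream(in);
    // The question uses a 2-byte length prefix, so read an unsigned short
    // (readInt() would consume 4 bytes).
    int len = dis.readUnsignedShort();
    byte[] data = new byte[len];
    if (len > 0) {
        dis.readFully(data);
    }
    String sReturn = new String(data);
} catch (IOException e) {
    // The original snippet had no catch block; handle or rethrow as appropriate.
    e.printStackTrace();
}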

Java TCP Socket receiving bytes with specified length

I am trying to first read 4 bytes(int) specifying the size of the message and then read the remaining bytes based on the byte count. I am using the following code to accomplish this:
DataInputStream dis = new DataInputStream(
mClientSocket.getInputStream());
// read the message length
int len = dis.readInt();
Log.i(TAG, "Reading bytes of length:" + len);
// read the message data
byte[] data = new byte[len];
if (len > 0) {
dis.readFully(data);
} else {
return "";
}
return new String(data);
Is there a better/efficient way of doing this?
From JavaDocs of readUTF:
First, two bytes are read and used to construct an unsigned 16-bit
integer in exactly the manner of the readUnsignedShort method. This
integer value is called the UTF length and specifies the number of
additional bytes to be read. These bytes are then converted to
characters by considering them in groups. The length of each group is
computed from the value of the first byte of the group. The byte
following a group, if any, is the first byte of the next group.
The only problem with this is that your protocol seems to send 4 bytes for the payload length, whereas readUTF expects 2. Perhaps you can write a similar method that reads a 4-byte (32-bit) length prefix instead.
Also, I see that you are just doing new String(bytes), which works fine only as long as the encoding of the data matches the platform's default charset (see the String javadoc). It would be much safer to specify the encoding explicitly; e.g., if you know that the sender sends UTF-8, use new String(bytes, "UTF-8") instead.
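Putting those two points together, a sketch of a reader with a 4-byte length prefix and an explicit charset might look like this. It assumes the sender encodes the payload as UTF-8. The names IntFramed and readMessage and the 1 MiB cap MAX_MESSAGE_BYTES are hypothetical choices, not part of the question's protocol.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class IntFramed {
    // Hypothetical sanity limit so a corrupt length cannot trigger a huge allocation.
    static final int MAX_MESSAGE_BYTES = 1 << 20;

    /** Reads a 4-byte length, then that many bytes, and decodes them as UTF-8 (assumed encoding). */
    static String readMessage(DataInputStream dis) throws IOException {
        int len = dis.readInt();
        if (len < 0 || len > MAX_MESSAGE_BYTES) {
            throw new IOException("Unreasonable message length: " + len);
        }
        byte[] data = new byte[len];
        dis.readFully(data); // a zero-length array is fine here, no special case needed
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // An in-memory frame stands in for the socket.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);

        DataInputStream dis = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        System.out.println(readMessage(dis));
    }
}
```

StandardCharsets.UTF_8 (available since Java 7) avoids the checked UnsupportedEncodingException that the String-based charset name would require handling.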
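How about this (it works only if the sender writes the message with DataOutputStream.writeUTF, which uses a 2-byte length prefix):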
DataInputStream dis = new DataInputStream(new BufferedInputStream(
mClientSocket.getInputStream()));
return dis.readUTF();
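You can use read(byte[] b, int off, int len) like this, but note that it may return fewer than len bytes, so check the return value and keep reading until the array is full:
byte[] data = new byte[len];
int n = dis.read(data, 0, len); // n may be less than len, or -1 at end of stream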
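The difference between a single read(data, 0, len) call and readFully can be seen without a network. In this small sketch, a SequenceInputStream made of two 3-byte segments stands in for data that arrives in pieces. The class and method names are invented for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;

final class ShortReadDemo {
    /** Two 3-byte segments joined together, imitating data that arrives in two pieces. */
    static InputStream twoSegments() {
        return new SequenceInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3}),
                new ByteArrayInputStream(new byte[] {4, 5, 6}));
    }

    /** One read() call: returns how many bytes it delivered, which may be fewer than len. */
    static int singleRead(int len) throws IOException {
        DataInputStream dis = new DataInputStream(twoSegments());
        byte[] data = new byte[len];
        return dis.read(data, 0, len);
    }

    /** readFully keeps reading until all len bytes have arrived. */
    static byte[] fullRead(int len) throws IOException {
        DataInputStream dis = new DataInputStream(twoSegments());
        byte[] data = new byte[len];
        dis.readFully(data, 0, len);
        return data;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("single read delivered: " + singleRead(6));
        System.out.println("readFully delivered: " + Arrays.toString(fullRead(6)));
    }
}
```

Here the single read stops at the segment boundary and delivers only 3 of the 6 requested bytes, which is the same situation a socket produces when a message is split across packets.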

Java InputStream reading problem

I have a Java class, where I'm reading data in via an InputStream
byte[] b = null;
try {
b = new byte[in.available()];
in.read(b);
} catch (IOException e) {
e.printStackTrace();
}
It works perfectly when I run my app from the IDE (Eclipse).
But when I export my project and it's packed in a JAR, the read command doesn't read all the data. How could I fix it?
This problem mostly occurs when the InputStream is a File (~10kb).
Thanks!
Usually I prefer using a fixed size buffer when reading from input stream. As evilone pointed out, using available() as buffer size might not be a good idea because, say, if you are reading a remote resource, then you might not know the available bytes in advance. You can read the javadoc of InputStream to get more insight.
Here is the code snippet I usually use for reading input stream:
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = 0;
while ((bytesRead = in.read(buffer)) >= 0){
for (int i = 0; i < bytesRead; i++){
//Do whatever you need with the bytes here
}
}
The version of read() I'm using here fills the given buffer as much as possible and returns the number of bytes actually read. This means your buffer may contain trailing garbage data, so it is very important to use bytes only up to bytesRead.
Note the condition (bytesRead = in.read(buffer)) >= 0. The InputStream contract says read(byte[]) blocks until at least one byte is available (unless the buffer has length zero), but not every implementation honors this, so you may need to handle a 0-byte read as a special case. For local files I have never experienced this; however, when reading remote resources, I have actually seen read() return 0 bytes constantly, turning the above code into an infinite loop. I solved the infinite loop problem by counting the number of times I read 0 bytes and throwing an exception when the counter exceeded a threshold. You may not encounter this problem, but just keep it in mind :)
I would probably stay away from creating a new byte array for each read, for performance reasons.
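The zero-byte-read guard described in this answer could be sketched roughly as follows. The threshold MAX_EMPTY_READS and the names ZeroReadGuard and drain are arbitrary choices for illustration, and the sketch counts consecutive empty reads, which is one possible reading of the counter described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ZeroReadGuard {
    static final int MAX_EMPTY_READS = 100; // hypothetical threshold

    /** Consumes the stream and returns the byte count; gives up after too many consecutive 0-byte reads. */
    static long drain(InputStream in) throws IOException {
        byte[] buffer = new byte[4096]; // reused for every read
        long total = 0;
        int emptyReads = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            if (bytesRead == 0) {
                if (++emptyReads > MAX_EMPTY_READS) {
                    throw new IOException("Stream keeps returning 0 bytes; giving up");
                }
                continue;
            }
            emptyReads = 0; // progress was made, reset the counter
            // Process buffer[0..bytesRead) here; this sketch only counts the bytes.
            total += bytesRead;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(drain(new ByteArrayInputStream(new byte[10_000])));
    }
}
```

A real client would probably also sleep briefly or apply a timeout between empty reads rather than spinning.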
read() will return -1 when the InputStream is depleted. There is also a version of read which takes an array, this allows you to do chunked reads. It returns the number of bytes actually read or -1 when at the end of the InputStream. Combine this with a dynamic buffer such as ByteArrayOutputStream to get the following:
InputStream in = ...
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
while ( -1 != ( read = in.read( input ) ) ) {
buffer.write( input, 0, read );
}
input = buffer.toByteArray();
This cuts down a lot on the number of methods you have to invoke and allows the ByteArrayOutputStream to grow its internal buffer faster.
// IOUtils comes from Apache Commons IO (org.apache.commons.io.IOUtils).
File file = new File("/path/to/file");
try (InputStream is = new FileInputStream(file)) { // try-with-resources closes the stream
    byte[] bytes = IOUtils.toByteArray(is);
    System.out.println("Byte array size: " + bytes.length);
} catch (IOException e) {
    e.printStackTrace();
}
Below is a snippet of code that downloads a file (*.png, *.jpeg, *.gif, ...) and writes it to a BufferedOutputStream wrapping the output stream of the HttpServletResponse.
BufferedInputStream inputStream = bo.getBufferedInputStream(imageFile);
try {
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int bytesRead = 0;
byte[] input = new byte[DefaultBufferSizeIndicator.getDefaultBufferSize()];
while (-1 != (bytesRead = inputStream.read(input))) {
buffer.write(input, 0, bytesRead);
}
input = buffer.toByteArray();
response.reset();
response.setBufferSize(DefaultBufferSizeIndicator.getDefaultBufferSize());
response.setContentType(mimeType);
// Here's the secret. Content-Length should equal the number of bytes read.
response.setHeader("Content-Length", String.valueOf(buffer.size()));
response.setHeader("Content-Disposition", "inline; filename=\"" + imageFile.getName() + "\"");
BufferedOutputStream outputStream = new BufferedOutputStream(response.getOutputStream(), DefaultBufferSizeIndicator.getDefaultBufferSize());
try {
outputStream.write(input, 0, buffer.size());
} finally {
ImageBO.close(outputStream);
}
} finally {
ImageBO.close(inputStream);
}
Hope this helps.
