I want to read a file into a byte array. So, I am reading it using:
int len1 = (int)(new File(filename).length());
FileInputStream fis1 = new FileInputStream(filename);
byte buf1[] = new byte[len1];
fis1.read(buf1);
However, it is realy very slow. Can anyone inform me a very fast approach (possibly best one) to read a file into byte array. I can use java library also if needed.
Edit: Is there any benchmark which one is faster (including library approach).
It is not very slow, at least there is not way to make it faster. BUT it is wrong. If file is big enough the method read() will not return all bytes from fist call. This method returns number of bytes it managed to read as return value.
The right way is to call this method in loop:
public static void copy(InputStream input,
OutputStream output,
int bufferSize)
throws IOException {
byte[] buf = new byte[bufferSize];
int bytesRead = input.read(buf);
while (bytesRead != -1) {
output.write(buf, 0, bytesRead);
bytesRead = input.read(buf);
}
output.flush();
}
call this as following:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
copy(new FileInputStream(myfile), baos);
byte[] bytes = baos.toByteArray();
Something like this is implemented in a lot of packages, e.g. FileUtils.readFileToByteArray() mentioned by #Andrey Borisov (+1)
EDIT
I think that reason for slowness in your case is the fact that you create so huge array. Are you sure you really need it? Try to re-think your design. I believe that you do not have to read this file into array and can process data incrementally.
apache commons-io FileUtils.readFileToByteArray
Related
//Reading a image file from #drawable res folder and writing to a file on external sd card
//below one works no doubt but I want to imrpove it:
OutputStream os = new FileOutputStream(file); //File file.........
InputStream is =getResources().openRawResource(R.drawable.an_image);
byte[] b = new byte[is.available()];
is.read(b);
os.write(b);
is.close();
os.close();
In above code I am using basic io classes to read and write. My question is what can I do in order to able to use wrapper classes like say DataInputStream/ BufferedReaderd or PrintStream / BufferedWriter /PrintWriter.
As openRawResources(int id ) returns InputStream ;
to read a file from res I either need to typecast like this:
DataInputStream is = (DataInputStream) getResources().openRawResource(R.drawble.an_image));
or I can link the stream directly like this:
DataInputStream is = new DataInputStream(getResources().openRawResource(R.drawable.greenball));
and then I may do this to write it to a file on sd card:
PrintStream ps =new PrintStream (new FileOutputStream(file));
while(s=is.readLine()!=null){
ps.print(s);
}
So is that correct approach ? which one is better? Is there a better way?better practice..convention?
Thanks!!!
If openRawResource() is documented to return an InputStream then you cannot rely on that result to be any more specific kind of InputStream, and in particular, you cannot rely on it to be a DataInputStream. Casting does not change that; it just gives you the chance to experience interesting and exciting exceptions. If you want a DataInputStream wrapping the the result of openRawResource() then you must obtain it via the DataInputStream constructor. Similarly for any other wrapper stream.
HOWEVER, do note that DataInputStream likely is not the class you want. It is appropriate for reading back data that were originally written via a DataOutputStream, but it is inappropriate (or at least offers no advantages over any other InputStream) for reading general data.
Furthermore, your use of InputStream.available() is incorrect. That method returns the number of bytes that can currently be read from the stream without blocking, which has only a weak relationship with the total number of bytes that could be read from the stream before it is exhausted (if indeed it ever is).
Moreover, your code is also on shaky ground where it assumes that InputStream.read(byte[]) will read enough bytes to fill the array. It probably will, since that many bytes were reported available, but that's not guaranteed. To copy from one stream to another, you should instead use code along these lines:
private final static int BUFFER_SIZE = 2048;
void copyStream(InputStream in, OutputStream out) throws IOException {
byte[] buffer = new byte[BUFFER_SIZE];
int nread;
while ( (nread = in.read(buffer) != 0 ) do {
out.write(buffer, 0, nread);
}
}
I know how to get the inputstream for a given classpath resource, read from the inputstream until i reach the end, but it looks like a very common problem, and i wonder if there an API that I don't know, or a library that would make things as simple as
byte[] data = ResourceUtils.getResourceAsBytes("/assets/myAsset.bin")
or
byte[] data = StreamUtils.readStreamToEnd(myInputStream)
for example!
Java 9 native implementation:
byte[] data = this.getClass().getClassLoader().getResourceAsStream("/assets/myAsset.bin").readAllBytes();
Have a look at Google guava ByteStreams.toByteArray(INPUTSTREAM), this is might be what you want.
Although i agree with Andrew Thompson, here is a native implementation that works since Java 7 and uses the NIO-API:
byte[] data = Files.readAllBytes(Paths.get(this.getClass().getClassLoader().getResource("/assets/myAsset.bin").toURI()));
Take a look at Apache IOUtils - it has a bunch of methods to work with streams
I usually use the following two approaches to convert Resource into byte[] array.
1 - approach
What you need is to first call getInputStream() on Resource object, and then pass that to convertStreamToByteArray method like below....
InputStream stream = resource.getInputStream();
long size = resource.getFile().lenght();
byte[] byteArr = convertStreamToByteArray(stream, size);
public byte[] convertStreamToByteArray(InputStream stream, long size) throws IOException {
// check to ensure that file size is not larger than Integer.MAX_VALUE.
if (size > Integer.MAX_VALUE) {
return new byte[0];
}
byte[] buffer = new byte[(int)size];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int line = 0;
// read bytes from stream, and store them in buffer
while ((line = stream.read(buffer)) != -1) {
// Writes bytes from byte array (buffer) into output stream.
os.write(buffer, 0, line);
}
stream.close();
os.flush();
os.close();
return os.toByteArray();
}
2 - approach
As Konstantin V. Salikhov suggested, you could use org.apache.commons.io.IOUtils and call its IOUtils.toByteArray(stream) static method and pass to it InputStream object like this...
byte[] byteArr = IOUtils.toByteArray(stream);
Note - Just thought I'll mention this that under the hood toByteArray(...) checks to ensure that file size is not larger than Integer.MAX_VALUE, so you don't have to check for this.
Commonly Java methods will accept an InputStream. In that majority of cases, I would recommend passing the stream directly to the method of interest.
Many of those same methods will also accept an URL (e.g. obtained from getResource(String)). That can sometimes be better, since a variety of the methods will require a repositionable InputStream and there are times that the stream returned from getResourceAsStream(String) will not be repositionable.
what's the probably fastest way of reading relatively huge files with Java's I/O-methods? My current solution uses the BufferedInputStream saving to an byte-array with 1024 bytes allocated to it. Each buffer is than saved in an ArrayList for later use. The whole process is called via a separate thread (callable-interface).
Not very fast though.
ArrayList<byte[]> outputArr = new ArrayList<byte[]>();
try {
BufferedInputStream reader = new BufferedInputStream(new FileInputStream (dir+filename));
byte[] buffer = new byte[LIMIT]; // == 1024
int i = 0;
while (reader.available() != 0) {
reader.read(buffer);
i++;
if (i <= LIMIT){
outputArr.add(buffer);
i = 0;
buffer = null;
buffer = new byte[LIMIT];
}
else continue;
}
System.out.println("FileReader-Elements: "+outputArr.size()+" w. "+buffer.length+" byte each.");
I would use a memory mapped file which is fast enough to do in the same thread.
final FileChannel channel = new FileInputStream(fileName).getChannel();
MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
// when finished
channel.close();
This assumes the file is smaller than 2 GB and will take 10 milli-seconds or less.
Don't use available(): it's not reliable. And don't ignore the result of the read() method: it tells you how many bytes were actually read. And if you want to read everything in memory, use a ByteArrayOutputStream rather than using a List<byte[]>:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int read;
while ((read = reader.read(buffer)) >= 0) {
baos.write(buffer, 0, read);
}
byte[] everything = baos.toByteArray();
I think 1024 is a bit small as a buffer size. I would use a larger buffer (something like 16 KB or 32KB)
Note that Apache commons IO and Guava have utility methods that do this for you, and have been optimized already.
Have a look at Java NIO (Non-Blocking Input/Output) API. Also, this question might prove being useful.
I don't have much experience with IO, but I've heard that NIO is much more efficient way of handling large sets of data.
This question already has answers here:
Easy way to write contents of a Java InputStream to an OutputStream
(24 answers)
Closed 3 years ago.
FileInputStream in = new FileInputStream(myFile);
ByteArrayOutputStream out = new ByteArrayOutputStream();
Question: How can I read everything from in into out in a way which is not a hand-crafted loop with my own byte buffer?
Java 9 (and later) answer (docs):
in.transferTo(out);
Seems they finally realized that this functionality is so commonly needed that it’d better be built in. The method returns the number of bytes copied in case you need to know.
Write one method to do this, and call it from everywhere which needs the functionality. Guava already has code for this, in ByteStreams.copy. I'm sure just about any other library with "general" IO functionality has it too, but Guava's my first "go-to" library where possible. It rocks :)
In Apache Commons / IO, you can do it using IOUtils.copy(in, out):
InputStream in = new FileInputStream(myFile);
OutputStream out = new ByteArrayOutputStream();
IOUtils.copy(in, out);
But I agree with Jon Skeet, I'd rather use Guava's ByteStreams.copy(in, out)
So what Guava's ByteStreams.copy(in, out) does:
private static final int BUF_SIZE = 0x1000; // 4K
public static long copy(InputStream from, OutputStream to)
throws IOException {
checkNotNull(from);
checkNotNull(to);
byte[] buf = new byte[BUF_SIZE];
long total = 0;
while (true) {
int r = from.read(buf);
if (r == -1) {
break;
}
to.write(buf, 0, r);
total += r;
}
return total;
}
In my project I used this method:
private static void copyData(InputStream in, OutputStream out) throws Exception {
byte[] buffer = new byte[8 * 1024];
int len;
while ((len = in.read(buffer)) > 0) {
out.write(buffer, 0, len);
}
}
Alternatively to Guava one could use Apache Commons IO (old), and Apache Commons IOUtils (new as advised in the comment).
I'd use the loop, instead of importing new classes, or adding libraries to my project. The library function is probably also implemented with a loop. But that's just my personal taste.
However, my question to you: what are you trying to do? Think of the "big picture", if you want to put the entire contents of a file into a byte array, why not just do that? The size of the arrray is file.length(), and you don't need it to dynamically grow, hidden behind a ByteArrayOutputStream (unless your file is shared, and its contents can change while you read).
Another alternative: could you use a FileChannel and a ByteBuffer (java.nio)?
I have some working code in python that I need to convert to Java.
I have read quite a few threads on this forum but could not find an answer. I am reading in a JPG image and converting it into a byte array. I then write this buffer it to a different file. When I compare the written files from both Java and python code, the bytes at the end do not match. Please let me know if you have a suggestion. I need to use the byte array to pack the image into a message that needs to be sent over to a remote server.
Java code (Running on Android)
Reading the file:
File queryImg = new File(ImagePath);
int imageLen = (int)queryImg.length();
byte [] imgData = new byte[imageLen];
FileInputStream fis = new FileInputStream(queryImg);
fis.read(imgData);
Writing the file:
FileOutputStream f = new FileOutputStream(new File("/sdcard/output.raw"));
f.write(imgData);
f.flush();
f.close();
Thanks!
InputStream.read is not guaranteed to read any particular number of bytes and may read less than you asked it to. It returns the actual number read so you can have a loop that keeps track of progress:
public void pump(InputStream in, OutputStream out, int size) {
byte[] buffer = new byte[4096]; // Or whatever constant you feel like using
int done = 0;
while (done < size) {
int read = in.read(buffer);
if (read == -1) {
throw new IOException("Something went horribly wrong");
}
out.write(buffer, 0, read);
done += read;
}
// Maybe put cleanup code in here if you like, e.g. in.close, out.flush, out.close
}
I believe Apache Commons IO has classes for doing this kind of stuff so you don't need to write it yourself.
Your file length might be more than int can hold and than you end up having wrong array length, hence not reading entire file into the buffer.