How do I convert a java.io.File to a byte[]?
From JDK 7 you can use Files.readAllBytes(Path).
Example:
import java.io.File;
import java.nio.file.Files;
File file;
// ...(file is initialised)...
byte[] fileContent = Files.readAllBytes(file.toPath());
It depends on what "best" means for you. Productivity-wise, don't reinvent the wheel: use Apache Commons IO, specifically FileUtils.readFileToByteArray(File input).
Since JDK 7 - one liner:
byte[] array = Files.readAllBytes(Paths.get("/path/to/file"));
No external dependencies needed.
import java.io.RandomAccessFile;
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
Documentation for Java 8: http://docs.oracle.com/javase/8/docs/api/java/io/RandomAccessFile.html
Basically you have to read it in memory. Open the file, allocate the array, and read the contents from the file into the array.
The simplest way is something similar to this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1) {
ous.write(buffer, 0, read);
}
}finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return ous.toByteArray();
}
This has some unnecessary copying of the file content (actually the data is copied three times: from file to buffer, from buffer to ByteArrayOutputStream, from ByteArrayOutputStream to the actual resulting array).
You also need to make sure you read in memory only files up to a certain size (this is usually application dependent) :-).
You also need to treat the IOException outside the function.
Another way is this:
public byte[] read(File file) throws IOException, FileTooBigException {
if (file.length() > MAX_FILE_SIZE) {
throw new FileTooBigException(file);
}
byte[] buffer = new byte[(int) file.length()];
InputStream ios = null;
try {
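ios = new FileInputStream(file);
int offset = 0;
// read() may return fewer bytes than requested, so loop until the buffer is full
while (offset < buffer.length) {
int count = ios.read(buffer, offset, buffer.length - offset);
if (count == -1) {
throw new IOException(
"EOF reached while trying to read the whole file");
}
offset += count;
}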
} finally {
try {
if (ios != null)
ios.close();
} catch (IOException e) {
}
}
return buffer;
}
This has no unnecessary copying.
FileTooBigException is a custom application exception.
The MAX_FILE_SIZE constant is an application parameter.
For big files you should probably think about a stream processing algorithm or use memory mapping (see java.nio).
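As a rough illustration of the stream processing idea, here is a minimal sketch that computes a CRC32 checksum of a file chunk by chunk, so only one small buffer is in memory at a time. The class name ChunkedChecksum, the method name crc32Of and the 8 KB chunk size are my own choices for the example, and the checksum is just a stand-in for whatever processing you actually need.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

// Hypothetical helper: processes a file chunk by chunk instead of loading it whole.
final class ChunkedChecksum {

    static long crc32Of(Path path) throws IOException {
        CRC32 crc = new CRC32();
        byte[] chunk = new byte[8192]; // only this much file data is held at a time
        try (InputStream in = new BufferedInputStream(Files.newInputStream(path))) {
            int n;
            // read() returns the number of bytes actually read, or -1 at end of file
            while ((n = in.read(chunk)) != -1) {
                crc.update(chunk, 0, n);
            }
        }
        return crc.getValue();
    }
}
```

Memory use stays constant regardless of the file size, which is the point of processing the data as a stream.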
As someone said, Apache Commons FileUtils might have what you are looking for:
public static byte[] readFileToByteArray(File file) throws IOException
Example use (Program.java):
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
public class Program {
public static void main(String[] args) throws IOException {
File file = new File(args[0]); // assume args[0] is the path to file
byte[] data = FileUtils.readFileToByteArray(file);
...
}
}
If you don't have Java 8, and agree with me that including a massive library to avoid writing a few lines of code is a bad idea:
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] b = new byte[1024];
ByteArrayOutputStream os = new ByteArrayOutputStream();
int c;
while ((c = inputStream.read(b)) != -1) {
os.write(b, 0, c);
}
return os.toByteArray();
}
Caller is responsible for closing the stream.
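Since the caller has to close the stream, one way to call it for a file is sketched below. This assumes Java 7 or later for try-with-resources; the class name ReadBytesUsage and the wrapper readFile are hypothetical, and readBytes is repeated only so the example compiles on its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadBytesUsage {

    // Same loop as the helper above: copy the stream into a growing buffer.
    static byte[] readBytes(InputStream inputStream) throws IOException {
        byte[] b = new byte[1024];
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        int c;
        while ((c = inputStream.read(b)) != -1) {
            os.write(b, 0, c);
        }
        return os.toByteArray();
    }

    // Hypothetical wrapper: opens the file and makes sure the stream gets closed,
    // even if reading throws.
    static byte[] readFile(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            return readBytes(in);
        }
    }
}
```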
// Returns the contents of the file in a byte array.
public static byte[] getBytesFromFile(File file) throws IOException {
// Get the size of the file
long length = file.length();
// You cannot create an array using a long type.
// It needs to be an int type.
// Before converting to an int type, check
// to ensure that file is not larger than Integer.MAX_VALUE.
if (length > Integer.MAX_VALUE) {
// File is too large
throw new IOException("File is too large!");
}
// Create the byte array to hold the data
byte[] bytes = new byte[(int)length];
// Read in the bytes
int offset = 0;
int numRead = 0;
InputStream is = new FileInputStream(file);
try {
while (offset < bytes.length
&& (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
offset += numRead;
}
} finally {
is.close();
}
// Ensure all the bytes have been read in
if (offset < bytes.length) {
throw new IOException("Could not completely read file "+file.getName());
}
return bytes;
}
You can use the NIO API as well. The following code works as long as the total file size (in bytes) fits in an int.
File f = new File("c:\\wscp.script");
FileInputStream fin = null;
FileChannel ch = null;
try {
fin = new FileInputStream(f);
ch = fin.getChannel();
int size = (int) ch.size();
MappedByteBuffer buf = ch.map(MapMode.READ_ONLY, 0, size);
byte[] bytes = new byte[size];
buf.get(bytes);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
if (fin != null) {
fin.close();
}
if (ch != null) {
ch.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
I think it's very fast since it uses a MappedByteBuffer.
Simple way to do it:
File fff = new File("/path/to/file");
FileInputStream fileInputStream = new FileInputStream(fff);
// int byteLength = fff.length();
// In android the result of file.length() is long
long byteLength = fff.length(); // byte count of the file-content
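byte[] filecontent = new byte[(int) byteLength];
// a single read() may return early, so keep reading until the array is full
int offset = 0;
while (offset < filecontent.length) {
int count = fileInputStream.read(filecontent, offset, filecontent.length - offset);
if (count == -1) break; // file ended sooner than expected
offset += count;
}
fileInputStream.close();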
Simplest Way for reading bytes from file
import java.io.*;
class ReadBytesFromFile {
public static void main(String args[]) throws Exception {
// getBytes from anyWhere
// I'm getting byte array from File
File file = null;
FileInputStream fileStream = new FileInputStream(file = new File("ByteArrayInputStreamClass.java"));
// Instantiate array
byte[] arr = new byte[(int) file.length()];
// read All bytes of File stream
fileStream.read(arr, 0, arr.length);
for (int X : arr) {
System.out.print((char) X);
}
}
}
Guava has Files.toByteArray() to offer you. It has several advantages:
It covers the corner case where files report a length of 0 but still have content
It's highly optimized: if the file is too big, you get an OutOfMemoryError before it even tries to load the file (through clever use of file.length()).
You don't have to reinvent the wheel.
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
File file = getYourFile();
Path path = file.toPath();
byte[] data = Files.readAllBytes(path);
Using the same approach as the community wiki answer, but cleaner and compiling out of the box (preferred approach if you don't want to import Apache Commons libs, e.g. on Android):
public static byte[] getFileBytes(File file) throws IOException {
ByteArrayOutputStream ous = null;
InputStream ios = null;
try {
byte[] buffer = new byte[4096];
ous = new ByteArrayOutputStream();
ios = new FileInputStream(file);
int read = 0;
while ((read = ios.read(buffer)) != -1)
ous.write(buffer, 0, read);
} finally {
try {
if (ous != null)
ous.close();
} catch (IOException e) {
// swallow, since not that important
}
try {
if (ios != null)
ios.close();
} catch (IOException e) {
// swallow, since not that important
}
}
return ous.toByteArray();
}
This is one of the simplest ways:
String pathFile = "/path/to/file";
byte[] bytes = Files.readAllBytes(Paths.get(pathFile));
I believe this is the easiest way:
org.apache.commons.io.FileUtils.readFileToByteArray(file);
readFully reads b.length bytes from this file into the byte array, starting at the current file pointer. This method reads repeatedly from the file until the requested number of bytes are read. This method blocks until the requested number of bytes are read, the end of the stream is detected, or an exception is thrown.
RandomAccessFile f = new RandomAccessFile(fileName, "r");
byte[] b = new byte[(int)f.length()];
f.readFully(b);
If you want to read bytes into a pre-allocated byte buffer, this answer may help.
Your first guess would probably be to use InputStream read(byte[]). However, this method has a flaw that makes it unreasonably hard to use: there is no guarantee that the array will actually be completely filled, even if no EOF is encountered.
Instead, take a look at DataInputStream readFully(byte[]). This is a wrapper for input streams, and does not have the above mentioned issue. Additionally, this method throws when EOF is encountered. Much nicer.
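A minimal sketch of the DataInputStream approach, assuming the file fits in an int-sized array. The class name ReadFullyExample and the method name readWholeFile are made up for illustration.

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

final class ReadFullyExample {

    static byte[] readWholeFile(File file) throws IOException {
        long length = file.length();
        if (length > Integer.MAX_VALUE) {
            throw new IOException("File too large for a byte[]: " + file);
        }
        byte[] buffer = new byte[(int) length];
        try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
            // Fills the whole array, or throws EOFException if the file ends early.
            in.readFully(buffer);
        }
        return buffer;
    }
}
```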
Not only does the following way convert a java.io.File to a byte[], I also found it to be the fastest way to read in a file, when testing many different Java file reading methods against each other:
java.nio.file.Files.readAllBytes()
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
public class ReadFile_Files_ReadAllBytes {
public static void main(String [] pArgs) throws IOException {
String fileName = "c:\\temp\\sample-10KB.txt";
File file = new File(fileName);
byte [] fileBytes = Files.readAllBytes(file.toPath());
char singleChar;
for(byte b : fileBytes) {
singleChar = (char) b;
System.out.print(singleChar);
}
}
}
// The file that you want to convert into a byte[]
File file=new File("/storage/0CE2-EA3D/DCIM/Camera/VID_20190822_205931.mp4");
FileInputStream fileInputStream=new FileInputStream(file);
byte[] data=new byte[(int) file.length()];
BufferedInputStream bufferedInputStream=new BufferedInputStream(fileInputStream);
bufferedInputStream.read(data,0,data.length);
// Now the bytes of the file are contained in the "byte[] data"
Let me add another solution without using third-party libraries. It reuses an exception handling pattern that was proposed by Scott. I moved the ugly part into a separate method (I would hide it in some FileUtils class ;) )
public void someMethod() {
final byte[] buffer = read(new File("test.txt"));
}
private byte[] read(final File file) {
if (file.isDirectory())
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is a directory");
if (file.length() > Integer.MAX_VALUE)
throw new RuntimeException("Unsupported operation, file "
+ file.getAbsolutePath() + " is too big");
Throwable pending = null;
FileInputStream in = null;
final byte buffer[] = new byte[(int) file.length()];
try {
in = new FileInputStream(file);
in.read(buffer);
} catch (Exception e) {
pending = new RuntimeException("Exception occurred on reading file "
+ file.getAbsolutePath(), e);
} finally {
if (in != null) {
try {
in.close();
} catch (Exception e) {
if (pending == null) {
pending = new RuntimeException(
"Exception occurred on closing file "
+ file.getAbsolutePath(), e);
}
}
}
if (pending != null) {
throw new RuntimeException(pending);
}
}
return buffer;
}
public static byte[] readBytes(InputStream inputStream) throws IOException {
byte[] buffer = new byte[32 * 1024];
int bufferSize = 0;
for (;;) {
int read = inputStream.read(buffer, bufferSize, buffer.length - bufferSize);
if (read == -1) {
return Arrays.copyOf(buffer, bufferSize);
}
bufferSize += read;
if (bufferSize == buffer.length) {
buffer = Arrays.copyOf(buffer, bufferSize * 2);
}
}
}
Another Way for reading bytes from file
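Reader reader = null;
byte[] byteArray = null;
try {
reader = new FileReader(file);
char buf[] = new char[8192];
int len;
StringBuilder s = new StringBuilder();
while ((len = reader.read(buf)) >= 0) {
s.append(buf, 0, len);
}
// Convert once, after the whole file has been read.
// Decoding to chars and re-encoding is only safe for text files, not binary data.
byteArray = s.toString().getBytes();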
} catch(FileNotFoundException ex) {
} catch(IOException e) {
}
finally {
if (reader != null) {
reader.close();
}
}
Try this :
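// Note: sun.misc.IOUtils is an internal JDK class, not a supported public API
import sun.misc.IOUtils;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;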
try {
String path="";
InputStream inputStream=new FileInputStream(path);
byte[] data=IOUtils.readFully(inputStream,-1,false);
}
catch (IOException e) {
System.out.println(e);
}
It can be done as simply as this (Kotlin version):
val byteArray = File(path).inputStream().readBytes()
EDIT:
I've read docs of readBytes method. It says:
Reads this stream completely into a byte array.
Note: It is the caller's responsibility to close this stream.
So to be able to close the stream, while keeping everything clean, use the following code:
val byteArray = File(path).inputStream().use { it.readBytes() }
Thanks to #user2768856 for pointing this out.
Try this if your target API level is below 26:
private static byte[] readFileToBytes(String filePath) {
File file = new File(filePath);
byte[] bytes = new byte[(int) file.length()];
// if java.nio.file is available, prefer Files.readAllBytes(path)
try(FileInputStream fis = new FileInputStream(file)){
fis.read(bytes);
return bytes;
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
In JDK8
Stream<String> lines = Files.lines(path);
String data = lines.collect(Collectors.joining("\n"));
lines.close();
Hi, I have a program that downloads a file from the web and prints the progress while doing so. The problem is that the line that prints the progress slows the program down a lot. Is there any way to avoid this?
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
public class download {
public static void main(String[] args) {
try{
URL u = new URL("http://upload.wikimedia.org/wikipedia/commons/1/16/Appearance_of_sky_for_weather_forecast,_Dhaka,_Bangladesh.JPG");
FileOutputStream fos = new FileOutputStream("C://Users/xxx/Desktop/test.jpg");
InputStream is = u.openStream();
long size = u.openConnection().getContentLengthLong();
int data;
long done = 0;
while((data = is.read())!=-1){
double progress = (double) (done)/(double)(size)*100;
System.out.println(progress); // if we don't do this, it completes fast
fos.write(data);
done++;
}
fos.close();
}catch(Exception e){
e.printStackTrace();
}
}
}
First of all, every I/O operation is expensive, and you're printing a message for every byte read! (See InputStream#read.)
If you want/need to print the progress, do it for a bunch of KBs read, usually every 4 KBs. You can do this by using a byte[] buffer to read and write the data from the streams.
BufferedInputStream input = null;
BufferedOutputStream output = null;
final int DEFAULT_BUFFER_SIZE = 4 * 1024;
try {
input = new BufferedInputStream(is, DEFAULT_BUFFER_SIZE);
output = new BufferedOutputStream(fos, DEFAULT_BUFFER_SIZE);
byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
int length;
while ((length = input.read(buffer)) > 0) {
output.write(buffer, 0, length);
done += length;
double progress = (double) (done)/(double)(size)*100;
System.out.println(progress);
}
} catch (IOException e) {
//log your exceptions...
} finally {
closeResource(output);
closeResource(input);
}
And have this closeResource method:
public void closeResource(Closeable resource) {
if (resource != null) {
try {
resource.close();
} catch (IOException e) {
logger.error("Error while closing the resource.", e);
}
}
}
Try only printing out every xth loop.
if(done % 10 == 0) System.out.println(progress);
You can print the line only if (done % 100 == 0) let's say.
Also, you can use a buffered way of reading, which would speed the program up.
Suggestion: don't print the progress with every iteration of the loop. Use a counter, decide on a reasonable frequency, a number to mod the counter by, and print the progress at that selected frequency.
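A minimal sketch of that idea, combined with a buffered copy: progress is printed only when the whole-number percentage changes, so there are at most 101 print calls no matter how large the download is. The class name ThrottledProgress and the method copyWithProgress are hypothetical, and totalSize is assumed to be known and positive (for example from the Content-Length header).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ThrottledProgress {

    // Copies in to out and prints progress at most once per whole percent.
    // Returns how many progress lines were printed.
    static int copyWithProgress(InputStream in, OutputStream out, long totalSize)
            throws IOException {
        byte[] buffer = new byte[4096];
        long done = 0;
        long lastPercent = -1; // nothing printed yet
        int reports = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            done += n;
            long percent = done * 100 / totalSize;
            if (percent != lastPercent) { // print only when the value changes
                System.out.println(percent + "%");
                lastPercent = percent;
                reports++;
            }
        }
        return reports;
    }
}
```

Comparing against the last printed value instead of using done % x avoids missing a report when a chunk boundary skips over an exact multiple.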
I've implemented Huffman coding in Java that works on byte data from an input file. However, it only works when compressing ASCII. I'd like to extend it so that it can deal with characters that are more than 1 byte long, but I'm not sure exactly how to do this.
private static final int CHARS = 256;
private int [] getByteFrequency(File f) throws FileNotFoundException {
try {
FileInputStream fis = new FileInputStream(f);
byte [] bb = new byte[(int) f.length()];
int [] aa = new int[CHARS];
if(fis.read(bb) == bb.length) {
System.out.print("Uncompressed data: ");
for(int i = 0; i < bb.length; i++) {
System.out.print((char) bb[i]);
aa[bb[i]]++;
}
System.out.println();
}
return aa;
} catch (FileNotFoundException e) { throw new FileNotFoundException();
} catch (IOException e) { e.printStackTrace(); }
return null;
}
For example, this is what I'm using to get the frequency of the characters in the file, and obviously it only works on a single byte. If I give it a unicode file, I get an ArrayIndexOutOfBoundsException at aa[bb[i]]++;, and i is normally a negative number. I know this is because aa[bb[i]]++; is only looking at one byte, and the unicode character will be more than one, but I'm not sure on how I can change it.
Can anybody give me some pointers?
Try the following:
private static final int CHARS = 256;
private int [] getByteFrequency(File f) throws FileNotFoundException {
try {
FileInputStream fis = new FileInputStream(f);
byte [] bb = new byte[(int) f.length()];
int [] aa = new int[CHARS];
if(fis.read(bb) == bb.length) {
System.out.print("Uncompressed data: ");
for(int i = 0; i < bb.length; i++) {
System.out.print((char) bb[i]);
aa[((int)bb[i])&0xff]++;
}
System.out.println();
}
return aa;
} catch (FileNotFoundException e) { throw new FileNotFoundException();
} catch (IOException e) { e.printStackTrace(); }
return null;
}
If I'm correct (I haven't tested it), your problem is that byte is a signed type in Java, so any byte value above 0x7F becomes a negative array index. Casting to int and masking with 0xff should handle it correctly.
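To show the masking in isolation, here is a small sketch that counts byte frequencies from an in-memory array. The class name ByteFrequency and the method countBytes are just names picked for the example.

```java
final class ByteFrequency {

    // Counts how often each of the 256 possible byte values occurs.
    static int[] countBytes(byte[] data) {
        int[] counts = new int[256];
        for (byte b : data) {
            // b is in -128..127; b & 0xFF maps it onto 0..255,
            // so bytes of multi-byte UTF-8 sequences get a valid index.
            counts[b & 0xFF]++;
        }
        return counts;
    }
}
```

For example, the UTF-8 encoding of the euro sign is the three bytes 0xE2 0x82 0xAC. As a Java byte, 0xE2 is -30, which is what caused the ArrayIndexOutOfBoundsException; after masking it is 226 again.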
I have an object with 1 int and 4 doubles.
I compared the performance of writing 5 million of these objects to a file using serialization and using a FileChannel.
With serialization I used the following methods to read and write the file:
public void print() throws IOException, ClassNotFoundException{
ObjectInputStream input = new ObjectInputStream(new FileInputStream(this.filePath) );
try {
while(true) {
this.sb = (Sbit) input.readObject();
//System.out.println(this.sb.toString());
}
}
catch ( EOFException eofException ) {
return;
}
catch (IOException ioException) {
System.exit( 1 );
}
finally {
if( input != null )
input.close();
}
}
public void build() throws IOException {
ObjectOutputStream output = new ObjectOutputStream( new FileOutputStream(this.filePath) );
try {
Random random = new Random();
for (int i = 0; i<5000000; i++) {
this.sb = new Sbit();
this.sb.setKey(i);
this.sb.setXMin( random.nextDouble() );
this.sb.setXMax( random.nextDouble() );
this.sb.setYMin( random.nextDouble() );
this.sb.setYMax( random.nextDouble() );
output.writeObject(this.sb);
}
}
catch (IOException ioException) {
System.exit( 1 );
}
finally {
try {
if( output != null)
output.close();
}
catch ( Exception exception ) {
exception.printStackTrace();
System.exit(1);
}
}
}
The java.nio version was:
public void print() throws IOException {
FileChannel file = new RandomAccessFile(this.filePath, "rw").getChannel();
ByteBuffer[] buffers = new ByteBuffer[5];
buffers[0] = ByteBuffer.allocate(4); // 4 bytes to int
buffers[1] = ByteBuffer.allocate(8); // 8 bytes to double
buffers[2] = ByteBuffer.allocate(8);
buffers[3] = ByteBuffer.allocate(8);
buffers[4] = ByteBuffer.allocate(8);
while (true) {
if(file.read(buffers[0]) == -1 ) // Read the int,
break; // if its EOF exit the loop
buffers[0].flip();
this.sb = new Sbit();
this.sb.setKey(buffers[0].getInt());
if(file.read(buffers[1]) == -1) { // Read the int primary value
assert false; // Should not get here!
break; // Exit loop on EOF
}
buffers[1].flip();
this.sb.setXMin( buffers[1].getDouble() );
if(file.read(buffers[2]) == -1) {
assert false;
break;
}
buffers[2].flip();
this.sb.setXMax( buffers[2].getDouble() );
if(file.read(buffers[3]) == -1) {
assert false;
break;
}
buffers[3].flip();
this.sb.setYMin( buffers[3].getDouble() );
if(file.read(buffers[4]) == -1) {
assert false;
break;
}
buffers[4].flip();
this.sb.setYMax( buffers[4].getDouble() );
for(int i = 0; i < 5; i++)
buffers[i].clear();
}
}
public void build() throws IOException {
FileChannel file = new RandomAccessFile(this.filePath, "rw").getChannel();
Random random = new Random();
for (int i = 0; i<5000000; i++) {
this.sb = new Sbit();
this.sb.setKey(i);
this.sb.setXMin( random.nextDouble() );
this.sb.setXMax( random.nextDouble() );
this.sb.setYMin( random.nextDouble() );
this.sb.setYMax( random.nextDouble() );
ByteBuffer[] buffers = new ByteBuffer[5];
buffers[0] = ByteBuffer.allocate(4); // 4 bytes to int
buffers[1] = ByteBuffer.allocate(8); // 8 bytes to double
buffers[2] = ByteBuffer.allocate(8);
buffers[3] = ByteBuffer.allocate(8);
buffers[4] = ByteBuffer.allocate(8);
buffers[0].putInt(this.sb.getKey()).flip();
buffers[1].putDouble(this.sb.getXMin()).flip();
buffers[2].putDouble(this.sb.getXMax()).flip();
buffers[3].putDouble(this.sb.getYMin()).flip();
buffers[4].putDouble(this.sb.getYMax()).flip();
try {
file.write(buffers);
}
catch (IOException e) {
e.printStackTrace(System.err);
System.exit(1);
}
for(int x = 0; x < 5; x++)
buffers[x].clear();
}
}
I read a lot about java.nio and tried it precisely because it is supposed to perform better, but that's not what happened in my case.
Writing the file (java.nio):
file size: 175 MB
time in milliseconds: 57638
Using serialization:
file size: 200 MB
time in milliseconds: 34504
Reading the file (java.nio):
time in milliseconds: 78172
Using serialization:
time in milliseconds: 35288
Am I doing something wrong with java.nio? I would like to keep writing the same binary files as I do now. Is there a more efficient way to write the file? Is serializing the objects actually the best way?
Thank you.
You are creating 25,000,000 ByteBuffer objects, each holding at most 8 bytes. That's very inefficient.
Create just one ByteBuffer of 36 bytes (4 for the int plus 4 × 8 for the doubles) outside the loop (before the for statement).
Inside the loop you can use the same ByteBuffer as follows:
buffer.clear();
buffer.putInt(this.sb.getKey());
buffer.putDouble(this.sb.getXMin());
buffer.putDouble(this.sb.getXMax());
buffer.putDouble(this.sb.getYMin());
buffer.putDouble(this.sb.getYMax());
buffer.flip();
try
{
// write() may not drain the buffer in one call
while (buffer.hasRemaining())
{
file.write(buffer);
}
}
catch (IOException ex)
{
ex.printStackTrace();
//etc...
}
Try it out and let us know if you see any improvements.
Instead of using multiple ByteBuffers, declare a single byte buffer that is large enough to hold all of the data you want to put into it. Then put data into it just like you are now. When done, flip the buffer and write it out. When you are ready to read it back in, read the data from disk into the byte buffer, flip it, and then read the data out using getInt/getDouble.
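Roughly, that could look like the sketch below. It is not the asker's Sbit class: the class name RecordFile, the methods write and sumKeys, the batch size of 1024 records and the dummy values are all assumptions made for the example. Each record is 36 bytes (one int plus four doubles), and one buffer is reused for every batch.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class RecordFile {
    static final int RECORD_SIZE = Integer.BYTES + 4 * Double.BYTES; // 4 + 32 = 36 bytes

    // Writes `count` records; record i holds key i and four dummy doubles.
    static void write(Path path, int count) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(RECORD_SIZE * 1024); // reused for every batch
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            for (int i = 0; i < count; i++) {
                buffer.putInt(i).putDouble(i + 0.1).putDouble(i + 0.2)
                      .putDouble(i + 0.3).putDouble(i + 0.4);
                if (buffer.remaining() < RECORD_SIZE) {
                    flush(ch, buffer); // buffer is full, write the batch out
                }
            }
            flush(ch, buffer); // write whatever is left
        }
    }

    private static void flush(FileChannel ch, ByteBuffer buffer) throws IOException {
        buffer.flip();
        while (buffer.hasRemaining()) {
            ch.write(buffer);
        }
        buffer.clear();
    }

    // Reads the records back and returns the sum of all keys.
    static long sumKeys(Path path) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(RECORD_SIZE * 1024);
        long sum = 0;
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            while (ch.read(buffer) != -1) {
                buffer.flip();
                while (buffer.remaining() >= RECORD_SIZE) {
                    sum += buffer.getInt();
                    // skip the four doubles; a real reader would call getDouble() here
                    buffer.position(buffer.position() + 4 * Double.BYTES);
                }
                buffer.compact(); // keep any partial record for the next read
            }
        }
        return sum;
    }
}
```

Batching many records per write call reduces the number of system calls, which is usually where the time goes when each record is written separately.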
I haven't tried to serialize stuff on my own, but have achieved good results with kryo. It is a lot faster than standard Java serialization.