Using Base64 from Apache Commons
public byte[] encode(File file) throws FileNotFoundException, IOException {
    byte[] encoded;
    try (FileInputStream fin = new FileInputStream(file)) {
        byte fileContent[] = new byte[(int) file.length()];
        fin.read(fileContent);
        encoded = Base64.encodeBase64(fileContent);
    }
    return encoded;
}
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
at org.apache.commons.codec.binary.BaseNCodec.encode(BaseNCodec.java:342)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:657)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:622)
at org.apache.commons.codec.binary.Base64.encodeBase64(Base64.java:604)
I'm making a small app for a mobile device.
You cannot just load the whole file into memory, like here:
byte fileContent[] = new byte[(int) file.length()];
fin.read(fileContent);
Instead, load the file chunk by chunk and encode it in parts. Base64 is a simple encoding: it is enough to load 3 bytes and encode them at a time (this produces 4 bytes after encoding). For performance reasons, load multiples of 3 bytes, e.g. 3000 bytes at a time; that should be just fine. Also consider buffering the input file.
An example:
byte[] fileContent = new byte[3000];
try (FileInputStream fin = new FileInputStream(file)) {
    int len;
    while ((len = fin.read(fileContent)) != -1) {
        // encode only the bytes actually read; the encoded chunk still
        // has to be written somewhere, e.g. to an OutputStream (see below)
        Base64.encodeBase64(Arrays.copyOf(fileContent, len));
    }
}
Note that you cannot simply append the results of Base64.encodeBase64() to one big encoded byte array. In fact, it is not loading the file but encoding it to Base64 that causes the out-of-memory problem. This is understandable, because the Base64 version is bigger (and you already have the file itself occupying a lot of memory).
Consider changing your method to:
public void encode(File file, OutputStream base64OutputStream)
and sending Base64-encoded data directly to the base64OutputStream rather than returning it.
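For example, the suggested method could be implemented with the chunked loop from above (a minimal sketch using commons-codec's Base64; the 3 * 1024 buffer size is an arbitrary multiple of 3, and it assumes every read except the last fills the buffer, which holds in practice for local files):
public void encode(File file, OutputStream base64OutputStream) throws IOException {
    byte[] chunk = new byte[3 * 1024]; // a multiple of 3, so full chunks encode without padding
    try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
        int len;
        while ((len = in.read(chunk)) != -1) {
            // encode only the bytes actually read and stream them out immediately
            base64OutputStream.write(Base64.encodeBase64(Arrays.copyOf(chunk, len)));
        }
    }
}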
UPDATE: Thanks to @StephenC I developed a much easier version:
public void encode(File file, OutputStream base64OutputStream) throws IOException {
    InputStream is = new FileInputStream(file);
    OutputStream out = new Base64OutputStream(base64OutputStream);
    IOUtils.copy(is, out);
    is.close();
    out.close();
}
It uses Base64OutputStream, which translates its input to Base64 on the fly, and the IOUtils class from Apache Commons IO.
Note: you must close the FileInputStream and Base64OutputStream explicitly so that the trailing = padding is written if required; buffering is handled by IOUtils.copy().
Either the file is too big, or your heap is too small, or you've got a memory leak.
If this only happens with really big files, put something into your code to check the file size and reject files that are unreasonably big.
If this happens with small files, increase your heap size by using the -Xmx command line option when you launch the JVM. (If this is in a web container or some other framework, check the documentation on how to do it.)
If the problem recurs, especially with small files, the chances are that you've got a memory leak.
The other point that should be made is that your current approach entails holding two complete copies of the file in memory. You should be able to reduce the memory usage, though you'll typically need a stream-based Base64 encoder to do this. (It depends on which flavor of the base64 encoding you are using ...)
This page describes a stream-based Base64 encoder / decoder library, and includes links to some alternatives.
Well, do not do it for the whole file at once.
Base64 works on 3 bytes at a time, so you can read your file in batches of "multiple of 3" bytes, encode them and repeat until you finish the file:
// Rough estimate of the encoded size: 4 output bytes for every 3 input bytes
StringBuilder sb = new StringBuilder((int) (file.length() / 3 * 4));
try (FileInputStream fin = new FileInputStream("some.file")) {
    // Buffer size: a multiple of 3, so each full chunk encodes without padding
    int bSize = 3 * 512;
    byte[] buf = new byte[bSize];
    int len;
    while ((len = fin.read(buf)) != -1) {
        // Encode only the bytes actually read
        byte[] encoded = Base64.encodeBase64(Arrays.copyOf(buf, len));
        // Although you might want to write the encoded bytes to another
        // stream, otherwise you'll run into the same problem again.
        sb.append(new String(encoded, StandardCharsets.US_ASCII));
    }
}
String base64EncodedFile = sb.toString();
You are not reading the whole file, just the first few kilobytes. The read method returns how many bytes were actually read; you should call read in a loop until it returns -1 to be sure that you have read everything.
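For illustration, a sketch of such a loop that fills the buffer completely (using the fileContent array and fin stream from the question):
int off = 0;
while (off < fileContent.length) {
    int n = fin.read(fileContent, off, fileContent.length - off);
    if (n == -1) break; // reached end of file early
    off += n;
}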
The file is too big for both it and its base64 encoding to fit in memory. Either
process the file in smaller pieces or
increase the memory available to the JVM with the -Xmx switch, e.g.
java -Xmx1024M YourProgram
This code works well for uploading a larger image:
bitmap = Bitmap.createScaledBitmap(bitmap, 100, 100, true); // scale the image down first
ByteArrayOutputStream stream = new ByteArrayOutputStream();
bitmap.compress(Bitmap.CompressFormat.PNG, 100, stream); // compress to whichever format you want
byte[] byte_arr = stream.toByteArray();
String image_str = Base64.encodeBytes(byte_arr);
Well, it looks like your file is too large to keep the multiple copies necessary for an in-memory Base64 encoding in the available heap memory at the same time. Given that this is for a mobile device, it's probably not possible to increase the heap, so you have two options:
make the file smaller (much smaller), or
do it in a stream-based way, so that you're reading from an InputStream one small part of the file at a time, encoding it and writing it to an OutputStream, without ever keeping the entire file in memory.
In the manifest, inside the application tag, add the following:
android:largeHeap="true"
It worked for me.
Java 8 added Base64 methods, so Apache Commons is no longer needed to encode large files.
public static void encodeFileToBase64(String inputFile, String outputFile) {
    try (OutputStream out = Base64.getEncoder().wrap(new FileOutputStream(outputFile))) {
        Files.copy(Paths.get(inputFile), out);
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
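The decoding direction is symmetric. A minimal sketch using the same java.util.Base64 API (the method name is mine):
public static void decodeBase64ToFile(String inputFile, String outputFile) {
    try (InputStream in = Base64.getDecoder().wrap(new FileInputStream(inputFile))) {
        Files.copy(in, Paths.get(outputFile), StandardCopyOption.REPLACE_EXISTING);
    } catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}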
Related
I am processing very large files (> 2 GiB). Each input file is Base64 encoded, and I am outputting to new files after decoding. Depending on the buffer size (LARGE_BUF) and for a given input file, my input-to-output conversion either works fine, is missing one or more bytes, or throws an exception at the outputStream.write line (IllegalArgumentException: Last unit does not have enough bits). Here is the code snippet (I could not cut and paste it, so it may not be perfect):
...
final int LARGE_BUF = 1024;
byte[] inBuf = new byte[LARGE_BUF];
try (InputStream inputStream = new FileInputStream(inFile);
     OutputStream outStream = new FileOutputStream(outFile)) {
    for (int len; (len = inputStream.read(inBuf)) > 0; ) {
        String out = new String(inBuf, 0, len);
        outStream.write(Base64.getMimeDecoder().decode(out.getBytes()));
    }
}
For instance, for my sample input file, if LARGE_BUF is 1024, the output file is 4 bytes too small; if it is 2*1024, I get the exception mentioned above; if it is 7*1024, it works correctly. Grateful for any ideas. Thank you.
First, you are converting bytes into a String, then immediately back into bytes. So, remove the use of String entirely.
Second, base64 encoding turns each sequence of three bytes into four bytes, so when decoding, you need four bytes to properly decode three bytes of original data. It is not safe to create a new decoder for each arbitrarily read sequence of bytes, which may or may not have a length which is an exact multiple of four.
Finally, Base64.Decoder has a wrap(InputStream) method which makes this considerably easier:
// use the MIME decoder here too, since the input may contain line separators
try (InputStream inputStream = Base64.getMimeDecoder().wrap(
        new BufferedInputStream(
            Files.newInputStream(Paths.get(inFile))))) {
    Files.copy(inputStream, Paths.get(outFile));
}
Is there any other way to convert a file (PDF) into a byte array other than using FileInputStream or toByteArray(InputStream input)?
Is there any method to convert it directly? I found Files.readAllBytes in the java.nio.file package, but it gives a ClassNotFoundException in my RAD. I have the Java 8 JDK on my system.
Is the java.nio.file package not available in Java 8?
My requirement is that I should not stream the file using an InputStream.
It seems like you're loading the whole content of your file into a byte[] directly in memory, then writing it to the OutputStream. The problem with this approach is that if you load files of 1 or 2 GB entirely into memory, you will quickly run into an OutOfMemoryError. To avoid this, you should read the data from the InputStream in small chunks and write those chunks to the output stream. Here's an example for file downloading:
BufferedInputStream bis = new BufferedInputStream(
        new FileInputStream(new File("/path/to/folder", "file.pdf")));
ServletOutputStream outStream = response.getOutputStream();
// declared as a variable to make it easier to change to 8 or 16 KB;
// run some tests to determine the best size for your case
int FILE_CHUNK_SIZE = 1024 * 4;
byte[] chunk = new byte[FILE_CHUNK_SIZE];
int bytesRead = 0;
while ((bytesRead = bis.read(chunk)) != -1) {
    outStream.write(chunk, 0, bytesRead);
}
bis.close();
outStream.flush();
outStream.close();
I'm trying to make a file-to-hexadecimal converter (input file -> output hex string of the file).
The code I came up with is:
static String open2(String path) throws FileNotFoundException, IOException, OutOfMemoryError {
    System.out.println("BEGIN LOADING FILE");
    StringBuilder sb = new StringBuilder();
    //sb.ensureCapacity(2147483648);
    int size = 262144;
    FileInputStream f = new FileInputStream(path);
    FileChannel ch = f.getChannel();
    byte[] barray = new byte[size];
    ByteBuffer bb = ByteBuffer.wrap(barray);
    while (ch.read(bb) != -1) {
        //System.out.println(sb.capacity());
        sb.append(bytesToHex(barray));
        bb.clear();
    }
    System.out.println("FILE LOADED; BRING IT BACK");
    return sb.toString();
}
I am sure that "path" is a valid filename.
The problem is that with big files (>= 500 MB), the JVM throws an OutOfMemoryError: Java heap space at the StringBuilder.append call.
To create this code I followed some tips from http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly but I hit a problem when I tried to pre-allocate space for the StringBuilder sb: "2147483648 is too big for an int".
If I want to use this code even with very big files (let's say up to 2 GB if I really have to stop somewhere), what's the best way to output a hexadecimal string conversion of the file in terms of speed?
I'm now working on writing the converted string to a file. The problem is that the leftover contents of the buffer get written to the file after the EOF of the original one.
static String open3(String path) throws FileNotFoundException, IOException {
    System.out.println("BEGIN LOADING FILE (Hope this is the last change)");
    FileWriter fos = new FileWriter("HEXTMP");
    int size = 262144;
    FileInputStream f = new FileInputStream(path);
    FileChannel ch = f.getChannel();
    byte[] barray = new byte[size];
    ByteBuffer bb = ByteBuffer.wrap(barray);
    while (ch.read(bb) != -1) {
        fos.write(bytesToHex(barray));
        bb.clear();
    }
    System.out.println("FILE LOADED; BRING IT BACK");
    return "HEXTMP";
}
Obviously the HEXTMP file created has a size that is a multiple of 256 KB, but if the input file is 257 KB the output will be a 512 KB file with a LOT of "00" at the end.
I know I just have to create a last byte array with a cut-down length.
(I used a FileWriter because I wanted to write the hex string; otherwise it would have just copied the file as-is.)
Why are you loading the complete file?
You can load a few bytes at a time into a buffer from the input file, process the bytes in the buffer, then write the processed bytes to the output file. Repeat until all bytes from the input file have been processed.
FileInputStream fis = new FileInputStream("in file");
FileOutputStream fos = new FileOutputStream("out");
byte[] buffer = new byte[8192];
while (true) {
    int count = fis.read(buffer);
    if (count == -1)
        break;
    byte[] processed = processBytesToConvert(buffer, count);
    fos.write(processed);
}
fis.close();
fos.close();
So just read a few bytes into the buffer, convert them to a hex string, get the bytes of the converted hex string, write those bytes back to the file, and continue with the next few input bytes.
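The bytesToHex helper used in the question isn't shown; a minimal sketch of one possible version, with an explicit length parameter (which also avoids the trailing zeros the asker ran into), might be:
private static final char[] HEX = "0123456789abcdef".toCharArray();

// Convert the first len bytes of the buffer to a hex string.
static String bytesToHex(byte[] bytes, int len) {
    StringBuilder sb = new StringBuilder(len * 2);
    for (int i = 0; i < len; i++) {
        sb.append(HEX[(bytes[i] >> 4) & 0xF]);
        sb.append(HEX[bytes[i] & 0xF]);
    }
    return sb.toString();
}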
The problem here is that you try to read the whole file and store it in memory.
You should use streams: read a chunk of your input file, convert it, and write it to the output file. That way your program can scale, whatever the size of the input file is.
The key is to read the file in chunks instead of reading all of it in one go. Depending on the use case you can vary the size of the chunk. For example, if you are making a hex viewer / editor, determine how much content is shown in the viewport and read only that much data from the file. Or, if you are simply converting and dumping hex to another file, use any chunk size that is small enough to fit in memory but big enough for performance; this should be tunable over a few runs. Perhaps use filesystem NIO in Java 7 so that you can do all three tasks - reading, processing and writing - concurrently. The link included in the question gives a good primer on reading files.
I have a FileInputStream which has 200 MB of data. I have to retrieve the bytes from the input stream.
I'm using the code below to convert the InputStream into a byte array.
private byte[] convertStreamToByteArray(InputStream inputStream) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try {
        int i;
        // compare against -1, not > 0, or the copy stops at the first zero byte
        while ((i = inputStream.read()) != -1) {
            bos.write(i);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return bos.toByteArray();
}
I'm getting an OutOfMemory exception while converting such large data to a byte array.
Kindly let me know any possible solutions to convert an InputStream to a byte array.
Why do you want to hold the 200 MB file in memory? What are you going to do with the byte array?
If you are going to write it to an OutputStream, get the OutputStream ready first, then read the InputStream a chunk at a time, writing the chunk to the OutputStream as you go. You'll never store more than the chunk in memory.
e.g.:
public static void pipe(InputStream is, OutputStream os) throws IOException {
    int read = -1;
    byte[] buf = new byte[1024];
    try {
        while ((read = is.read(buf)) != -1) {
            os.write(buf, 0, read);
        }
    } finally {
        is.close();
        os.close();
    }
}
This code will take two streams and pipe one to the other.
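Usage is then just (with hypothetical file names):
pipe(new FileInputStream("in.dat"), new FileOutputStream("out.dat"));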
An Android application has a limited heap size, which depends on the device. Currently most new devices have 64 MB, but it can be more or less depending on the manufacturer; I have seen devices come with a 128 MB heap.
So what does this really mean?
It simply means that, regardless of the available physical memory, your application is not allowed to grow beyond the allocated heap size.
From Android API level 11 you can request additional memory by using the manifest attribute android:largeHeap="true", which will roughly double your heap size. That simply means that if your device has 64 MB you will get 128 MB, and in the case of 128 MB you will get 256 MB. This will not work for lower API versions.
I am not exactly sure what your requirement is, but if you are planning to send the data over HTTP, then read part of the file, send the data, and read again. You can follow the same procedure for file I/O as well. Just make sure not to use more memory than the available heap size; to be extra cautious, leave some room for application execution.
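As a rough sketch of the HTTP case (the URL and buffer size here are placeholders): chunked streaming mode keeps HttpURLConnection from buffering the whole request body in memory.
HttpURLConnection conn = (HttpURLConnection) new URL("http://example.com/upload").openConnection();
conn.setDoOutput(true);
conn.setChunkedStreamingMode(8192); // don't buffer the whole body in memory
try (InputStream in = new FileInputStream(file);
     OutputStream out = conn.getOutputStream()) {
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n); // send each chunk as soon as it is read
    }
}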
Your problem is not how to convert an InputStream to a byte array, but that the array is too big to fit in memory. You don't have much choice but to find a way to process the bytes from the InputStream in smaller blocks.
You'll probably need to massively increase the heap size. Try running your Java virtual machine with the -Xms384m -Xmx384m flags (which specify a starting and maximum heap size of 384 megabytes, unless I'm wrong). See this for an old version of the available options; depending on the specific virtual machine and platform you may need to do some digging around, but -Xms and -Xmx should get you over that hump.
Now, you probably really SHOULDN'T read it into a byte array, but if that's what your application needs, then...
Try this code:
private byte[] convertStreamToByteArray(InputStream inputStream) throws IOException {
    ByteArrayOutputStream byteOutStream = new ByteArrayOutputStream();
    byte[] buffer = new byte[2024];
    int readByte;
    while ((readByte = inputStream.read(buffer)) != -1) {
        // write only the bytes actually read, not the whole buffer
        byteOutStream.write(buffer, 0, readByte);
    }
    inputStream.close();
    byteOutStream.flush();
    byteOutStream.close();
    return byteOutStream.toByteArray();
}
Try to read chunks of data from the InputStream.
I am currently analyzing firmware images which contain many different sections, one of which is a GZIP section.
I am able to locate the start of the GZIP section using the magic number and GZIPInputStream in Java.
However, I need to know the compressed size of the GZIP section; GZIPInputStream gives me only the uncompressed data, not the compressed length.
Does anybody have an idea?
You can count the number of byte read using a custom InputStream. You would need to force the stream to read one byte at a time to ensure you don't read more than you need.
You can wrap your current InputStream in this
class CountingInputStream extends InputStream {
    final InputStream is;
    int counter = 0;

    public CountingInputStream(InputStream is) {
        this.is = is;
    }

    public int read() throws IOException {
        int read = is.read();
        if (read >= 0) counter++;
        return read;
    }
}
and then wrap it in a GZIPInputStream. The field counter will hold the number of bytes read.
To use this with BufferedInputStream you can do
InputStream is = new BufferedInputStream(new FileInputStream(filename));
// read some data or skip to where you want to start.
CountingInputStream cis = new CountingInputStream(is);
GZIPInputStream gzis = new GZIPInputStream(cis);
// read some compressed data
cis.read(...);
int dataRead = cis.counter;
In general, there is no easy way to tell the size of the gzipped data, other than just going through all the blocks.
gzip is a stream compression format, meaning that all the compressed data is written in a single pass. There is no way to stash the compressed size anywhere: it can't be in the header, since that would require more than one pass, and it's useless to put it in the trailer, since if you can locate the trailer, then you already know the compressed size.
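Putting this together with the CountingInputStream idea above, one way to measure a member's compressed size is simply to decompress it fully through the counting stream and read off the counter. A sketch; note that GZIPInputStream buffers reads from the underlying stream (and peeks ahead when checking for a concatenated member), so pass a buffer size of 1 to keep the count as tight as possible:
CountingInputStream cis = new CountingInputStream(is); // is positioned at the gzip magic number
GZIPInputStream gzis = new GZIPInputStream(cis, 1);    // buffer size 1: read the underlying stream byte by byte
byte[] scratch = new byte[8192];
while (gzis.read(scratch) != -1) {
    // discard the decompressed bytes; only the compressed byte count matters
}
int compressedSize = cis.counter; // approximately the compressed size of the member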