I need to read multiple small files and append them into a bigger single file.
Base64OutputStream baos = new Base64OutputStream(new FileOutputStream(outputFile, true));
for (String fileLocation : fileLocations) {
InputStream fis = null;
try
{
fis = new FileInputStream(new File(fileLocation));
int bytesRead = 0;
byte[] buf = new byte[65536];
while ((bytesRead=fis.read(buf)) != -1) {
if (bytesRead > 0) baos.write(buf, 0, bytesRead);
}
}
catch (Exception e) {
logger.error(e.getMessage());
}
finally{
try{
if(fis != null)
fis.close();
}
catch(Exception e){
logger.error(e.getMessage());
}
}
}
All pretty standard, but I'm finding that, unless I open a new baos per input file (include it inside the loop), all the files following the first one written by baos are wrong (incorrect output).
The questions:
I've been told that repeatedly opening and closing an output stream for the same resource is not good practice. Why?
Why does a single output stream not deliver the same result as multiple separate ones?
Perhaps the problem is that you are assuming that Base64-encoding the concatenation of several files gives the same result as concatenating the Base64 encoding of each file? That's not necessarily the case: Base64 encodes each group of three consecutive input bytes as 4 ASCII characters, so unless every file's size is a multiple of three, the two approaches group the bytes differently and produce different output from the first file boundary onward.
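To make the point above concrete, here is a minimal sketch using java.util.Base64. That is an assumption: the question's Base64OutputStream comes from a different library, although the 3-byte grouping is part of Base64 itself. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64ConcatDemo {
    // Encodes the bytes with the standard Base64 alphabet, including '=' padding.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Joins two byte arrays into one, like appending one file to another.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    public static void main(String[] args) {
        // Two stand-in "files" of 2 bytes each, so neither size is a multiple of 3.
        byte[] first = "ab".getBytes(StandardCharsets.US_ASCII);
        byte[] second = "cd".getBytes(StandardCharsets.US_ASCII);

        // One encoder per file: each piece is padded on its own.
        System.out.println(encode(first) + encode(second));
        // One encoder for everything: bytes are regrouped across the file boundary.
        System.out.println(encode(concat(first, second)));
    }
}
```

With these inputs the first line prints YWI=Y2Q= and the second prints YWJjZA==. Both are valid, but they are different documents: the first is two Base64 streams back to back, the second is one stream over the joined bytes.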
Related
We were given a few exercises in lab, and one of these is to convert the file transfer method from FileInputStream to BufferedInputStream. It's a client sending a GET request to a web server, which sends back the requested file.
I came up with a simple solution, and I just wanted to check if it's correct.
Original code:
try {
FileInputStream fis = new FileInputStream(req);
// req, String containing file name
byte[] data = new byte [fis.available()];
fis.read(data);
out.write(data); // OutputStream out = socket.getOutputStream();
} catch (FileNotFoundException e){
new PrintStream(out).println("404 Not Found");
}
My try:
try {
BufferedInputStream bis = new BufferedInputStream (new FileInputStream(req));
byte[] data = new byte[4];
while(bis.read(data) > -1) {
out.write(data);
data = new byte[4];
}
} catch (FileNotFoundException e){
new PrintStream(out).println("404 Not Found");
}
The file is a web page named index.html, which contains a simple html page.
I have to reallocate the array every time, because at the last execution of the while loop, if the file isn't a multiple of 4 in size, the data array will contain characters from the previous execution, which are shown in the browser.
I chose 4 as data size for debugging purposes.
Output is correct.
Is this a good solution or can I do better?
There's no need to re-create the byte array each time - just overwrite it. More importantly though, you have a conceptual mistake inside your loop. Each iteration just writes the array to the stream assuming it's all valid. If you examine BufferedInputStream#read's documentation you'll see it may not read enough data to fill the entire array, and will return the number of bytes it actually read. You should use this number to limit the amount of bytes you're writing:
int len; // Java does not allow declaring a variable inside the loop condition
while ((len = bis.read(data)) > -1) {
    out.write(data, 0, len);
}
I suggest you close off your file once you are done. The BufferedInputStream uses an 8 KB buffer by default which you are reducing to a smaller buffer. A simpler solution is to copy 8 KB at a time and not use the added buffer
try (InputStream in = new FileInputStream(req)) {
    byte[] data = new byte[8 << 10]; // 8 KB
    for (int len; (len = in.read(data)) > -1; )
        out.write(data, 0, len);
} catch (IOException e) {
    out.write("404 Not Found\n".getBytes());
}
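The loop from the answers above can be packaged as a small helper. This is only a sketch: the class name, the copy signature and the sample page are assumptions, and a 4-byte buffer is used just to mirror the debugging setup from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class StreamCopy {
    // Copies everything from in to out and returns the number of bytes copied.
    // Only the bytes reported by each read() are written, so leftovers from a
    // previous iteration never reach the output.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buf = new byte[bufferSize]; // allocated once and reused
        long total = 0;
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            total += len;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // 15 bytes, deliberately not a multiple of the 4-byte buffer.
        byte[] page = "<html>hi</html>".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(page), sink, 4);
        System.out.println(copied + " bytes copied");
    }
}
```

Because the last write is limited to the bytes actually read, the array does not have to be reallocated on every iteration.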
My function returns a string that exceeds the maximum size, because the file I am using is large.
Is there a way to create a function that returns a string array, so that later on I can concatenate the pieces and recreate the file?
private String ConvertVideoToBase64()
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
FileInputStream fis;
try {
File inputFile = new File("/storage/emulated/0/Videos/out.mp4");
fis = new FileInputStream(inputFile);
byte[] buf = new byte[1024];
int n;
while (-1 != (n = fis.read(buf)))
baos.write(buf, 0, n);
byte[] videoBytes = baos.toByteArray();
fis.close();
return Base64.encodeToString(videoBytes, Base64.DEFAULT);
//imageString = videoString;
} catch (IOException e1) {
e1.printStackTrace();
return null; // without a return on this path the method does not compile
}
}
The entire movie probably doesn't fit in RAM at once, which is what you are trying to do with your baos object.
Try rewriting your code so that it encodes one chunk at a time and then writes the result to a file / sends it over the network / whatever. Pick a chunk size that is a multiple of 3 bytes (for example 3072 rather than 1024) and fill each chunk completely before encoding it, because Base64 turns every 3 input bytes into 4 characters; any other chunk size leaves padding in the middle of the output.
Edit: I think you need to use a streaming approach. This is common on platforms where you can't / don't want to hold all the data in memory at once.
The basic algorithm will be:
Open your file. This is an input stream.
Connect to your server. This is your output stream.
While the file has data:
    Read some number of bytes that is a multiple of 3, say 3072, from the file into a buffer.
    Encode these bytes into a Base64 string.
    Write the string to the server.
Close the server connection.
Close the file.
You have the input stream side. I'll presume you have some web service you are POSTing to. Have a look at http://developer.android.com/training/basics/network-ops/connecting.html to get started with the output stream.
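The streaming steps above can be sketched as follows. This version assumes java.util.Base64, which on Android needs API level 26 or later; on older Android versions android.util.Base64OutputStream plays a similar role. The class and method names are hypothetical, and the output stream stands in for whatever file or network connection you write to.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class StreamingBase64 {
    // Encodes everything from in as Base64 and writes the text to out,
    // holding only one small buffer in memory at a time.
    static void encode(InputStream in, OutputStream out) throws IOException {
        // wrap() returns a stream that encodes on the fly. It carries leftover
        // bytes between writes, so the buffer size does not have to be a
        // multiple of 3. Closing it writes the final padding and also closes out.
        try (OutputStream b64 = Base64.getEncoder().wrap(out)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                b64.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] video = new byte[10_000]; // stand-in for the real file contents
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        encode(new ByteArrayInputStream(video), sink);
        String text = new String(sink.toByteArray(), StandardCharsets.US_ASCII);
        System.out.println(text.length() + " Base64 characters written");
    }
}
```

In a real app the ByteArrayInputStream would be replaced by a FileInputStream and the sink by the connection's output stream, so the full video never sits in memory.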
I am creating a program that will extract a zip and then insert the files into a database. Every so often I get the error
java.lang.Exception: java.io.EOFException: Unexpected end of ZLIB input stream
I can not pinpoint the reason for this as the extraction code is pretty much the same as all the other code you can find on the web. My code is as follows:
public void extract(String zipName, InputStream content) throws Exception {
int BUFFER = 2048;
//create the zipinputstream
ZipInputStream zis = new ZipInputStream(content);
//Get the name of the zip
String containerName = zipName;
//container for the zip entry
ZipEntry entry;
// Process each entry
while ((entry = zis.getNextEntry()) != null) {
//get the entry file name
String currentEntry = entry.getName();
try {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
// establish buffer for writing file
byte data[] = new byte[BUFFER];
int currentByte;
// read and write until last byte is encountered
while ((currentByte = zis.read(data, 0, BUFFER)) != -1) {
baos.write(data, 0, currentByte);
}
baos.flush(); //flush the buffer
//this method inserts the file into the database
insertZipEntry(baos.toByteArray());
baos.close();
}
catch (Exception e) {
System.out.println("ERROR WITHIN ZIP " + containerName);
}
}
}
This is probably caused by this JVM bug (JVM-6519463)
I previously had about one or two errors per 1000 randomly created documents. I applied the proposed solution (catch EOFException and do nothing with it) and I have no more errors.
I would say you are occasionally being given truncated Zip files to process. Check upstream.
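One way to check the truncation theory is to reproduce it in memory. The sketch below is an illustration under assumptions: the class, the helper names and the 100,000-byte random payload are all made up, and the archive is cut in half to simulate an incomplete upload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class TruncatedZipDemo {
    // Builds a one-entry archive in memory (helper for the demonstration only).
    static byte[] zip(String name, byte[] content) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content);
            zos.closeEntry();
        } // closing the zip stream finishes the archive
        return baos.toByteArray();
    }

    // Reads the first entry completely; an archive cut off mid-entry fails here.
    static byte[] readFirstEntry(byte[] archive) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            if (zis.getNextEntry() == null) {
                throw new IOException("archive has no entries");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[2048];
            int n;
            while ((n = zis.read(buf, 0, buf.length)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] content = new byte[100_000];
        new Random(42).nextBytes(content); // random data barely compresses, so the archive stays large
        byte[] archive = zip("data.bin", content);
        byte[] truncated = Arrays.copyOf(archive, archive.length / 2);
        try {
            readFirstEntry(truncated);
        } catch (EOFException e) {
            System.out.println("truncated archive: " + e.getMessage());
        }
    }
}
```

If the complete archive extracts cleanly and the shortened copy fails in the read loop, the upstream transfer is the first place to look.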
I had the same exception and the problem was in the compressing method (not extracting). I did not close the entry with zos.closeEntry() after writing to the ZipOutputStream. Without that, compressing appeared to work, but I got an exception while extracting.
public static byte[] zip(String outputFilename, byte[] output) {
    try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
         ZipOutputStream zos = new ZipOutputStream(baos)) {
        zos.putNextEntry(new ZipEntry(outputFilename));
        zos.write(output, 0, output.length);
        zos.closeEntry(); //this line must be here
        zos.finish(); // write the rest of the archive before copying the bytes out
        return baos.toByteArray();
    } catch (IOException e) {
        // rethrow so the method compiles and the caller sees the failure
        throw new UncheckedIOException(e);
    }
}
Never attempt to read more bytes than the entry contains. Call ZipEntry.getSize() to get the uncompressed size of the entry (it returns -1 when the size is unknown, so check for that), then use this value to keep track of the number of bytes remaining in the entry while reading from it. See below:
try{
...
int bytesLeft = (int)entry.getSize();
while ( bytesLeft>0 && (currentByte=zis.read(data, 0, Math.min(BUFFER, bytesLeft))) != -1) {
...
}
...
}
I'm relatively new to Java and I'm attempting to write a simple Android app. I have a large text file with about 3500 lines in the assets folder of my application and I need to read it into a string. I found a good example about how to do this but I have a question about why the byte array is initialized to 1024. Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte? Here is the code:
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
try {
inputStream = assetManager.open("3500LineTextFile.txt");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String s = readTextFile(inputStream);
// Add more code here to populate array from string
}
private String readTextFile(InputStream inputStream) {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
// InputStream has no length field, so the total size is not known in advance
byte buf[] = new byte[1024];
int len;
try {
while ((len = inputStream.read(buf)) != -1) {
outputStream.write(buf, 0, len);
}
outputStream.close();
inputStream.close();
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return outputStream.toString();
}
EDIT: Based on your suggestions, I tried this approach. Is it any better? Thanks.
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
InputStreamReader iStreamReader = null;
try {
inputStream = assetManager.open("List.txt");
iStreamReader = new InputStreamReader(inputStream, "UTF-8");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String text = readTextFile(iStreamReader);
// more code here
}
private String readTextFile(InputStreamReader inputStreamReader) {
StringBuilder sb = new StringBuilder();
char buf[] = new char[2048];
int read;
try {
do {
read = inputStreamReader.read(buf, 0, buf.length);
if (read>0) {
sb.append(buf, 0, read);
}
} while (read>=0);
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return sb.toString();
}
This example is not good at all. It's full of bad practices (hiding exceptions, not closing streams in finally blocks, not specifying an explicit encoding, etc.). It uses a 1024-byte buffer because it has no way of knowing the length of the input stream.
Read the Java IO tutorial to learn how to read text from a file.
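For comparison, here is a minimal sketch that avoids the practices criticised above: it lets the exception propagate, closes the stream with try-with-resources, and names the encoding explicitly. UTF-8 is an assumption about the asset file, and the class and method names are made up.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class TextReading {
    // Decodes the whole stream as UTF-8 text and closes it when done.
    // The IOException is left to the caller instead of being swallowed.
    static String readText(InputStream in) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[2048]; // a fixed-size buffer is enough; the builder grows as needed
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);
        System.out.print(readText(new ByteArrayInputStream(file)));
    }
}
```

In the app, the ByteArrayInputStream would be replaced by the stream returned from assetManager.open(...).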
You are reading the file into a buffer of 1024 Bytes.
Then those 1024 bytes are written to outputStream.
This process repeats until the whole file is read into the outputStream.
As JB Nizet mentioned the example is full of bad practices.
Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte?
Yes, and yes ... and as other answers have said, you've picked an example with a number of errors in it.
However, there is a theoretical problem doing both; i.e. setting the buffer length to the file length and using a character buffer rather than a byte buffer. The problem is that the file size is measured in bytes, but the size of the buffer needs to be measured in characters. This is normally fine, but it is theoretically possible that you will need more characters than the file size in bytes; e.g. if the input file used a 6 bit character set and packed 4 characters into 3 bytes.
To read from a file I usually use a Scanner and a StringBuilder.
Scanner scan = new Scanner(new BufferedInputStream(new FileInputStream(filename)), "UTF-8");
StringBuilder sb = new StringBuilder();
while (scan.hasNextLine()) {
    sb.append(scan.nextLine());
    sb.append("\n");
}
scan.close();
return sb.toString();
Try to throw your exceptions instead of swallowing them. The caller must know there was a problem reading your file.
Edit: Also note that using a BufferedInputStream is important. Otherwise it will try to read byte by byte, which can be slow.
I have a Java class, where I'm reading data in via an InputStream
byte[] b = null;
try {
b = new byte[in.available()];
in.read(b);
} catch (IOException e) {
e.printStackTrace();
}
It works perfectly when I run my app from the IDE (Eclipse).
But when I export my project and it's packed in a JAR, the read command doesn't read all the data. How could I fix it?
This problem mostly occurs when the InputStream is a File (~10kb).
Thanks!
Usually I prefer using a fixed size buffer when reading from input stream. As evilone pointed out, using available() as buffer size might not be a good idea because, say, if you are reading a remote resource, then you might not know the available bytes in advance. You can read the javadoc of InputStream to get more insight.
Here is the code snippet I usually use for reading input stream:
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = 0;
while ((bytesRead = in.read(buffer)) >= 0){
for (int i = 0; i < bytesRead; i++){
//Do whatever you need with the bytes here
}
}
The version of read() I'm using here fills the given buffer as much as it can and returns the number of bytes actually read. This means the buffer may contain trailing garbage data, so it is very important to use bytes only up to bytesRead.
Note the condition (bytesRead = in.read(buffer)) >= 0. The InputStream documentation says that read(byte[]) blocks until at least one byte is available when the buffer is not empty, but not every implementation follows that, so you may need to handle a 0-byte read as a special case. For local files I have never experienced it; however, when reading remote resources I have actually seen read() return 0 bytes constantly, which turned the above code into an infinite loop. I solved the infinite loop by counting the number of times I read 0 bytes and throwing an exception once the counter exceeded a threshold. You may not encounter this problem, but just keep it in mind :)
I would avoid creating a new byte array for each read, for performance reasons.
read() will return -1 when the InputStream is depleted. There is also a version of read which takes an array, this allows you to do chunked reads. It returns the number of bytes actually read or -1 when at the end of the InputStream. Combine this with a dynamic buffer such as ByteArrayOutputStream to get the following:
InputStream in = ...
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
while ( -1 != ( read = in.read( input ) ) ) {
buffer.write( input, 0, read );
}
input = buffer.toByteArray();
This cuts down a lot on the number of methods you have to invoke and allows the ByteArrayOutputStream to grow its internal buffer faster.
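To see why a single read into an available()-sized array can come up short, the sketch below wraps the source in a made-up TrickleInputStream that hands out at most 3 bytes per call. That wrapper is purely hypothetical, a stand-in for a JAR resource or network stream that delivers data in pieces; readAll is the same loop as above packaged as a method.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadAllDemo {
    // Hypothetical wrapper that mimics a slow source: at most 3 bytes per read() call.
    static class TrickleInputStream extends FilterInputStream {
        TrickleInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return super.read(b, off, Math.min(len, 3));
        }

        @Override
        public int available() throws IOException {
            return Math.min(super.available(), 3);
        }
    }

    // Keeps reading until read() reports end of stream, collecting every chunk.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

        // The pattern from the question: one read into an available()-sized array.
        InputStream in = new TrickleInputStream(new ByteArrayInputStream(data));
        byte[] b = new byte[in.available()];
        int got = in.read(b);
        System.out.println("single read: " + got + " of " + data.length + " bytes");

        // The loop keeps going until the stream is exhausted.
        byte[] all = readAll(new TrickleInputStream(new ByteArrayInputStream(data)));
        System.out.println("read loop: " + all.length + " of " + data.length + " bytes");
    }
}
```

On Java 9 and later, InputStream.readAllBytes() does the same job as readAll without any helper code.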
File file = new File("/path/to/file");
try (InputStream is = new FileInputStream(file)) { // try-with-resources closes the stream
byte[] bytes = IOUtils.toByteArray(is);
System.out.println("Byte array size: " + bytes.length);
} catch (IOException e) {
e.printStackTrace();
}
Below is a snippet of code that downloads a file (*.png, *.jpeg, *.gif, ...) and writes it to a BufferedOutputStream that wraps the HttpServletResponse output stream.
BufferedInputStream inputStream = bo.getBufferedInputStream(imageFile);
try {
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int bytesRead = 0;
byte[] input = new byte[DefaultBufferSizeIndicator.getDefaultBufferSize()];
while (-1 != (bytesRead = inputStream.read(input))) {
buffer.write(input, 0, bytesRead);
}
input = buffer.toByteArray();
response.reset();
response.setBufferSize(DefaultBufferSizeIndicator.getDefaultBufferSize());
response.setContentType(mimeType);
// Here's the secret. Content-Length should equal the number of bytes read.
response.setHeader("Content-Length", String.valueOf(buffer.size()));
response.setHeader("Content-Disposition", "inline; filename=\"" + imageFile.getName() + "\"");
BufferedOutputStream outputStream = new BufferedOutputStream(response.getOutputStream(), DefaultBufferSizeIndicator.getDefaultBufferSize());
try {
outputStream.write(input, 0, buffer.size());
} finally {
ImageBO.close(outputStream);
}
} finally {
ImageBO.close(inputStream);
}
Hope this helps.