Large file to base64 string array

My function returns a string that exceeds the size limit, because the file I am encoding is large.
Is there a way to write a function that returns a string array instead, so that later on I can concatenate the pieces and recreate the file?
private String ConvertVideoToBase64()
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
FileInputStream fis;
try {
File inputFile = new File("/storage/emulated/0/Videos/out.mp4");
fis = new FileInputStream(inputFile);
byte[] buf = new byte[1024];
int n;
while (-1 != (n = fis.read(buf)))
baos.write(buf, 0, n);
byte[] videoBytes = baos.toByteArray();
fis.close();
return Base64.encodeToString(videoBytes, Base64.DEFAULT);
//imageString = videoString;
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
return null; // reached only when reading the file failed
}

The entire movie probably doesn't fit in RAM at once, which is what you are trying to do with your baos object.
Try rewriting your code in such a way as to encode each 1024-byte chunk, and then write to a file / send over the network / whatever.
Edit: I think you need to use a streaming approach. This is common on platforms where you can't / don't want to hold all the data in memory at once.
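The basic algorithm will be:
1. Open your file. This is your input stream.
2. Connect to your server. This is your output stream.
3. While the file has data:
   a. Read some number of bytes, say 1024, from the file into a buffer. If the encoded pieces will later be concatenated and decoded as one string, use a multiple of 3 instead (3072, for example) and fill the buffer completely, so that Base64 padding can only appear at the very end.
   b. Encode these bytes into a Base64 string.
   c. Write the string to the server.
4. Close the server connection.
5. Close the file.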
You have the input stream side. I'll presume you have some web service you are POSTing to. Have a look at http://developer.android.com/training/basics/network-ops/connecting.html to get started with the output stream.
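The streaming idea can be sketched as follows. This is only an illustration under some assumptions: it uses java.util.Base64 (Java 8+, Android API 26+) rather than the android.util.Base64 class from the question, and the names ChunkedBase64, fill and encodeChunks are made up for the example. The sink parameter stands in for "write the string to the server".

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;
import java.util.function.Consumer;

// Hypothetical helper, not part of any library.
final class ChunkedBase64 {

    // Reads until the buffer is full or the stream ends; returns the number of bytes stored.
    static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    // Encodes the stream chunk by chunk and hands each Base64 string to the sink.
    // chunkBytes must be a multiple of 3 so that only the last chunk can carry padding.
    static void encodeChunks(InputStream in, int chunkBytes, Consumer<String> sink)
            throws IOException {
        if (chunkBytes <= 0 || chunkBytes % 3 != 0) {
            throw new IllegalArgumentException("chunkBytes must be a positive multiple of 3");
        }
        byte[] buf = new byte[chunkBytes];
        Base64.Encoder encoder = Base64.getEncoder();
        int n;
        while ((n = fill(in, buf)) > 0) {
            // Copy only the bytes that were actually read before encoding them.
            sink.accept(encoder.encodeToString(Arrays.copyOf(buf, n)));
        }
    }
}
```

Because every chunk except the last holds a multiple of 3 bytes, concatenating the emitted strings gives the same text as encoding the whole file in one go, so the receiver can decode it as one Base64 stream.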

Related

From FileInputStream to BufferedInputStream conversion

We were given a few exercises in lab, and one of them is to convert the file transfer method from FileInputStream to BufferedInputStream. It's a client sending a GET request to a web server, which sends back the requested file.
I came up with a simple solution, and I just wanted to check if it's correct.
Original code:
try {
FileInputStream fis = new FileInputStream(req);
// req, String containing file name
byte[] data = new byte [fis.available()];
fis.read(data);
out.write(data); // OutputStream out = socket.getOutputStream();
} catch (FileNotFoundException e){
new PrintStream(out).println("404 Not Found");
}
My try:
try {
BufferedInputStream bis = new BufferedInputStream (new FileInputStream(req));
byte[] data = new byte[4];
while(bis.read(data) > -1) {
out.write(data);
data = new byte[4];
}
} catch (FileNotFoundException e){
new PrintStream(out).println("404 Not Found");
}
The file is a web page named index.html, which contains a simple html page.
I have to reallocate the array every time, because at the last execution of the while loop, if the file isn't a multiple of 4 in size, the data array will contain characters from the previous execution, which are shown in the browser.
I chose 4 as data size for debugging purposes.
Output is correct.
Is this a good solution or can I do better?

There's no need to re-create the byte array each time - just overwrite it. More importantly though, you have a conceptual mistake inside your loop. Each iteration just writes the array to the stream assuming it's all valid. If you examine BufferedInputStream#read's documentation you'll see it may not read enough data to fill the entire array, and will return the number of bytes it actually read. You should use this number to limit the amount of bytes you're writing:
int len;
while ((len = bis.read(data)) > -1) {
out.write(data, 0, len);
}
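To make the difference concrete, here is a small sketch that runs both loop variants against an in-memory stream. The class and method names (CopyDemo, copyWholeBuffer, copyExact) are invented for the illustration, and the buffer is reused instead of reallocated.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical demo class contrasting the two loop styles.
final class CopyDemo {

    // Flawed variant: always writes the whole array, even after a short read.
    static void copyWholeBuffer(InputStream in, OutputStream out, int bufSize)
            throws IOException {
        byte[] data = new byte[bufSize];
        while (in.read(data) > -1) {
            out.write(data);
        }
    }

    // Corrected variant: writes only the bytes that read() reported.
    static void copyExact(InputStream in, OutputStream out, int bufSize)
            throws IOException {
        byte[] data = new byte[bufSize];
        int len;
        while ((len = in.read(data)) > -1) {
            out.write(data, 0, len);
        }
    }
}
```

With the 6-byte input "abcdef" and a 4-byte buffer, copyWholeBuffer produces "abcdefcd" (the stale "cd" from the first read is written again), while copyExact produces "abcdef".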

I suggest you close the file once you are done. BufferedInputStream uses an 8 KB buffer by default, but you then read from it only 4 bytes at a time. A simpler solution is to copy 8 KB at a time and drop the extra buffering layer:
try (InputStream in = new FileInputStream(req)) {
byte[] data = new byte[8 << 10];
for (int len; (len = in.read(data)) > -1; )
out.write(data, 0, len);
} catch (IOException e) {
out.write("404 Not Found\n".getBytes());
}

Trim Padding From ByteArrayOutputStream

I'm working with Amazon S3 and would like to upload an InputStream (which requires counting the number of bytes I'm sending).
public static boolean uploadDataTo(String bucketName, String key, String fileName, InputStream stream) {
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[1];
try {
while (stream.read(buffer) != -1) { // copy from stream to buffer
out.write(buffer); // copy from buffer to byte array
}
} catch (Exception e) {
UtilityFunctionsObject.writeLogException(null, e);
}
byte[] result = out.toByteArray(); // we needed all that just for length
int bytes = result.length;
IO.close(out);
InputStream uploadStream = new ByteArrayInputStream(result);
....
}
I was told that copying one byte at a time is highly inefficient (obviously so for large files). I can't make the buffer bigger, because that adds padding to the ByteArrayOutputStream which I can't strip out. I can strip it out of result, but how can I do that safely? If I use an 8KB buffer, can I just strip the rightmost bytes where buffer[i] == 0? Or is there a better way to do this? Thanks!
Using Java 7 on Windows 7 x64.

You can do something like this:
int read = 0;
while ((read = stream.read(buffer)) != -1) {
out.write(buffer, 0, read);
}
stream.read() returns the number of bytes that have been written into buffer. You can pass this information to the len parameter of out.write(). So you make sure that you write only the bytes you have read from the stream.
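Put together, the buffering step from the question could look like the sketch below. BufferedUpload and readAll are hypothetical names; the 8 KB buffer size is taken from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for the "read everything, then upload" approach.
final class BufferedUpload {

    // Drains the stream into a byte array using an 8 KB buffer.
    static byte[] readAll(InputStream stream) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8 * 1024];
        int read;
        while ((read = stream.read(buffer)) != -1) {
            out.write(buffer, 0, read); // never more than was actually read
        }
        return out.toByteArray();
    }
}
```

The caller can then use result.length as the byte count and new ByteArrayInputStream(result) as the upload stream. Nothing has to be stripped afterwards, which matters because the data itself may legitimately end in zero bytes.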

Use IOUtils from Apache Commons IO (formerly Jakarta Commons) to copy from the input stream to the byte array stream in a single step, for example with IOUtils.copy(stream, out). It will use an efficient buffer, and not write any excess bytes.

If you want efficiency you could process the file as you read it. I would replace uploadStream with stream and remove the rest of the code.
If you need some buffering you can do this
InputStream uploadStream = new BufferedInputStream(stream);
the default buffer size is 8 KB.
If you want the length use File.length();
long length = new File(fileName).length();

GZIP decompress string and byte conversion

I have a problem in code:
private static String compress(String str)
{
String str1 = null;
ByteArrayOutputStream bos = null;
try
{
bos = new ByteArrayOutputStream();
BufferedOutputStream dest = null;
byte b[] = str.getBytes();
GZIPOutputStream gz = new GZIPOutputStream(bos,b.length);
gz.write(b,0,b.length);
bos.close();
gz.close();
}
catch(Exception e) {
System.out.println(e);
e.printStackTrace();
}
byte b1[] = bos.toByteArray();
return new String(b1);
}
private static String deCompress(String str)
{
String s1 = null;
try
{
byte b[] = str.getBytes();
InputStream bais = new ByteArrayInputStream(b);
GZIPInputStream gs = new GZIPInputStream(bais);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int numBytesRead = 0;
byte [] tempBytes = new byte[6000];
try
{
while ((numBytesRead = gs.read(tempBytes, 0, tempBytes.length)) != -1)
{
baos.write(tempBytes, 0, numBytesRead);
}
s1 = new String(baos.toByteArray());
s1= baos.toString();
}
catch(ZipException e)
{
e.printStackTrace();
}
}
catch(Exception e) {
e.printStackTrace();
}
return s1;
}
public String test() throws Exception
{
String str = "teststring";
String cmpr = compress(str);
String dcmpr = deCompress(cmpr);
return dcmpr;
}
This code throws java.io.IOException: unknown format (magic number ef1f) at this line:
GZIPInputStream gs = new GZIPInputStream(bais);
It turns out that the bytes get corrupted by the round trip through new String(b1) and byte b[] = str.getBytes(): the array that comes out has more bytes than the one that went in. If I avoid the conversion to a string and work with the bytes directly, everything works. Sorry for my English.
public String unZip(String zipped) throws DataFormatException, IOException {
byte[] bytes = zipped.getBytes("WINDOWS-1251");
Inflater decompressed = new Inflater();
decompressed.setInput(bytes);
byte[] result = new byte[100];
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int n;
while ((n = decompressed.inflate(result)) != 0)
buffer.write(result, 0, n); // write only the bytes inflate() produced
decompressed.end();
return new String(buffer.toByteArray(), charset);
}
I use this function to decompress the server response. Thanks for the help.

You have two problems:
You're using the default character encoding to convert the original string into bytes. That will vary by platform. It's better to specify an encoding - UTF-8 is usually a good idea.
You're trying to represent the opaque binary data of the result of the compression as a string by just calling the String(byte[]) constructor. That constructor is only meant for data which is encoded text... which this isn't. You should use base64 for this. There's a public domain base64 library which makes this easy. (Alternatively, don't convert the compressed data to text at all - just return a byte array.)
Fundamentally, you need to understand how different text and binary data are - when you want to convert between the two, you should do so carefully. If you want to represent "non text" binary data (i.e. bytes which aren't the direct result of encoding text) in a string you should use something like base64 or hex. When you want to encode a string as binary data (e.g. to write some text to disk) you should carefully consider which encoding to use. If another program is going to read your data, you need to work out what encoding it expects - if you have full control over it yourself, I'd usually go for UTF-8.
Additionally, the exception handling in your code is poor:
You should almost never catch Exception; catch more specific exceptions
You shouldn't just catch an exception and continue as if it had never happened. If you can't really handle the exception and still complete your method successfully, you should let the exception bubble up the stack (or possibly catch it and wrap it in a more appropriate exception type for your abstraction)
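If the compressed data really has to travel as a String, the Base64 route could look like this sketch. It assumes Java 8+ and uses java.util.Base64 in place of the public domain library mentioned above; GzipText and its method names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip plus Base64, with UTF-8 as the explicit text encoding.
final class GzipText {

    static String compressToBase64(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer before the bytes are collected.
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return Base64.getEncoder().encodeToString(bos.toByteArray());
    }

    static String decompressFromBase64(String encoded) throws IOException {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Both methods declare IOException instead of catching it, so a failure reaches the caller rather than being silently turned into a null result.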

When you GZIP compress data, you always get binary data. This data cannot be converted into a string, because it is not valid character data (in any encoding).
So your compress method should return a byte array and your decompress method should take a byte array as its parameter.
Furthermore, I recommend you use an explicit encoding when you convert the string into a byte array before compression and when you turn the decompressed data into a string again.
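A byte-array based version might look like the sketch below, which also includes a small check showing why the detour through String fails. GzipBytes and its method names are hypothetical, and UTF-8 is an assumed choice of encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper keeping compressed data as bytes from end to end.
final class GzipBytes {

    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // True only if the bytes come back unchanged after being decoded and re-encoded as text.
    static boolean survivesStringRoundTrip(byte[] data) {
        byte[] back = new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, back);
    }
}
```

For GZIP output, survivesStringRoundTrip returns false: the second header byte, 0x8b, is not valid UTF-8 on its own, so decoding replaces it with U+FFFD, which re-encodes as three different bytes. That fits the "magic number ef1f" in the exception from the question.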

When you GZIP compress data, you always get binary data. This data
cannot be converted into string as it is no valid character data (in
any encoding).
Codo is right, thanks a lot for enlightening me. I was trying to decompress a string (converted from the binary data). My fix was to use InflaterInputStream directly on the input stream returned by my HTTP connection. (My app was retrieving a large JSON of strings.)

why initialize this byte array to 1024

I'm relatively new to Java and I'm attempting to write a simple Android app. I have a large text file with about 3500 lines in the assets folder of my application and I need to read it into a string. I found a good example of how to do this, but I have a question about why the byte array is initialized to 1024. Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte? Here is the code:
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
try {
inputStream = assetManager.open("3500LineTextFile.txt");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String s = readTextFile(inputStream);
// Add more code here to populate array from string
}
private String readTextFile(InputStream inputStream) {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
byte buf[] = new byte[1024];
int len;
try {
while ((len = inputStream.read(buf)) != -1) {
outputStream.write(buf, 0, len);
}
outputStream.close();
inputStream.close();
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return outputStream.toString();
}
EDIT: Based on your suggestions, I tried this approach. Is it any better? Thanks.
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
InputStreamReader iStreamReader = null; // must match the parameter type of readTextFile
try {
inputStream = assetManager.open("List.txt");
iStreamReader = new InputStreamReader(inputStream, "UTF-8");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String s = readTextFile(iStreamReader);
// more code here
}
private String readTextFile(InputStreamReader inputStreamReader) {
StringBuilder sb = new StringBuilder();
char buf[] = new char[2048];
int read;
try {
do {
read = inputStreamReader.read(buf, 0, buf.length);
if (read>0) {
sb.append(buf, 0, read);
}
} while (read>=0);
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return sb.toString();
}

This example is not good at all. It's full of bad practices (hiding exceptions, not closing streams in finally blocks, not specifying an explicit encoding, etc.). It uses a 1024-byte buffer because it has no way of knowing the length of the input stream.
Read the Java IO tutorial to learn how to read text from a file.
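As a rough sketch of what avoiding those bad practices can look like: the version below names the encoding, closes the reader with try-with-resources, and lets IOException reach the caller. UTF-8 is an assumption about the file, and TextReading and readText are made-up names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reading a whole text stream into a String.
final class TextReading {

    static String readText(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream) even on failure.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[2048]; // fixed size; the total length is not needed up front
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

On Android the stream would come from assetManager.open(...), as in the question.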

You are reading the file into a buffer of 1024 bytes.
Then the bytes that were read (at most 1024 per pass) are written to outputStream.
This process repeats until the whole file has been copied into outputStream.
As JB Nizet mentioned, the example is full of bad practices.

Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte?
Yes, and yes ... and as other answers have said, you've picked an example with a number of errors in it.
However, there is a theoretical problem doing both; i.e. setting the buffer length to the file length and using a character buffer rather than a byte buffer. The problem is that the file size is measured in bytes, but the size of the buffer needs to be measured in characters. This is normally fine, but it is theoretically possible that you will need more characters than the file size in bytes; e.g. if the input file used a 6 bit character set and packed 4 characters into 3 bytes.

To read from a file I usually use a Scanner and a StringBuilder.
Scanner scan = new Scanner(new BufferedInputStream(new FileInputStream(filename)), "UTF-8");
StringBuilder sb = new StringBuilder();
while (scan.hasNextLine()) {
sb.append(scan.nextLine());
sb.append("\n");
}
scan.close();
return sb.toString();
Try to throw your exceptions instead of swallowing them. The caller must know there was a problem reading your file.
Edit: Also note that using a BufferedInputStream is important. Otherwise it will try to read byte by byte, which can be slow.

Java InputStream reading problem

I have a Java class, where I'm reading data in via an InputStream
byte[] b = null;
try {
b = new byte[in.available()];
in.read(b);
} catch (IOException e) {
e.printStackTrace();
}
It works perfectly when I run my app from the IDE (Eclipse).
But when I export my project and it's packed in a JAR, the read command doesn't read all the data. How could I fix it?
This problem mostly occurs when the InputStream is a File (~10kb).
Thanks!

Usually I prefer using a fixed size buffer when reading from input stream. As evilone pointed out, using available() as buffer size might not be a good idea because, say, if you are reading a remote resource, then you might not know the available bytes in advance. You can read the javadoc of InputStream to get more insight.
Here is the code snippet I usually use for reading input stream:
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = 0;
while ((bytesRead = in.read(buffer)) >= 0){
for (int i = 0; i < bytesRead; i++){
//Do whatever you need with the bytes here
}
}
The version of read() I'm using here fills the given buffer as far as it can and returns the number of bytes actually read. This means your buffer may contain trailing garbage, so it is very important to use only the bytes up to bytesRead.
Note the condition (bytesRead = in.read(buffer)) >= 0. The InputStream contract says read(byte[]) blocks until at least one byte is available unless the buffer has length 0, but not every implementation honors that, so depending on your situation you may need to treat a 0-byte read as a special case. For local files I have never run into this; when reading remote resources, however, I have seen read() return 0 constantly, which turned the loop above into an infinite loop. I solved that by counting the 0-byte reads and throwing an exception once the counter exceeded a threshold. You may not encounter this problem, but keep it in mind :)
For performance reasons, I would avoid creating a new byte array for each read.
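The zero-byte guard described above could be sketched like this. The names GuardedRead, consume and maxEmptyReads are invented, and counting consecutive empty reads (resetting the counter after a successful read) is one possible interpretation, not necessarily what the original code did.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: a read loop that gives up after too many empty reads.
final class GuardedRead {

    // Reads the stream to the end and returns the number of bytes seen.
    // Throws if read() returns 0 more than maxEmptyReads times in a row.
    static long consume(InputStream in, int bufferSize, int maxEmptyReads)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int emptyReads = 0;
        int bytesRead;
        while ((bytesRead = in.read(buffer)) >= 0) {
            if (bytesRead == 0) {
                if (++emptyReads > maxEmptyReads) {
                    throw new IOException("stream returned 0 bytes too many times in a row");
                }
                continue;
            }
            emptyReads = 0;
            // Only buffer[0 .. bytesRead) holds valid data at this point.
            total += bytesRead;
        }
        return total;
    }
}
```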

read() will return -1 when the InputStream is depleted. There is also a version of read which takes an array, this allows you to do chunked reads. It returns the number of bytes actually read or -1 when at the end of the InputStream. Combine this with a dynamic buffer such as ByteArrayOutputStream to get the following:
InputStream in = ...
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int read;
byte[] input = new byte[4096];
while ( -1 != ( read = in.read( input ) ) ) {
buffer.write( input, 0, read );
}
input = buffer.toByteArray();
This cuts down a lot on the number of methods you have to invoke and allows the ByteArrayOutputStream to grow its internal buffer faster.
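To see why sizing the array with available() can lose data, the sketch below wraps an in-memory stream so that it hands out only a few bytes per call, roughly as a compressed JAR entry or a socket might. AvailableDemo and its methods are hypothetical names, and the trickle stream is a stand-in rather than the real stream from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo comparing available()-sized reads with a read loop.
final class AvailableDemo {

    // A stream that returns at most `limit` bytes per read and reports the same from available().
    static InputStream trickle(byte[] data, int limit) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, limit));
            }

            @Override
            public int available() throws IOException {
                return Math.min(super.available(), limit);
            }
        };
    }

    // The approach from the question: one read into an available()-sized array.
    static byte[] readWithAvailable(InputStream in) throws IOException {
        byte[] b = new byte[in.available()];
        in.read(b);
        return b;
    }

    // The loop from the answer: keep reading until -1.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] input = new byte[4096];
        int read;
        while (-1 != (read = in.read(input))) {
            buffer.write(input, 0, read);
        }
        return buffer.toByteArray();
    }
}
```

With 10 bytes of data and a limit of 3, readWithAvailable returns a 3-byte array while readFully returns all 10 bytes.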

File file = new File("/path/to/file");
try (InputStream is = new FileInputStream(file)) {
// IOUtils is presumably org.apache.commons.io.IOUtils (Apache Commons IO)
byte[] bytes = IOUtils.toByteArray(is);
System.out.println("Byte array size: " + bytes.length);
} catch (IOException e) {
e.printStackTrace();
}

Below is a snippet of code that downloads a file (*.png, *.jpeg, *.gif, ...) and writes it to a BufferedOutputStream that wraps the output stream of the HttpServletResponse.
BufferedInputStream inputStream = bo.getBufferedInputStream(imageFile);
try {
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int bytesRead = 0;
byte[] input = new byte[DefaultBufferSizeIndicator.getDefaultBufferSize()];
while (-1 != (bytesRead = inputStream.read(input))) {
buffer.write(input, 0, bytesRead);
}
input = buffer.toByteArray();
response.reset();
response.setBufferSize(DefaultBufferSizeIndicator.getDefaultBufferSize());
response.setContentType(mimeType);
// Here's the secret. Content-Length should equal the number of bytes read.
response.setHeader("Content-Length", String.valueOf(buffer.size()));
response.setHeader("Content-Disposition", "inline; filename=\"" + imageFile.getName() + "\"");
BufferedOutputStream outputStream = new BufferedOutputStream(response.getOutputStream(), DefaultBufferSizeIndicator.getDefaultBufferSize());
try {
outputStream.write(input, 0, buffer.size());
} finally {
ImageBO.close(outputStream);
}
} finally {
ImageBO.close(inputStream);
}
Hope this helps.
