Remove Base64 prefix from InputStream

I have a Base64 encoded Image String residing in a File Server. The encoded String has a prefix (ex: "data:image/png;base64,") for support in popular modern browsers (it's obtained via JavaScript's Canvas.toDataURL() method). The client sends a request for the image to my server which verifies them and returns a stream of the Base64 encoded String.
If the client is a web client, the image can be displayed as is within an <img> tag by setting the src to the Base64 encoded String. However, if the client is an Android client, the String needs to be decoded into a Bitmap without the prefix. That part can be done fairly easily.
The Problem:
In order to simplify my code and not reinvent the wheel, I'm using an Image Library for the Android client to handle loading, displaying, and caching the images (Facebook's Fresco Library to be exact). However, no library seems to support Base64 decoding (I want my cake and to eat it too). A solution I came up with is to decode the Base64 String on the server as it is being streamed to the client.
The Attempt:
S3Object obj = s3Client.getObject(new GetObjectRequest(bucketName, keyName));
Base64.Decoder decoder = Base64.getDecoder();
//decodes the stream as it is being read
InputStream stream = decoder.wrap(obj.getObjectContent());
try{
return new StreamingOutput(){
@Override
public void write(OutputStream output) throws IOException, WebApplicationException{
int nextByte = 0;
while((nextByte = stream.read()) != -1){
output.write(nextByte);
}
output.flush();
output.close();
stream.close();
}
};
}catch(Exception e){
e.printStackTrace();
}
Unfortunately, the Fresco library still has a problem displaying the image (with no stack traces!). As there doesn't seem to be an issue on my server when decoding the stream (no stack traces either), it leads me to believe that it must be an issue with the prefix. Which leaves me with a dilemma.
The Question: How do I remove the Base64 prefix from a Stream being sent to the client without storing and editing the entire Stream on the server? Is this possible?

Fresco does support decoding data URIs, just as the web client does.
The demo app has an example of this.

How do I remove the Base64 prefix from a Stream being sent to the client without storing and editing the entire Stream on the server?
Removing the prefix while sending the stream to the client turns out to be a pretty complex task. If you don't mind storing the whole String on the server you could simply do:
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
String line;
try {
br = new BufferedReader(new InputStreamReader(stream));
while ((line = br.readLine()) != null) {
sb.append(line);
}
String result = sb.toString();
//the comma is the character which separates the prefix from the Base64 String
int i = result.indexOf(",");
result = result.substring(i + 1);
//Now, that we have just the Base64 encoded String, we can decode it
Base64.Decoder decoder = Base64.getDecoder();
byte[] decoded = decoder.decode(result);
//Now, just write each byte from the byte array to the output stream
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
But being more efficient and not storing the entire Stream on the server makes for a much more complicated task. We could use the Base64.Decoder.wrap() method, but the problem is that the wrapped stream throws an IOException if it reaches a value that cannot be decoded (wouldn't it be nice if they provided a method that just left the bytes as is if they can't be decoded?). And unfortunately, the Base64 prefix can't be decoded because it's not Base64 encoded. So, it would throw an IOException.
To get around this problem, we would have to use an InputStreamReader to read the InputStream with the specified appropriate Charset. Then we would have to cast the ints received from the InputStream's read() method call to chars. When we reach the appropriate amount of chars, we would have to compare it with the Base64 prefix's intro ("data"). If it's a match, we know the Stream contains the prefix, so continue reading until we reach the prefix end character (the comma: ","). Finally, we can begin streaming out the bytes after the prefix. Example:
S3Object obj = s3Client.getObject(new GetObjectRequest(bucketName, keyName));
Base64.Decoder decoder = Base64.getDecoder();
InputStream stream = obj.getObjectContent();
InputStreamReader reader = new InputStreamReader(stream);
try{
return new StreamingOutput(){
@Override
public void write(OutputStream output) throws IOException, WebApplicationException{
//for checking if string has base64 prefix
char[] pre = new char[4]; //"data" is exactly four bytes in UTF-8
boolean containsPre = false;
int count = 0;
int nextByte = 0;
while((nextByte = stream.read()) != -1){
if(count < pre.length){
pre[count] = (char) nextByte;
count++;
}else if(count == pre.length){
//determine whether has prefix or not and act accordingly
count++;
//new String(pre) yields "data"; Arrays.toString(pre) would yield "[d, a, t, a]" and never match
containsPre = new String(pre).toLowerCase().equals("data");
if(!containsPre){
//doesn't have Base64 prefix so write all the bytes until this point
for(int i = 0; i < pre.length; i++){
output.write((int) pre[i]);
}
output.write(nextByte);
}
}else if(containsPre && count < 25){
//the comma character (,) is considered the end of the Base64 prefix
//so look for the comma, but be realistic, if we don't find it at about 25 characters
//we can assume the String is not encoded correctly
containsPre = nextByte != ',';
count++;
}else{
output.write(nextByte);
}
}
output.flush();
output.close();
stream.close();
}
};
}catch(Exception e){
e.printStackTrace();
return null;
}
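If the goal is to strip the prefix and decode in a single pass, one option is to peek at the first bytes with a PushbackInputStream, push back everything after the comma, and hand the rest to Base64.Decoder.wrap(). The following is only a minimal sketch, not the code from this answer: the class name DataUriStreams, its method names, and the 64-byte limit on the prefix length are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: the names and the 64-byte limit are illustrative assumptions.
final class DataUriStreams {
    // Assumed upper bound on the length of "data:<mime>;base64,".
    private static final int MAX_PREFIX = 64;

    /** Returns a stream positioned just after a leading data-URI prefix, if one is present. */
    static InputStream skipDataUriPrefix(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, MAX_PREFIX);
        byte[] head = new byte[MAX_PREFIX];
        int n = 0;
        // Read up to MAX_PREFIX bytes; a single read() may return fewer than requested.
        while (n < head.length) {
            int r = pb.read(head, n, head.length - n);
            if (r == -1) {
                break;
            }
            n += r;
        }
        String start = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        int payloadStart = 0;
        if (start.regionMatches(true, 0, "data:", 0, 5)) {
            int comma = start.indexOf(',');
            if (comma >= 0) {
                payloadStart = comma + 1; // the payload begins right after the comma
            }
        }
        // Push back only the bytes that belong to the payload.
        pb.unread(head, payloadStart, n - payloadStart);
        return pb;
    }

    /** Wraps the stream so that reads yield decoded bytes. */
    static InputStream decoding(InputStream in) throws IOException {
        return Base64.getDecoder().wrap(skipDataUriPrefix(in));
    }

    /** Convenience for small inputs: decodes everything into a byte array. */
    static byte[] decodeAll(byte[] stored) {
        try (InputStream in = decoding(new ByteArrayInputStream(stored))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Inside the StreamingOutput from the question, the copy loop would then read from DataUriStreams.decoding(obj.getObjectContent()). If the stored text contains line breaks, Base64.getMimeDecoder() is probably the safer choice, because the basic decoder rejects them.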
This seems like a hefty task to do on the server, so I think decoding on the client side is a better choice. Unfortunately, most Android client-side libraries don't support Base64 decoding (especially with the prefix). However, as @tyronen pointed out, Fresco does support it if the String has already been obtained. That, though, removes one of the key reasons to use an image loading library.
Android Client Side Decoding
Decoding in the client-side application is pretty easy. First obtain the String from the InputStream:
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
String line;
try {
br = new BufferedReader(new InputStreamReader(stream));
while ((line = br.readLine()) != null) {
sb.append(line);
}
return sb.toString();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Then decode the String using Android's Base64 class:
int i = result.indexOf(",");
result = result.substring(i + 1);
byte[] decodedString = Base64.decode(result, Base64.DEFAULT);
Bitmap bitMap = BitmapFactory.decodeByteArray(decodedString, 0, decodedString.length);
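For reference, the prefix handling can be isolated in a small helper that is easy to unit test off-device. This is a sketch under assumptions: it uses java.util.Base64 (on Android that class is only available from API level 26) rather than android.util.Base64, and the class and method names are hypothetical.

```java
import java.util.Base64;

// Hypothetical helper; not part of Fresco, Picasso, or the Android SDK.
final class DataUriDecoder {
    /** Decodes a Base64 string, ignoring a leading "data:...;base64," prefix if present. */
    static byte[] decode(String text) {
        String payload = text;
        if (text.startsWith("data:")) {
            int comma = text.indexOf(',');
            if (comma >= 0) {
                payload = text.substring(comma + 1);
            }
        }
        // The MIME decoder skips line breaks, which helps if the stored text was wrapped.
        return Base64.getMimeDecoder().decode(payload);
    }
}
```

The resulting array can be passed to BitmapFactory.decodeByteArray() in the same way as decodedString above.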
The Fresco library seems hard to update due to them using a lot of delegation. So, I moved on to using the Picasso image loading library and created my own fork of it with the Base64 decoding ability.

Related

Compare PHP hash_file with Java output

I have the output of UTF-8 hash_file that I need to calculate and check on my Java client. Based on the hash_file manual I'm extracting the contents of the file and creating the MD5 hash hex in Java, but I can't make them match. I tried suggestions on [this question] without success.
Here's how I do it on Java:
public static String calculateStringHash(String text, String encoding)
throws NoSuchAlgorithmException, UnsupportedEncodingException{
MessageDigest md = MessageDigest.getInstance("MD5");
return getHex(md.digest(text.getBytes(encoding)));
}
My results match the ones from this page.
For example:
String jake: 1200cf8ad328a60559cf5e7c5f46ee6d
From my Java code: 1200CF8AD328A60559CF5E7C5F46EE6D
But when trying on files it doesn't work. Here's the code for the file function:
public static String calculateHash(File file) throws NoSuchAlgorithmException,
FileNotFoundException, IOException {
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
try {
String sCurrentLine;
br = new BufferedReader(new FileReader(file));
while ((sCurrentLine = br.readLine()) != null) {
sb.append(sCurrentLine);
}
} catch (IOException ex) {
LOG.log(Level.SEVERE, null, ex);
} finally {
try {
if (br != null) {
br.close();
}
} catch (IOException ex) {
LOG.log(Level.SEVERE, null, ex);
}
}
return calculateStringHash(sb.toString(),"UTF-8");
}
I verified that on the PHP side hash_file is used and UTF-8 is the encoding. Any ideas?

Your reading method removes all the end of lines from the file. readLine() returns a line, without its line terminator. Print the contents of the StringBuilder, and you'll understand the problem.
Moreover, a hashing algorithm is a binary operation. It operates on bytes, and returns bytes. Why are you transforming the bytes in the file into a String, only to transform the String back into an array of bytes in order to hash it? Just read the file as a byte array, using an InputStream, instead of reading it as a String. Then hash this byte array. This will also avoid using the wrong file encoding (your code uses the platform default encoding, which might not be the encoding used to create the file).
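A minimal sketch of that byte-level approach follows. The class name FileHasher and the method name md5Hex are assumptions for illustration; the digest is fed straight from the stream, so line terminators and the file's encoding are left untouched.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper illustrating byte-level hashing; the names are assumptions.
final class FileHasher {
    /** Streams the input through MD5 and returns the digest as lowercase hex. */
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
        byte[] buf = new byte[8192];
        int len;
        while ((len = in.read(buf)) != -1) {
            md.update(buf, 0, len); // hash the raw bytes, line terminators included
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

For a file, this could be called as FileHasher.md5Hex(new FileInputStream(file)), ideally inside a try-with-resources block. The result is lowercase, so compare it with the PHP value using equalsIgnoreCase() if the other side is uppercase.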

I guess you are missing out on the new line characters from the file since you call br.readLine().
It is better to read the file into byte array, and pass that onto md.digest(...).

Download file line by line java

I know this question might sound really basic for most of you. I need to download a large file from a server. The first line of this file contains a time tag. I want to download the entire file only if my time tag does not match the one in the file. For this I'm using the code below. However, I'm not sure if this actually prevents the entire file from being downloaded needlessly.
Please help me out !
public String downloadString(String url,String myTime)
{
try {
URL url1 = new URL(url);
URLConnection tc = url1.openConnection();
tc.setConnectTimeout(timeout);
tc.setReadTimeout(timeout);
BufferedReader br = new BufferedReader(new InputStreamReader(tc.getInputStream()));
StringBuilder sb = new StringBuilder();
String line;
while ((line = br.readLine()) != null) {
if(line.contains(myTime))
{
Log.d("TIME CHECK", "Article already updated");
break;
}
sb.append(line+"\n");
}
br.close();
return sb.toString();
}
catch(Exception e)
{
Log.d("Error","In JSON downloading");
}
return null;
}

No, there is no easy way to control exactly to the last byte what will be downloaded. Even at the Java level you are involving a BufferedReader, which will obviously download more than you ask for, buffering it. There are other buffers as well, including at the OS level, which you cannot control. The proper technique to download only new files with HTTP is to use the If-Modified-Since header.
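A conditional request along those lines might look like the sketch below. It assumes the server honors If-Modified-Since and that the client remembers when it last downloaded the file; the class name ConditionalDownload and the method name fetchIfModified are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;

// Hypothetical helper; the class and method names are assumptions.
final class ConditionalDownload {
    /**
     * Sets If-Modified-Since and returns the body, or null when the server
     * answers 304 Not Modified. The connection must not be connected yet.
     */
    static byte[] fetchIfModified(HttpURLConnection conn, long lastSeenMillis) throws IOException {
        conn.setIfModifiedSince(lastSeenMillis); // must be called before connecting
        if (conn.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return null; // nothing new on the server
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int len;
            while ((len = in.read(buf)) != -1) {
                out.write(buf, 0, len);
            }
            return out.toByteArray();
        }
    }
}
```

The connection would typically come from (HttpURLConnection) new URL(url).openConnection(), with the timeouts set as in the question before the call.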

Your code won't download the whole file but as the BufferedReader has a default buffer size of 8192 you will read at least that many characters.

You can go byte-by-byte or chunk-by-chunk if size is the concern:
BufferedInputStream in = new BufferedInputStream(new URL(url).openStream());
byte data[] = new byte[1024];
int count;
while((count = in.read(data,0,1024)) != -1)
{
out.write(data, 0, count);
}
Check this question please
How to download and save a file from Internet using Java?

issues in reading google text document with google apis

I am trying to use the following code to read a Google text document. But the value returned is a stream with garbage characters instead of the real contents. How can I fix this?
for (DocumentListEntry entry : resultFeed.getEntries()) {
String docId = entry.getDocId();
String docType = entry.getType();
URL exportUrl = new URL("https://docs.google.com/feeds/download/"
+ docType
+ "s/Export?docID="
+ docId
+ "&exportFormat=doc");
MediaContent mc = new MediaContent();
mc.setUri(exportUrl.toString());
MediaSource ms = client.getMedia(mc);
InputStream inStream = null;
try {
inStream = ms.getInputStream();
int c;
while ((c = inStream.read()) != -1) {
System.out.print((char)c);
}
} finally {
if (inStream != null) {
inStream.close();
}
}
}

From a quick read of the documentation, it looks like you are reading the raw bytes of a Microsoft Word-encoded document.
Try changing the &exportFormat=doc to html or txt and see if the output makes more sense.

I suspect that the files you are trying to print out have some other encoding, but you're printing them byte by byte as if they were ASCII. I would try to read the whole stream as a byte array and then convert it to a string using the appropriate encoding (e.g. UTF-8).
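The difference between the two ways of turning bytes into text can be seen in a small sketch. The class and method names are made up for illustration, and this only helps when the export really is text (for example txt or html); a binary export will look like garbage either way.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions.
final class DecodingDemo {
    /** Mimics the loop in the question: each byte is cast to a char on its own. */
    static String castEachByte(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append((char) (b & 0xff)); // same value InputStream.read() would return
        }
        return sb.toString();
    }

    /** Decodes the whole array with an explicit charset. */
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }
}
```

For the UTF-8 bytes of "caf\u00e9", castEachByte() yields six characters because the two bytes of the accented letter are treated separately, while decodeUtf8() returns the original four.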

why initialize this byte array to 1024

I'm relatively new to Java and I'm attempting to write a simple android app. I have a large text file with about 3500 lines in the assets folder of my applications and I need to read it into a string. I found a good example about how to do this but I have a question about why the byte array is initialized to 1024. Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte? Here is the code:
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
try {
inputStream = assetManager.open("3500LineTextFile.txt");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String s = readTextFile(inputStream);
// Add more code here to populate array from string
}
private String readTextFile(InputStream inputStream) {
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
byte buf[] = new byte[1024];
int len;
try {
while ((len = inputStream.read(buf)) != -1) {
outputStream.write(buf, 0, len);
}
outputStream.close();
inputStream.close();
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return outputStream.toString();
}
EDIT: Based on your suggestions, I tried this approach. Is it any better? Thanks.
private void populateArray(){
AssetManager assetManager = getAssets();
InputStream inputStream = null;
InputStreamReader iStreamReader = null;
try {
inputStream = assetManager.open("List.txt");
iStreamReader = new InputStreamReader(inputStream, "UTF-8");
} catch (IOException e) {
Log.e("IOException populateArray", e.getMessage());
}
String s = readTextFile(iStreamReader);
// more code here
}
private String readTextFile(InputStreamReader inputStreamReader) {
StringBuilder sb = new StringBuilder();
char buf[] = new char[2048];
int read;
try {
do {
read = inputStreamReader.read(buf, 0, buf.length);
if (read>0) {
sb.append(buf, 0, read);
}
} while (read>=0);
} catch (IOException e) {
Log.e("IOException readTextFile", e.getMessage());
}
return sb.toString();
}

This example is not good at all. It's full of bad practices (hiding exceptions, not closing streams in finally blocks, not specifying an explicit encoding, etc.). It uses a 1024-byte buffer because it doesn't have any way of knowing the length of the input stream.
Read the Java IO tutorial to learn how to read text from a file.

You are reading the file into a buffer of 1024 Bytes.
Then those 1024 bytes are written to outputStream.
This process repeats until the whole file is read into the outputStream.
As JB Nizet mentioned the example is full of bad practices.
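A version of the same loop that avoids those practices could look like this sketch. The class name TextReader and the method name readAll are hypothetical; the charset is passed in explicitly, the stream is closed by try-with-resources, and any IOException is left for the caller to handle.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper; the names are assumptions.
final class TextReader {
    /** Reads the whole stream as text, closes it when done, and lets IOException propagate. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024]; // the size only affects chunking, not the result
            int len;
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
            return sb.toString();
        }
    }
}
```

In the Activity from the question it might be called as TextReader.readAll(getAssets().open("List.txt"), StandardCharsets.UTF_8), assuming the asset is UTF-8 encoded.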

Wouldn't I want to initialize it to the length of my text file? Also, wouldn't I want to use char, not byte?
Yes, and yes ... and as other answers have said, you've picked an example with a number of errors in it.
However, there is a theoretical problem doing both; i.e. setting the buffer length to the file length and using a character buffer rather than a byte buffer. The problem is that the file size is measured in bytes, but the size of the buffer needs to be measured in characters. This is normally fine, but it is theoretically possible that you will need more characters than the file size in bytes; e.g. if the input file used a 6 bit character set and packed 4 characters into 3 bytes.
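The gap between the two units is easy to see with a short sketch (class and method names are made up for illustration). With UTF-8 the mismatch runs the harmless way: the text never needs more chars than the file has bytes, so a char buffer sized from the file length is at worst oversized.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions.
final class ByteVsCharCount {
    /** Number of bytes the text occupies when stored as UTF-8. */
    static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of Java chars (UTF-16 code units) the text occupies in memory. */
    static int charLength(String s) {
        return s.length();
    }
}
```

For "caf\u00e9" the byte length is 5 and the char length is 4, because the accented letter takes two bytes in UTF-8 but a single char.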

To read from a file I usually use a Scanner and a StringBuilder.
Scanner scan = new Scanner(new BufferedInputStream(new FileInputStream(filename)), "UTF-8");
StringBuilder sb = new StringBuilder();
while (scan.hasNextLine()) {
sb.append(scan.nextLine());
sb.append("\n");
}
scan.close();
return sb.toString();
Try to throw your exceptions instead of swallowing them. The caller must know there was a problem reading your file.
Edit: Also note that using a BufferedInputStream is important. Otherwise it will try to read byte by byte, which can be slow.

How can I read a .txt file into a single Java string while maintaining line breaks?

Virtually every code example out there reads a TXT file line-by-line and stores it in a String array. I do not want line-by-line processing because I think it's an unnecessary waste of resources for my requirements: All I want to do is quickly and efficiently dump the .txt contents into a single String. The method below does the job, however with one drawback:
private static String readFileAsString(String filePath) throws java.io.IOException{
byte[] buffer = new byte[(int) new File(filePath).length()];
BufferedInputStream f = null;
try {
f = new BufferedInputStream(new FileInputStream(filePath));
f.read(buffer);
if (f != null) try { f.close(); } catch (IOException ignored) { }
} catch (IOException ignored) { System.out.println("File not found or invalid path.");}
return new String(buffer);
}
... the drawback is that the line breaks are converted into long spaces e.g. " ".
I want the line breaks to be converted from \n or \r to <br> (HTML tag) instead.
Thank you in advance.

What about using a Scanner and adding the linefeeds yourself:
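// pass a File: Scanner(String) would scan the text "sample.txt" itself, not the file
java.util.Scanner sc = new java.util.Scanner(new java.io.File("sample.txt"));
while (sc.hasNextLine()) {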
buf.append (sc.nextLine ());
buf.append ("<br />");
}
I don't see where you get your long spaces from.

You can read directly into the buffer and then create a String from the buffer:
File f = new File(filePath);
FileInputStream fin = new FileInputStream(f);
byte[] buffer = new byte[(int) f.length()];
new DataInputStream(fin).readFully(buffer);
fin.close();
String s = new String(buffer, "UTF-8");

You could add this code:
return new String(buffer).replaceAll("(\r\n|\r|\n|\n\r)", "<br>");
Is this what you are looking for?

The code will read the file contents as they appear in the file - including line breaks.
If you want to change the breaks into something else like displaying in html etc, you will either need to post process it or do it by reading the file line by line. Since you do not want the latter, you can replace your return by following which should do the conversion -
return (new String(buffer)).replaceAll("\r\n?|\n", "<br>");
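Put together, reading the file in one go and converting its line breaks could be sketched as follows. The class and method names are hypothetical, and the file is assumed to be UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the names and the UTF-8 assumption are illustrative.
final class HtmlLineBreaks {
    /** Replaces every line terminator (\r\n, \r or \n) with a <br> tag. */
    static String toHtmlBreaks(String text) {
        // \r\n is listed first so that a Windows line ending becomes one <br>, not two.
        return text.replaceAll("\\r\\n|\\r|\\n", "<br>");
    }

    /** Reads a UTF-8 file into one string and converts its line breaks. */
    static String readWithHtmlBreaks(Path file) throws IOException {
        return toHtmlBreaks(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

Files.readAllBytes() sizes its buffer internally, so there is no need to cast File.length() to int or to check how many bytes a single read() call returned.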

StringBuilder sb = new StringBuilder();
try {
InputStream is = getAssets().open("myfile.txt");
byte[] bytes = new byte[1024];
int numRead = 0;
try {
while((numRead = is.read(bytes)) != -1)
sb.append(new String(bytes, 0, numRead));
}
catch(IOException e) {
}
is.close();
}
catch(IOException e) {
}
your resulting String: String result = sb.toString();
then replace whatever you want in this result.

I agree with the general approach by @Sanket Patel, but using Commons I/O you would likely want FileUtils.
So your code would look like:
String myString = FileUtils.readFileToString(new File(filePath));
There is also another version to specify an alternate character encoding.

You should try org.apache.commons.io.IOUtils.toString(InputStream is) to get the file content as a String. You can pass it the InputStream object returned by
getAssets().open("xml2json.txt") (an Android method that returns an InputStream)
in your Activity. To get the String, use this:
String xml = IOUtils.toString(getAssets().open("xml2json.txt"));
So, in general:
String xml = IOUtils.toString(yourInputStream);
