Ruby base 64 decoding for Java base 64 encoding - java

I have a string that is encoded in Java using
data = new String(Base64.getEncoder().encode(encVal), StandardCharsets.UTF_8);
I receive this encoded data in an API response and want to Base64-decode it in Ruby. I am using
Base64.strict_decode64(data)
for this, but it is not working. Can anyone help me with this?

Your Java code is correct:
byte[] encVal = "Hello World".getBytes();
String data = new String(Base64.getEncoder().encode(encVal), StandardCharsets.UTF_8);
System.out.println(data); // SGVsbG8gV29ybGQ=
The SGVsbG8gV29ybGQ= decodes correctly using multiple tools, e.g. https://www.base64decode.org/.
You are most likely seeing garbage characters after decoding because of an error in how the byte[] was created. You may need to specify the correct character encoding when creating the byte[].
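A minimal sketch of that point, using only java.util.Base64 from the JDK. The class and method names are made up for illustration. It names the charset explicitly on both the encoding and the decoding side, so the result does not depend on the platform default:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, not part of the question's code.
class Base64RoundTrip {
    // Turn text into bytes with an explicit charset, then Base64-encode them.
    static String encode(String text) {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(raw);
    }

    // Base64-decode, then interpret the bytes with the same charset.
    static String decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String data = encode("Hello World");
        System.out.println(data);         // SGVsbG8gV29ybGQ=
        System.out.println(decode(data)); // Hello World
    }
}
```

If the receiving side decodes the Base64 correctly but then reads the bytes with a different charset than the one used in getBytes, the text will still come out garbled.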

Related

Decoding String (from header) encoded by Base64 and RFC2047 in Java

I'm working on a function to decode a string (from a header) that is encoded in both Base64 and RFC2047 in Java.
Given this header:
SGVhZGVyOiBoZWFkZXJ2YWx1ZQ0KQmFkOiBOYW1lOiBiYWRuYW1ldmFsdWUNClVuaWNvZGU6ID0/VVRGLTg/Qj81YmV4NXF5eTU2dUw2SUNNNTZ1TDVMcTY3N3lNNWJleDVxeXk2WUdVNklDTTZZR1U/PSA9P1VURi04P0I/NUxxNjc3eU01YmV4NW9tQTVMaU41cXl5Nzd5TTVZdS81cGE5NXBhODVMcTY0NENDPz0NCg0K
My expected output is:
Header: headervalue Bad: Name: badnamevalue Unicode:
己欲立而立人,己欲達而達人,己所不欲,勿施於人。
The only relevant function that I have found and tried was Base64.decodeBase64(headers), which produces this when printed out:
Header: headervalue Bad: Name: badnamevalue Unicode:
=?UTF-8?B?5bex5qyy56uL6ICM56uL5Lq677yM5bex5qyy6YGU6ICM6YGU?= =?UTF-8?B?5Lq677yM5bex5omA5LiN5qyy77yM5Yu/5pa95pa85Lq644CC?=
To solve this, I tried MimeUtility.decode(), converting the byte array returned from Base64.decodeBase64(headers) to an InputStream, but the result was identical to the above.
InputStream headerStream = new ByteArrayInputStream(Base64.decodeBase64(headers));
InputStream result = MimeUtility.decode(headerStream, "quoted-printable");
I have been searching the internet but have yet to find a solution. Does anyone know a way to decode MIME headers from the resulting byte array?
Any help is appreciated! It's also my first stack overflow post, apologies if I'm missing anything but please let me know if there's more information that I can provide!
The base64 you have there actually is what you pasted. Including the bizarre =?UTF-8?B? weirdness.
The stuff that follows is again base64.
There's base64-encoded data inside your base-64 encoded data. As Xzibit would say: I put some Base64 in your base64 so you can base64 while you base64. Why do I feel old all of a sudden?
In other words, the base64 input you get is a crazy, extremely inefficient format invented by a crazy person.
My advice is that you tell them to come up with something less insane.
Failing that:
Search the resulting string for the regex pattern and then again apply base64 decode to the stuff in the middle.
Also, you're using some third party base64 decoder, probably apache. Apache libraries tend to suck. Base64 is baked into java, there is no reason to use worse libraries here. I've fixed that; the Base64 in this snippet is java.util.Base64. Its API is slightly different.
String sourceB64 = "SGV..."; // that input base64 you have.
byte[] sourceBytes = Base64.getDecoder().decode(sourceB64);
String source = new String(sourceBytes, StandardCharsets.UTF_8);
Pattern p = Pattern.compile("=\\?UTF-8\\?B\\?(.*?)\\?=");
Matcher m = p.matcher(source);
StringBuilder out = new StringBuilder();
int curPos = 0;
while (m.find()) {
    out.append(source.substring(curPos, m.start()));
    curPos = m.end();
    String content = new String(Base64.getDecoder().decode(m.group(1)), StandardCharsets.UTF_8);
    out.append(content);
}
out.append(source.substring(curPos));
System.out.println(out.toString());
If I run that, I get:
Header: headervalue
Bad: Name: badnamevalue
Unicode: 己欲立而立人,己欲達而達 人,己所不欲,勿施於人。
Which looks exactly like what you want.
Explanation of that code:
It first base64-decodes the input, and turns that into a string. (Your idea of using InputStream is a red herring. That doesn't help at all here. You just want to turn bytes into a string, you do it as per line 3 of that snippet. Pass the byte array and the encoding those bytes are in, that's all you need to do).
It then goes on the hunt for =?UTF-8?B?--base64here--?= inside your base64. The base64-in-the-base64.
It then decodes that base64, turns it into a string in the same fashion, and replaces it.
It just adds everything besides those =?UTF-8?B?...?= segments verbatim.
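The output above still has a space between the two decoded segments, because the whitespace separating the two =?UTF-8?B?...?= words is copied through verbatim. RFC 2047 says that whitespace between adjacent encoded words should be ignored. A hedged variation of the same approach that drops it first is sketched below. The class name, method name and sample input are made up for illustration, and it still handles only UTF-8 with the B encoding:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, a sketch rather than a full RFC 2047 decoder.
class EncodedWordSketch {
    private static final Pattern ENCODED_WORD = Pattern.compile("=\\?UTF-8\\?B\\?(.*?)\\?=");
    // Whitespace sitting between the end of one encoded word and the start of the next.
    private static final Pattern GAP = Pattern.compile("(\\?=)\\s+(=\\?)");

    static String decode(String source) {
        // Drop the whitespace that only separates adjacent encoded words.
        String joined = GAP.matcher(source).replaceAll("$1$2");
        Matcher m = ENCODED_WORD.matcher(joined);
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (m.find()) {
            out.append(joined, pos, m.start()); // plain text before the encoded word
            byte[] bytes = Base64.getDecoder().decode(m.group(1));
            out.append(new String(bytes, StandardCharsets.UTF_8));
            pos = m.end();
        }
        out.append(joined.substring(pos)); // plain text after the last encoded word
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "X: =?UTF-8?B?SGVsbG8=?= =?UTF-8?B?V29ybGQ=?= end";
        System.out.println(decode(sample)); // X: HelloWorld end
    }
}
```

Other charsets and the Q encoding are deliberately left out. A header that uses them would need a real MIME library.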

How to get the same MD5 string in Java as in C#

I have code in C# which produces MD5 encoded byte[] from String and then this byte[] is converted to String. The C# code is
byte[] valueBytes = (new UnicodeEncoding()).GetBytes(value);
byte[] newHash = (new MD5CryptoServiceProvider()).ComputeHash(valueBytes);
I need to get the same result in Java. I'm trying to do this
Charset utf16 = Charset.forName("UTF-16");
return new String(DigestUtils.md5(value.getBytes(utf16)), utf16);
The code is using Apache Commons Codec library for MD5 calculations. I'm using UTF16 charset because I've read in other SO questions that C#'s UnicodeEncoding uses it by default.
So the code snippets look like they do the same thing, but when I'm passing the string byndyusoft2014, C# gives me hV7u6mQYRgBXXF9jOWWYJg== and Java gives me ﹡둛뭶魙ꇥ늺ꢑ. I've tried UTF16LE and UTF16BE as charsets with no luck.
Does anyone have an idea what I'm doing wrong?
I think this happens because the Java code converts the string to a byte[] using UTF-8, while the C# code does not, so the two sides hash different byte arrays and get different results. You can convert the string to a byte[] with UTF-8 in C# and compare the results, as in the following code:
UTF8Encoding utf8 = new UTF8Encoding();
byte[] bytes=utf8.GetBytes("byndyusoft2014");
byte[] en=(new MD5CryptoServiceProvider()).ComputeHash(bytes);
Console.WriteLine(Convert.ToBase64String(en));
and the java code:
byte[] en = DigestUtils.md5Digest("byndyusoft2014".getBytes());
byte[] base64 = Base64Utils.encode(en);
System.out.println(new String(base64));
Of course, the C# result in your description looks Base64-encoded, so the Java code should also encode the byte array with Base64.
Both produce the same result: swPvmbGDI1GbPKQwL9knjQ==
DigestUtils and Base64Utils are the MD5 and Base64 implementations in the Spring library.
As it turned out, the main difference was not shown in my original code snippet: it was the conversion from the MD5 byte[] to a String. You need to use Base64 to get the final result. This is the working code snippet in Java
Charset utf16 = Charset.forName("UTF-16LE");
return new String(Base64.encodeBase64(DigestUtils.md5(value.getBytes(utf16))));
With this code I get the same result as with C#. Thank you all for good hints!
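For reference, the same idea can be sketched with only JDK classes (java.security.MessageDigest and java.util.Base64) instead of Apache Commons Codec. The class and method names here are made up for illustration. The sketch assumes, as the thread does, that C#'s UnicodeEncoding corresponds to UTF-16LE without a byte order mark:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class, JDK-only variant of the snippet above.
class Md5Base64Sketch {
    static String md5Base64(String value) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // UTF-16LE is assumed to match C#'s UnicodeEncoding.GetBytes.
            byte[] hash = md5.digest(value.getBytes(StandardCharsets.UTF_16LE));
            // The raw hash is binary, so it is Base64-encoded rather than decoded as text.
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        // The question reports hV7u6mQYRgBXXF9jOWWYJg== from C# for this input.
        System.out.println(md5Base64("byndyusoft2014"));
    }
}
```

The original attempt failed at the last step: an MD5 hash is arbitrary binary data, so passing it to new String(..., utf16) yields meaningless characters whatever charset is chosen.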

Converting decode utf-8 string to file

I am trying to save an image that I receive from an Android device. I get a UTF-8 encoded string from Android, and below is the code I am using to save it.
String test = java.net.URLDecoder.decode(image_base64, "UTF-8");
byte[] data = Base64.decodeBase64(test.getBytes());
FileOutputStream stream = null;
try {
stream = new FileOutputStream("/var/lib/easy-tomcat7/webapps/test/test1.bmp");
stream.write(data);
stream.flush();
test1 += "success";
}
catch (IOException e)
{
test1 = "failuare";
e.getMessage();
}
finally
{
test1 += "finally";
stream.close();
}
The file is created, but it is corrupted. I have done a lot of research on this but cannot work out why it is happening. Please help me solve this issue.
I assume you are using Base64 from Apache Commons Codec.
Note that you are dealing with multiple different kinds of encodings:
URL encoding
Base64 encoding
UTF-8 character encoding
Those are three totally different things, and you should understand all of them to understand what is happening exactly.
Check how exactly the image is encoded that you get from the Android device. Your code is assuming that you are getting it as URL-encoded Base64 data, using the UTF-8 character set. Is that indeed how the Android device is sending the data? You will have to check that with whoever wrote the Android application.
What does the string image_base64 contain? Is it valid, URL-encoded Base64 data?
You shouldn't call getBytes() on the string before you pass it to Base64.decodeBase64 - that will convert the string into a byte array using the default character encoding of the system you're running it on. Just do this instead:
byte[] data = Base64.decodeBase64(test);
To make matters worse, there are several variants of Base64 encoding (as you can see on the Wikipedia page about Base64). It may be the case that whatever variant the Android app used is different from what the Base64 class is using.
Use the encoding also for getBytes()
Base64.decodeBase64(test.getBytes("utf-8"));
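Putting the advice above together, here is a rough sketch that uses java.util.Base64 from the JDK instead of Apache Commons Codec. The class and method names are made up for illustration. It assumes the device really does send URL-encoded, standard-alphabet Base64, which is exactly the assumption that needs checking with the Android side:

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper class, a sketch under the assumptions stated above.
class ImageDecodeSketch {
    // Undo the URL encoding first, then the Base64 encoding.
    static byte[] decodeUrlEncodedBase64(String urlEncoded) {
        try {
            String base64 = URLDecoder.decode(urlEncoded, "UTF-8");
            // Throws IllegalArgumentException if the text is not valid Base64.
            return Base64.getDecoder().decode(base64);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    // Write the decoded bytes to the target file in one call.
    static void save(String urlEncoded, Path target) throws IOException {
        Files.write(target, decodeUrlEncodedBase64(urlEncoded));
    }
}
```

One thing to watch for: URLDecoder turns every '+' into a space, so if the incoming string was not actually URL-encoded, any '+' characters in the Base64 data are lost. With this decoder that surfaces as an IllegalArgumentException instead of a silently corrupted file.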

UTF8 conversion for text obtained from internet

ElasticSearch is a search server that accepts data only in UTF-8.
When I try to give ElasticSearch the following text
Small businesses potentially in line for a lighter reporting load include those with an annual turnover of less than £440,000, net assets of less than £220,000 and fewer than ten employees
through my Java application (which takes this text from a web page and passes it to ElasticSearch), ES complains that it cannot understand £ and fails. I then filter the text through the code below:
byte bytes[] = s.getBytes("ISO-8859-1");
s = new String(bytes, "UTF-8");
Here £ is converted to �
But when I copy it to a file in my home directory using bash, it goes in fine. Any pointers will help.
You have ISO-8859-1 octets in bytes, which you then tell String to decode as if they were UTF-8. When it does that, it finds that the lone 0xA3 byte is not a valid UTF-8 sequence and replaces it with the substitution character.
To do this, you have to construct the string with the encoding it uses, then convert it to the encoding that you want. See How do I convert between ISO-8859-1 and UTF-8 in Java?.
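A small sketch of what goes wrong and what the conversion should look like. The class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the pound sign problem.
class PoundSignSketch {
    // The question's filter: encode as ISO-8859-1, then decode those bytes as UTF-8.
    static String wrongRoundTrip(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Correct decoding: bytes that are ISO-8859-1 are decoded as ISO-8859-1.
    static String decodeLatin1(byte[] latin1Bytes) {
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    // Correct encoding for a consumer that expects UTF-8.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String pound = "\u00A3";
        System.out.println(wrongRoundTrip(pound).equals("\uFFFD")); // true
        System.out.println(toUtf8(pound).length);                   // 2
    }
}
```

In ISO-8859-1 the pound sign is the single byte 0xA3, while in UTF-8 it is the two bytes 0xC2 0xA3. That is why the lone 0xA3 cannot be decoded as UTF-8.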
UTF-8 is easier than one thinks. In String everything is unicode characters.
Bytes/string conversion is done as follows.
(Note that Cp1252, or Windows-1252, is the Windows Latin-1 extension of ISO-8859-1; better use that one.)
BufferedReader in = new BufferedReader(
new InputStreamReader(new FileInputStream(file), "Cp1252"));
PrintWriter out = new PrintWriter(
new OutputStreamWriter(new FileOutputStream(file), "UTF-8"));
response.setContentType("text/html; charset=UTF-8");
response.setCharacterEncoding("UTF-8");
String s = "20 \u00A3"; // Escaping
To see why Cp1252 is more suitable than ISO-8859-1:
http://en.wikipedia.org/wiki/Windows-1252
String s is a series of characters that are basically independent of any character encoding (ok, not exactly independent, but close enough for our needs now). Whatever encoding your data was in when you loaded it into a String has already been decoded. The decoding was done either using the system default encoding (which is practically ALWAYS AN ERROR, do not ever use the system default encoding, trust me I have over 10 years of experience in dealing with bugs related to wrong default encodings) or the encoding you explicitly specified when you loaded the data.
When you call getBytes("ISO-8859-1") for a String, you request that the String is encoded into bytes according to ISO-8859-1 encoding.
When you create a String from a byte array, you need to specify the encoding in which the characters in the byte array are represented. You create a string from a byte array that has been encoded in UTF-8 (and just above you encoded it in ISO-8859-1, that is your error).
What you want to do is:
byte bytes[] = s.getBytes("UTF-8");
s = new String(bytes, "UTF-8");

Java's new Base64(-1) in PHP?

I'm trying to match Java Base64 code in PHP, but I'm getting inconsistent results.
Java base64 encode
encMessage = URLEncoder.encode(new Base64(-1).encodeToString(encrypted),"UTF8");
Java decode
message = URLDecoder.decode(message,"utf8");
The Java encode code above returns the string, which I have to decode and decrypt in PHP.
PHP base64 decode
$message = utf8_decode(urldecode($encrypted));
$message = base64_decode($message);
PHP encode
$encMessage = base64_encode($encrypted);
$encMessage = utf8_encode(urlencode($encMessage));
Results:
java:
KO%2F%2B%2Bzbp5z8oCdvZn62jb72kseT%2Bem8hYUZY0IuB9zo%3D
php:
KO%2F%2B%2Bzbp5z8oCdvZn62jb3CVVVXsV%2Bws2kDOmKK%2BPEc%3D
src : https://gist.github.com/944269
I had this problem between CSharp and Java, and found that URL encoding things was the culprit. What I did in my work around was basically re-encrypt the data with a newly generated public key until I got one that didn't need URL encoding. Not a great solution, but it works, it averages 2 tries to get it right, but I've seen it taking up to 15 tries to do it, either way we're still talking milliseconds, and it works reliably.
YMMV
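A different way to sidestep URL encoding, not what the answer above did, is the URL-safe Base64 alphabet, which replaces '+' and '/' with '-' and '_'. Below is a sketch on the Java side using java.util.Base64. The class and method names are made up for illustration, and the PHP side would have to translate the alphabet accordingly, which is not shown:

```java
import java.util.Base64;

// Hypothetical helper class showing the URL-safe Base64 variant.
class UrlSafeBase64Sketch {
    // Output contains only letters, digits, '-' and '_', so it needs no URL encoding.
    static String encode(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    static byte[] decode(String text) {
        return Base64.getUrlDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xfb, (byte) 0xef};
        System.out.println(encode(data)); // --8
    }
}
```

The standard alphabet would encode the same two bytes as ++8=, which is exactly the kind of string that has to be URL-encoded before it can travel in a query string.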
