How to decode a Base64 string in Scala or Java?

I have a string encoded in Base64:
eJx9xEERACAIBMBKJyKDcTzR_hEsgOxjAcBQFVVNvi3qEsrRnWXwbhHOmzWnctPHPVkPu-4vBQ==
How can I decode it in Scala?
I tried to use:
val bytes1 = new sun.misc.BASE64Decoder().decodeBuffer(compressed_code_string)
But when I compare the resulting byte array with the correct one generated in Python, they differ. Here is the command I used in Python:
import base64
base64.urlsafe_b64decode(compressed_code_string)
The Byte Array in Scala is:
(120, -100, 125, -60, 65, 17, 0, 32, 8, 4, -64, 74, 39, 34, -125, 113, 60, -47, -2, 17, 44, -128, -20, 99, 1, -64, 80, 21, 85, 77, -66, 45, -22, 18, -54, -47, -99, 101, -16, 110, 17, -50, -101, 53, -89, 114, -45, -57, 61, 89, 15, -69, -2, 47, 5)
And the one generated in python is:
(120, -100, 125, -60, 65, 17, 0, 32, 8, 4, -64, 74, 39, 34, -125, 113, 60, -47, -2, 17, 44, -128, -20, 99, 1, -64, 80, 21, 85, 77, -66, 45, -22, 18, -54, -47, -99, 101, -16, 110, 17, -50, -101, 53, -89, 114, -45, -57, 61, 89, 15, -69, -18, 47, 5)
Note that there is a single difference near the end of the arrays (-2 in Scala versus -18 in Python).

In Scala, encoding a String to Base64 and decoding it back to the original String using the Java 8 java.util.Base64 API:
import java.util.Base64
import java.nio.charset.StandardCharsets
scala> val bytes = "foo".getBytes(StandardCharsets.UTF_8)
bytes: Array[Byte] = Array(102, 111, 111)
scala> val encoded = Base64.getEncoder().encodeToString(bytes)
encoded: String = Zm9v
scala> val decoded = Base64.getDecoder().decode(encoded)
decoded: Array[Byte] = Array(102, 111, 111)
scala> val str = new String(decoded, StandardCharsets.UTF_8)
str: String = foo
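
The string in the question uses the URL-safe Base64 alphabet (note the - and _ characters), so on Java 8+ the matching decoder is Base64.getUrlDecoder(). A minimal sketch using the question's input:

import java.util.Arrays;
import java.util.Base64;

public class UrlSafeDecode {
    public static void main(String[] args) {
        String s = "eJx9xEERACAIBMBKJyKDcTzR_hEsgOxjAcBQFVVNvi3qEsrRnWXwbhHOmzWnctPHPVkPu-4vBQ==";
        // getUrlDecoder() understands '-' and '_', like Python's urlsafe_b64decode
        byte[] bytes = Base64.getUrlDecoder().decode(s);
        System.out.println(Arrays.toString(bytes));
    }
}

This produces the same 55 bytes as the Python result, with -18 rather than -2 near the end.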

There is unfortunately not just one Base64 encoding. The - character doesn't have the same representation in all variants. For example, in the MIME encoding it isn't used at all. In the URL-safe encoding it has the value 62, and this is the variant Python is using here. The default sun.misc decoder wants + for 62. If you change the - to +, you get the correct answer (i.e. the Python answer).
In Scala, you can convert the string s to MIME format like so:
s.map{ case '-' => '+'; case '_' => '/'; case c => c }
and then the Java MIME decoder will work.
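
With java.util.Base64 the same idea looks like this in Java; once mapped, the string is valid input for the MIME (or basic) decoder:

import java.util.Base64;

public class MapThenDecode {
    public static void main(String[] args) {
        String s = "eJx9xEERACAIBMBKJyKDcTzR_hEsgOxjAcBQFVVNvi3qEsrRnWXwbhHOmzWnctPHPVkPu-4vBQ==";
        // Map the URL-safe alphabet back to the standard one
        String standard = s.replace('-', '+').replace('_', '/');
        byte[] bytes = Base64.getMimeDecoder().decode(standard);
        System.out.println(bytes.length + " bytes"); // 55 bytes, as in the question
    }
}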

Both Python and Java are correct in terms of the decoding; they are just following different RFCs. The Python function uses the URL-safe alphabet from RFC 3548, while the Java library used here follows RFC 4648 and RFC 2045.
Replacing the hyphen (-) with a plus (+) in the input string makes both decoded byte arrays identical.

Related

Base64 decoding differences

I've been trying to figure out why Base64 decoding is different between Dart and Java.
Dart example code
import 'dart:convert';

void main() {
  var str = '640gPKMxZZbeLDIUeXiZmg==';
  var dec = base64.decode(str);
  print(dec);
}
prints: [235, 141, 32, 60, 163, 49, 101, 150, 222, 44, 50, 20, 121, 120, 153, 154]
Java example code
import java.util.Arrays;
import java.util.Base64;
String str = "640gPKMxZZbeLDIUeXiZmg==";
byte[] dec = Base64.getDecoder().decode(str);
System.out.println(Arrays.toString(dec));
prints: [-21, -115, 32, 60, -93, 49, 101, -106, -34, 44, 50, 20, 121, 120, -103, -102]
Any ideas? As far as I'm aware they both implement RFC 4648.
For the Dart code I did try using base64Url and the normalize function, which didn't change anything (to be expected, I suppose). Not sure what else to try.
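
For what it's worth, the two dumps appear to describe identical bytes: Dart's base64.decode returns a Uint8List, whose elements are unsigned (0 to 255), while Java's byte is signed (-128 to 127). A quick sketch masking Java's bytes with 0xFF shows the equivalence:

import java.util.Base64;

public class SignedVsUnsigned {
    public static void main(String[] args) {
        byte[] dec = Base64.getDecoder().decode("640gPKMxZZbeLDIUeXiZmg==");
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < dec.length; i++) {
            // -21 & 0xFF == 235, -115 & 0xFF == 141, and so on
            sb.append(dec[i] & 0xFF).append(i + 1 < dec.length ? ", " : "]");
        }
        System.out.println(sb); // matches the Dart output exactly
    }
}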

Unpredictable behaviour of JTextField removing ASCII char

I have been working on an application that encrypts data using the 3DES algorithm. When I tried to build a UI for it I ran into some really strange behaviour: after I encrypt data and represent it as ASCII again I get:
String encryptedASCII = ";nÆ«»Ë?&]º²ÿ";
and following array of bytes for it:
[-62, -118, 59, 110, -61, -122, -62, -85, -62, -69, -61, -117, 4, 7, -62, -105, 63, 38, 93, -62, -70, -62, -78, -61, -65]
But when I use:
textField.setText(encryptedASCII)
and once again get it from there to decrypt:
textField.getText()
I got:
;nÆ«»Ë?&]º²ÿ
and bytes for it:
[-62, -118, 59, 110, -61, -122, -62, -85, -62, -69, -61, -117, -62, -105, 63, 38, 93, -62, -70, -62, -78, -61, -65]
So the bytes [4, 7] are missing from the array, compared with what I had before setting the text field with the ASCII representation.
Is there anything I'm missing here? I can't decrypt the data once it has been altered by the round trip through the text field.
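
One likely explanation, offered as a sketch rather than a definitive diagnosis: raw cipher bytes are binary data, not text, and control characters such as 0x04 and 0x07 have no printable form, so they are easily lost on the way through a text component. The usual remedy is to pass text-safe Base64 through the UI instead of the raw bytes:

import java.util.Base64;
import javax.swing.JTextField;

public class CipherTextRoundTrip {
    public static void main(String[] args) {
        // Stand-in for the 3DES output from the question, including the 0x04 and 0x07 bytes
        byte[] cipherBytes = {4, 7, -62, -105, 59, 110};
        JTextField textField = new JTextField();
        // Base64 is plain printable ASCII, so the text field preserves it exactly
        textField.setText(Base64.getEncoder().encodeToString(cipherBytes));
        // Later, recover the exact bytes for decryption
        byte[] restored = Base64.getDecoder().decode(textField.getText());
        System.out.println(restored.length == cipherBytes.length); // true
    }
}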

String(byte[], Charset) returns different results in Java 7 and Java 8

import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
public class Java87String {
    public static void main(String[] args) throws UnsupportedEncodingException {
        //byte[] b = {-101, 53, -51, -26, 24, 60, 20, -31, -6, 45, 50, 103, -66, 28, 114, -39, 92, 23, -47, 32, -5, -122, -28, 79, 22, -76, 116, -122, -54, -122};
        //byte[] b = {-76, -55, 85, -50, 80, -23, 27, 62, -94, -74, 47, -123, -119, 94, 90, 61, -63, 73, 56, -48, -54, -4, 11, 79};
        byte[] b = { -5, -122, -28 };
        System.out.println("Input Array :" + Arrays.toString(b));
        System.out.println("Array Length : " + b.length);
        String target = new String(b, StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(target.getBytes("UTF-8")));
        System.out.println("Final Key :" + target);
    }
}
The above code returns the following output in Java 7
Input Array :[-5, -122, -28]
Array Length : 3
[-17, -65, -67]
Final Key :�
The same code returns the following output in Java 8:
Input Array :[-5, -122, -28]
Array Length : 3
[-17, -65, -67, -17, -65, -67, -17, -65, -67]
Final Key :���
It sounds like Java 8 is doing the right thing, replacing each malformed byte with [-17, -65, -67], the UTF-8 encoding of the replacement character U+FFFD.
Why is there a difference in output, and are there any known bugs in JDK 1.7 relating to this?
Per the String JavaDoc:
The behavior of this constructor when the given bytes are not valid in the given charset is unspecified. The CharsetDecoder class should be used when more control over the decoding process is required.
I think (-5, -122, -28) is an invalid UTF-8 byte sequence (0xFB is not a valid lead byte under RFC 3629), so the JVM may output anything in this case. If the sequence were valid, the different Java versions would produce the same output.
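
Following the JavaDoc's pointer, a CharsetDecoder gives deterministic behaviour regardless of the Java version; here is a sketch configured to report malformed input instead of silently replacing it:

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Decode {
    public static void main(String[] args) {
        byte[] b = { -5, -122, -28 };
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            String s = decoder.decode(ByteBuffer.wrap(b)).toString();
            System.out.println("Decoded: " + s);
        } catch (CharacterCodingException e) {
            // This input is malformed UTF-8, so decoding fails on every Java version
            System.out.println("Not valid UTF-8: " + e);
        }
    }
}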
Does this specific byte sequence have a meaning? just curious

Retrieving bytes of String returns different results in ObjC than Java

I've got a string that I'm trying to convert to bytes in order to create an MD5 hash in both ObjC and Java. For some reason, the bytes are different between the two languages.
Java
System.out.println(Arrays.toString(
("78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6").getBytes()));
Output:
[55, 56, 98, 52, 97, 48, 50, 102, 97, 49, 51, 57, 97, 50, 57, 52, 52, 102, 49, 55, 98, 52, 101, 100, 99, 50, 50, 102, 98, 49, 55, 53, 58, 56, 57, 48, 55, 102, 51, 99, 52, 56, 54, 49, 49, 52, 48, 97, 100, 56, 52, 101, 50, 48, 99, 56, 101, 57, 56, 55, 101, 101, 97, 101, 54]
ObjC
NSString *str = @"78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6";
NSData *bytes = [str dataUsingEncoding:NSISOLatin1StringEncoding allowLossyConversion:NO];
NSLog(@"%@", [bytes description]);
Output:
<37386234 61303266 61313339 61323934 34663137 62346564 63323266 62313735 3a383930 37663363 34383631 31343061 64383465 32306338 65393837 65656165 36>
I've tried using different charsets with no luck and can't think of any other reasons why the bytes would be different. Any ideas? I did notice that all of the byte values are different by some factor of 18 but am not sure what is causing it.
Actually, Java is printing in decimal, byte by byte. Obj-C is printing in hex, grouped four bytes at a time.
Referring this chart:
Dec Hex
55 37
56 38
98 62
...
You'll just have to find a way to output byte by byte in Obj-C. I don't know Obj-C, but if NSLog works similarly to printf() in C, I'd start there.
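
To see the equivalence directly, here is a sketch on the Java side that prints the same bytes in hex, reproducing the digits from the NSData dump:

import java.nio.charset.StandardCharsets;

public class HexDump {
    public static void main(String[] args) {
        byte[] bytes = ("78b4a02fa139a2944f17b4edc22fb175:"
                + "8907f3c4861140ad84e20c8e987eeae6").getBytes(StandardCharsets.ISO_8859_1);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b)); // 0x37 for 55, 0x38 for 56, ...
        }
        System.out.println(hex); // 3738623461303266... as in the Obj-C output
    }
}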
A code snippet from Apple
unsigned char aBuffer[20];
NSString *myString = @"Test string.";
const char *utfString = [myString UTF8String];
NSData *myData = [NSData dataWithBytes: utfString length: strlen(utfString)];
[myData getBytes:aBuffer length:20];
The apparent change in bytes is just the difference between decimal and hex representation. The code above shows how to convert a string to bytes and store the result in a buffer.

Java get string from byte array

I am modding a Java program in which a handler receives two byte arrays.
When I print those arrays using a line of code like this:
java.util.Arrays.toString(this.part1)
I get output like this:
[43, 83, 123, 97, 104, -10, -4, 124, -113, -56, 118, -23, -25, -13, -9, -85, 58, -66, -34, 38, -55, -28, -40, 125, 22, -83, -72, -93, 73, -117, -59, 72, 105, -17, 3, -53, 121, -21, -19, 103, 101, -71, 54, 37...
I know these byte arrays contain a string. How might I get that string from them?
Here is the code
public void readPacketData(PacketBuffer data) throws IOException
{
    this.field_149302_a = data.readByteArray();
    this.field_149301_b = data.readByteArray();
    String packet1 = java.util.Arrays.toString(this.field_149302_a);
    String packet2 = java.util.Arrays.toString(this.field_149301_b);
}
To convert a byte array into a String correctly, you have to construct a String object from the byte array explicitly. You can try this:
String str = new String(this.part1, "UTF-8"); //for UTF-8 encoding
System.out.println(str);
Please note that the byte array contains text in a specific encoding (which you must know).
String has a constructor from byte[], so you could just call new String(this.part1), or, if the bytes do not represent a string in the platform's default charset, use the overloaded flavor and pass the charset too.
Actually, to convert bytes to a String you need the encoding name. Change UTF-8 in the first answer to the correct encoding to avoid garbled output; try UTF-16 or one of the charsets listed at https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html (choose based on your locale).
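
Applied to the packet handler from the question, and assuming the arrays really contain UTF-8 text (an assumption; the actual encoding depends on whatever wrote the packet), the fix is a one-line change. A minimal sketch with a stand-in byte array:

import java.nio.charset.StandardCharsets;

public class DecodePacketBytes {
    public static void main(String[] args) {
        // Stand-in for this.field_149302_a from the question
        byte[] part1 = "hello world".getBytes(StandardCharsets.UTF_8);
        String packet1 = new String(part1, StandardCharsets.UTF_8); // assumes UTF-8
        System.out.println(packet1); // prints the text, not "[104, 101, ...]"
    }
}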
