I have the following data in a file:
I want to decode the UserData field. After reading the line into a String named comment, I do the following:
String[] split = comment.split("=");
if(split[0].equals("UserData")) {
System.out.println(split[1]);
byte[] callidArray = Arrays.copyOf(java.util.Base64.getDecoder().decode(split[1]), 9);
System.out.println("UserData:" + Hex.encodeHexString(callidArray).toString());
}
But I'm getting the following exception:
java.lang.IllegalArgumentException: Illegal base64 character 1
What could be the reason?
The image suggests that the string you are trying to decode contains characters like SOH and BEL. These are ASCII control characters, and will not ever appear in a Base64 encoded string.
(Base64 typically consists of letters, digits, and +, / and =. There are some variant formats, but control characters are never included.)
This is confirmed by the exception message:
java.lang.IllegalArgumentException: Illegal base64 character 1
The SOH character has ASCII code 1.
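To reproduce this in isolation, here is a minimal sketch that feeds a string starting with SOH to the basic decoder. The class name, the helper method and the sample input are made up for illustration and are not the asker's real data. As far as I can tell, the number in the message is the offending byte printed in hexadecimal.

```java
import java.util.Base64;

class Base64ControlCharDemo {

    // Returns null if decoding succeeds, otherwise the exception message.
    static String decodeError(String input) {
        try {
            Base64.getDecoder().decode(input);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical raw input: SOH, LF, space, BEL, 'q'.
        String raw = "\u0001\n \u0007q";
        System.out.println(decodeError(raw));

        // The same five bytes, actually Base64-encoded, decode without error.
        System.out.println(decodeError("AQogB3E="));
    }
}
```

The first call reports an illegal character because SOH is outside the Base64 alphabet; the second returns null because the input is real Base64.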
Conclusions:
You cannot decode that string as if it was Base64. It won't work.
It looks like the string is not "encoded" at all ... in the normal sense of what "encoding" means in Java.
We can't advise you on what you should do with it without a clear explanation of:
where the (binary) data comes from,
what you expected it to contain, and
how you read the data and turned it into a Java String object: show us the code that did that!
The UserData field in the picture in the question actually contains raw bytes, not Base64 text.
So I don't need to decode Base64. I need to copy the string into a byte array and print the hexadecimal representation of those bytes.
String[] split = comment.split("=");
if(split[0].equals("UserData")) {
System.out.println(split[1]);
byte[] callidArray = Arrays.copyOf(split[1].getBytes(), 9);
System.out.println("UserData:" + Hex.encodeHexString(callidArray).toString());
}
Output:
UserData:010a20077100000000
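For readers without Apache Commons Codec on the classpath, the same idea can be sketched with the JDK alone. The class and method names and the sample line are hypothetical; the sample bytes were chosen only so that they match the output above. The sketch also assumes that each char of the string corresponds to exactly one byte of the file, which is why it names ISO-8859-1 explicitly instead of relying on the platform default charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class UserDataHexDemo {

    // Formats every byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Returns the first 9 bytes of the UserData value as hex, or null for other keys.
    static String userDataHex(String comment) {
        String[] split = comment.split("=", 2);
        if (split.length != 2 || !split[0].equals("UserData")) {
            return null;
        }
        // ISO-8859-1 maps chars 0..255 to the identical byte value.
        byte[] raw = split[1].getBytes(StandardCharsets.ISO_8859_1);
        // copyOf pads with zero bytes when the value is shorter than 9 bytes.
        return toHex(Arrays.copyOf(raw, 9));
    }

    public static void main(String[] args) {
        // Hypothetical line: SOH, LF, space, BEL, 'q' after the '='.
        System.out.println("UserData:" + userDataHex("UserData=\u0001\n \u0007q"));
    }
}
```

If the file may contain bytes above 0x7F, it is safer to read it as bytes (or with ISO-8859-1) in the first place, since a different charset could already have altered those characters by the time they reach the String.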
Related
I'm writing a REST client from a C# usage example. Now I need to convert a string to the proper format, but I can't find the equivalent method in Java.
original:
string Credentials = Convert.ToBase64String(ASCIIEncoding.ASCII.GetBytes(string));
At this point I've done this:
String Credentials = new String(DatatypeConverter.parseBase64Binary(String));
but I still need the ASCII conversion, and I'm not sure that the things I've found will work, like: Convert character to ASCII numeric value in java
Any clues?
Thank you.
If you're using Java 8, take a look at its new Base64 class. It provides a Base64.Encoder whose encodeToString(byte[] src) method accepts a byte array and returns a Base64-encoded String.
String base64 = Base64.getEncoder().encodeToString("I'm a String".getBytes());
System.out.println(base64); // prints SSdtIGEgU3RyaW5n
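To mirror the ASCII step of the C# line as well, the charset can be named explicitly. This is a small sketch, not the only way to do it; the class name, method name and the "user:pass" sample are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class AsciiBase64Demo {

    // Encodes the text as US-ASCII bytes, then as Base64 text.
    // Characters outside ASCII are replaced by '?' during getBytes.
    static String toBase64Ascii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return Base64.getEncoder().encodeToString(ascii);
    }

    public static void main(String[] args) {
        System.out.println(toBase64Ascii("user:pass")); // prints dXNlcjpwYXNz
    }
}
```

Naming the charset avoids depending on the platform default, which getBytes() without arguments would use.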
I have a string that is base64 encoded. It looks like this:
eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=
Any online tool can decode this to the proper string which is {"bla1":"bla1","bla2":"bla2"}. However, my Java implementation fails:
import java.util.Base64;
System.out.println("payload = " + payload);
String json = new String(Base64.getDecoder().decode(payload));
I'm getting the following error:
payload = eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=
java.lang.IllegalArgumentException: Input byte array has incorrect ending byte at 40
What is wrong with my code?
Okay, I found out. The original String is encoded on an Android device using android.util.Base64 by Base64.encodeToString(json.getBytes("UTF-8"), Base64.DEFAULT);. It uses android.util.Base64.DEFAULT encoding scheme.
Then on the server side when using java.util.Base64 this has to be decoded with Base64.getMimeDecoder().decode(payload) not with Base64.getDecoder().decode(payload)
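The difference between the two decoders can be sketched as follows. It assumes that the payload arrived with a trailing line break, which the printed output in the question would not show; the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class MimeDecoderDemo {

    // Strict decoder: any character outside the Base64 alphabet is an error.
    static String decodeBasic(String payload) {
        return new String(Base64.getDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    // MIME decoder: line separators and other non-alphabet characters are skipped.
    static String decodeMime(String payload) {
        return new String(Base64.getMimeDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed input: the payload from the question plus a trailing newline.
        String payload = "eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=\n";
        System.out.println(decodeMime(payload));
        try {
            decodeBasic(payload);
        } catch (IllegalArgumentException e) {
            System.out.println("basic decoder rejected the input: " + e.getMessage());
        }
    }
}
```

The MIME decoder is more forgiving, so it also hides genuinely corrupt input; where the sender can be changed, producing clean Base64 on that side is the stricter option.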
I was trying to use the strings from the args. I found that using arg[0].trim() made it work, e.g.
Base64.getDecoder().decode(arg[0].trim());
I guess there's some sort of whitespace that gets it messed up.
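A small sketch of the trim approach, with a made-up class name and sample value. Note that trim() only removes leading and trailing whitespace, so it would not help with line breaks in the middle of a long Base64 string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class TrimBeforeDecodeDemo {

    // Strips surrounding whitespace, then decodes with the strict basic decoder.
    static String decodeTrimmed(String arg) {
        byte[] decoded = Base64.getDecoder().decode(arg.trim());
        return new String(decoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical argument with a leading space and a trailing CR LF.
        System.out.println(decodeTrimmed(" aGVsbG8=\r\n")); // prints hello
    }
}
```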
Maybe too late, but I also had this problem.
By default, the Android Base64 util adds a newline character to the end of the encoded string.
The Base64.NO_WRAP flag tells the util to create the encoded string without the newline character.
Your android app should encode src something like this:
String encode = Base64.encodeToString(src.getBytes(), Base64.NO_WRAP);
I'm trying to use extended ASCII character 179 (looks like a pipe).
Here is how I use it.
String cmd = "";
char pipe = (char) 179;
// cmd = "02|CO|0|101|03|0F"
cmd ="02"+pipe+"CO"+pipe+"0"+pipe+"101"+pipe+"03"+pipe+"0F";
System.out.println("cmd "+cmd);
Output
cmd 02³CO³0³101³03³0F
But the output looks like this. I have read that extended ASCII characters are not displayed correctly.
Is my code correct and the character is just not displayed correctly,
or is my code wrong?
I'm not concerned about showing this string to the user; I need to send it to a server.
EDIT
The vendor's API document states that we need to use ASCII 179 (looks like a pipe). The server-side code needs 179 (part of extended ASCII) as the pipe/vertical line, so I cannot use 124 (pipe).
EDIT 2
Here is the table for extended ascii
On the other hand, this table shows that ASCII 179 is "³". Why
are there different interpretations of the same code, and which one should I
consider?
EDIT 3
My default charset value is (is this related to my problem?)
System.out.println("Default Charset=" + Charset.defaultCharset());
Default Charset=windows-1252
Thanks!
I have referred to
How to convert a char to a String?
How to print the extended ASCII code in java from integer value
Thanks
Use the below code.
String cmd = "";
char pipe = '\u2502';
cmd ="02"+pipe+"CO"+pipe+"0"+pipe+"101"+pipe+"03"+pipe+"0F";
System.out.println("cmd "+cmd);
System.out.println("int value: " + (int)pipe);
Output:
cmd 02│CO│0│101│03│0F
int value: 9474
I am using IntelliJ. This is the output I am getting.
Your code is correct; concatenating String values and char values does what one expects. It's the value of 179 that is wrong. You can google "unicode 179", and you'll find "Unicode Character 'SUPERSCRIPT THREE' (U+00B3)", as one might expect. And, you could simply say "char pipe = '|';" instead of using an integer. Or even better: String pipe = "|"; which also allows you the flexibility to use more than one character :)
In response to the new edits...
May I suggest that you fix this rather low-level problem not at the Java String level, but instead replace the byte encoding this character before sending the bytes to the server?
E.g. something like this (untested)
byte[] bytes = cmd.getBytes(); // all ascii, so this should be safe.
for (int i = 0; i < bytes.length; i++) {
if (bytes[i] == '|') {
bytes[i] = (byte)179;
}
}
// send command bytes to server
// don't forget endline bytes/chars or whatever the protocol might require. good luck :)
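A self-contained version of that idea, as a rough sketch: the class and method names are made up, and the field values are taken from the question. It also prints two comparison values to show that the byte on the wire depends on the charset used for the conversion, not on how the character looks on screen.

```java
import java.nio.charset.StandardCharsets;

class Byte179Demo {

    // Joins the fields with '|' and then swaps every '|' byte for 0xB3 (179).
    // Assumes the fields themselves are plain ASCII and contain no '|'.
    static byte[] encodeCommand(String... fields) {
        byte[] bytes = String.join("|", fields).getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < bytes.length; i++) {
            if (bytes[i] == '|') {
                bytes[i] = (byte) 179;
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] wire = encodeCommand("02", "CO", "0", "101", "03", "0F");
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02x ", b & 0xff));
        }
        System.out.println(hex.toString().trim());

        // (char) 179 encoded as ISO-8859-1 also yields the single byte 179 ...
        System.out.println(String.valueOf((char) 179).getBytes(StandardCharsets.ISO_8859_1)[0] & 0xff);
        // ... whereas U+2502 has no ISO-8859-1 mapping and becomes '?' (63).
        System.out.println("\u2502".getBytes(StandardCharsets.ISO_8859_1)[0] & 0xff);
    }
}
```

Working at the byte level keeps the Java side independent of which code page the vendor's table was drawn from; only the numeric value 179 matters.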
Currently incorporating the URLEncoder and URLDecoder into some code.
There are numerous URLs already saved that will get processed by the URLDecoder routine that was not initially processed by the URLEncoder routine.
Based on some testing it doesn't appear there will be an issue, but granted I have not tested all the scenarios.
I did notice that some characters, like the /, which would normally get encoded, are processed just fine by the decoding routine even if they were not initially encoded.
This led me to an oversimplified analysis. It appears the URLDecoder routine essentially checks the URL for a % and the next 2 characters (provided UTF-8 is used). As long as there aren't any % characters within the previously saved URLs, there shouldn't be an issue when they are processed by the URLDecoder routine. Does that sound about right?
Yes, while it will work for "simple" cases, you might encounter a) exceptions or b) unexpected behaviour if calling URLDecoder.decode for an unencoded URL that contains certain special chars.
Consider the following example: It will throw a java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern for the third test and it will alter the URL without exception for the second test (while the regular encoding/decoding works without issues):
import java.net.URLDecoder;
import java.net.URLEncoder;
public class Test {
public static void main(String[] args) throws Exception {
test("http://www.foo.bar/");
test("http://www.foo.bar/?q=a+b");
test("http://www.foo.bar/?q=äöüß%"); // Will throw exception
}
private static void test(String url) throws Exception {
String encoded = URLEncoder.encode(url, "UTF-8");
String decoded = URLDecoder.decode(encoded, "UTF-8");
System.out.println("encoded: " + encoded);
System.out.println("decoded: " + decoded);
System.out.println(URLDecoder.decode(decoded, "UTF-8"));
}
}
Output (notice how the + sign disappears):
encoded: http%3A%2F%2Fwww.foo.bar%2F
decoded: http://www.foo.bar/
http://www.foo.bar/
encoded: http%3A%2F%2Fwww.foo.bar%2F%3Fq%3Da%2Bb
decoded: http://www.foo.bar/?q=a+b
http://www.foo.bar/?q=a b
encoded: http%3A%2F%2Fwww.foo.bar%2F%3Fq%3D%C3%A4%C3%B6%C3%BC%C3%9F%25
decoded: http://www.foo.bar/?q=äöüß%
Exception in thread "main" java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing escape (%) pattern
at java.net.URLDecoder.decode(Unknown Source)
at Test.test(Test.java:16)
See the javadoc of URLDecoder for the two cases as well:
The plus sign "+" is converted into a space character " " .
A sequence of the form "%xy" will be treated as representing a byte where xy is the two-digit hexadecimal representation of the 8 bits.
Then, all substrings that contain one or more of these byte sequences
consecutively will be replaced by the character(s) whose encoding
would result in those consecutive bytes. The encoding scheme used to
decode these characters may be specified, or if unspecified, the
default encoding of the platform will be used.
If you are sure that your unencoded URLs do not contain + or % then I'd say it's safe to call URLDecoder.decode. Otherwise I'd advise to implement additional checks, e.g. try to decode and compare with the original (cf. this question on SO).
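One way such an additional check could look is sketched below. The class and method names are invented for this example, and falling back to the original string on failure is an assumption about what the application wants, not something URLDecoder does itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class LenientUrlDecodeDemo {

    // Decodes the URL, or returns it unchanged if it is not validly encoded.
    static String decodeOrKeep(String url) {
        try {
            return URLDecoder.decode(url, "UTF-8");
        } catch (IllegalArgumentException | UnsupportedEncodingException e) {
            return url;
        }
    }

    // True when decoding alters the string, i.e. it contains '+' or a valid %xy escape.
    static boolean decodingChanges(String url) {
        return !decodeOrKeep(url).equals(url);
    }

    public static void main(String[] args) {
        System.out.println(decodeOrKeep("http%3A%2F%2Fwww.foo.bar%2F"));
        System.out.println(decodeOrKeep("http://www.foo.bar/?q=100%"));
        System.out.println(decodingChanges("http://www.foo.bar/?q=a+b"));
    }
}
```

The check cannot tell whether a + or %xy in a stored URL was meant literally, so URLs flagged by decodingChanges may still need a manual or application-specific decision.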
Here is what I was doing -
Take up a document(JSON) from mongodb
Write this key value as an XML
Send this XML to Apache Solr for indexing
Here is how I was doing step #2
Given key say "key1" and value as "value1" step#2 output is
"<"+ key1 + ">" + value1 + "</"+ key1 + ">"
When I sent this XML to Solr, I was getting StAX exceptions like -
Invalid UTF-8 start byte 0xb7
Invalid UTF-8 start byte 0xa0
Invalid UTF-8 start byte 0xb0
Invalid UTF-8 start byte 0x96
So here is how I am thinking to fix it -
key1New = new String(key1.getBytes("UTF-8"), "UTF-8");
value1New = new String(value1.getBytes("UTF-8"), "UTF-8");
Should this work, or should I rather do this -
key1New = new String(key1.getBytes("UTF-8"), "ISO-8859-1");
value1New = new String(value1.getBytes("UTF-8"), "ISO-8859-1");
Java String objects don't have encodings. An encoding, in this context, only makes sense when associated with a byte[]. Try something like this:
byte[] utf8xmlBytes = originalxmlString.getBytes("UTF-8");
and send these bytes.
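To illustrate why neither variant from the question changes what is sent, here is a short sketch with hypothetical names. It uses U+00B7 as sample data because 0xb7 appears in the error messages; whether that is the character actually involved is an assumption.

```java
import java.nio.charset.StandardCharsets;

class Utf8RoundTripDemo {

    // Encoding to UTF-8 and decoding as UTF-8 gives back an equal String.
    static String roundTripUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Decoding UTF-8 bytes as ISO-8859-1 turns each byte into its own char.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String middleDot = "\u00b7";
        System.out.println(roundTripUtf8(middleDot).equals(middleDot)); // true: a no-op
        System.out.println(utf8ReadAsLatin1(middleDot).length());      // 2: the text is now corrupted

        // The single byte 0xB7 is what ISO-8859-1 produces; a UTF-8 parser rejects it.
        System.out.println(middleDot.getBytes(StandardCharsets.ISO_8859_1)[0] & 0xff); // 183 = 0xb7
        // UTF-8 needs two bytes for the same character.
        System.out.println(middleDot.getBytes(StandardCharsets.UTF_8).length);         // 2
    }
}
```

So the first variant changes nothing and the second damages the text; what matters is the charset used at the point where the String is converted to bytes for the request.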
EDIT: Also, consider the comment of Jon Skeet. It is usually a good idea to create XML using an API unless you have a very small amount of XML.
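As a rough sketch of what "using an API" might look like with the StAX writer that ships with the JDK: the class and method names are made up, and a real Solr update document has its own element structure that is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class XmlWriterDemo {

    // Writes <key>value</key> as UTF-8 bytes; the writer escapes markup characters.
    static byte[] toXmlBytes(String key, String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0");
            writer.writeStartElement(key);
            writer.writeCharacters(value);
            writer.writeEndElement();
            writer.writeEndDocument();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toByteArray();
    }

    // Convenience view of the bytes for printing or logging.
    static String toXmlString(String key, String value) {
        return new String(toXmlBytes(key, value), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toXmlString("key1", "a<b"));
    }
}
```

Compared with string concatenation, the writer takes care of escaping characters such as < and &, and the encoding is fixed in one place, where the bytes are produced.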