I am having a bytearray of byte[] type having the length 17 bytes, i want to convert this to string and want to give this string for another comparison but the output i am getting is not in the format to validate, i am using the below method to convert.I want to output as string which is easy to validate and give this same string for comparison.
byte[] byteArray = new byte[] {0,127,-1,-2,-54,123,12,110,89,0,0,0,0,0,0,0,0};
String value = new String(byteArray);
System.out.println(value);
Output : ���{nY
What encoding is it? You should define it explicitly:
new String(byteArray, Charset.forName("UTF-32")); //or whichever you use
Otherwise the result is unpredictable (from String.String(byte[]) constructor JavaDoc):
Constructs a new String by decoding the specified array of bytes using the platform's default charset
BTW I have just tried it with UTF-8, UTF-16 and UTF-32 - all produce bogus results. The long series of 0 makes me believe that this isn't actually a text. Where do you get this data from?
UPDATE: I have tried it with all character sets available on my machine:
for (Map.Entry<String, Charset> entry : Charset.availableCharsets().entrySet())
{
final String value = new String(byteArray, entry.getValue());
System.out.println(entry.getKey() + ": " + value);
}
and no encoding produces anything close to human-readable text... Your input is not text.
Use as follows:
byte[] byteArray = new byte[] {0,127,-1,-2,-54,123,12,110,89,0,0,0,0,0,0,0,0};
String value = Arrays.toString(byteArray);
System.out.println(value);
Your output will be
[0,127,-1,-2,-54,123,12,110,89,0,0,0,0,0,0,0,0]
Is it actually encoded text? If so, specify the encoding.
However, the data you've got doesn't look like it's actually meant to be text. It just looks like arbitrary binary data to me. If it isn't really text, I'd recommend converting it to hex or base64, depending on requirements. There's a good public domain base64 encoder you can use.
String text = Base64.encodeBytes(byteArray);
And decoding:
byte[] data = Base64.decode(text):
not 100% sure if I get you right. Is this what you want?
String s = null;
StringBuffer buf = new StringBuffer("");
byte[] byteArray = new byte[] {0,127,-1,-2,-54,123,12,110,89,0,0,0,0,0,0,0,0};
for(byte b : byteArray) {
s = String.valueOf(b);
buf.append(s + ",");
}
String value = new String(buf);
System.out.println(value);
Maybe you should specify a charset:
String value = new String(byteArray, "UTF-8");
Related
I have written the simple conversion code to convert to Japanese character from UTF-8.
private static String convertUTF8ToShiftJ(String uft8Strg) {
String shftJStrg = null;
try {
byte[] b = uft8Strg.getBytes(UTF_8);
shftJStrg = new String(b, Charset.forName("SHIFT-JIS"));
logger.info("Converted to the string :" + shftJStrg);
} catch (Exception e) {
e.printStackTrace();
return uft8Strg;
}
return shftJStrg;
}
But it gives the output error,
convertUTF8ToShiftJ START !!
uft8Strg=*** abc000.sh ����started�
*** abc000.sh å®�è¡�ä¸ï¼�executing...ï¼�
*** abc000.sh ����ended��*
Do anybody have any idea that where I made a mistake or need some additional logic, it would be really helpful!
You String is already a String, so your method is "wrong". UTF8 is an encoding that is a byte[] and can be converted to a String in Java.
It should read:
private static byte[] convertUTF8ToShiftJ(byte[] uft8) {
If you want to convert UTF8 byte[] to JIS byte[]:
private static byte[] convertUTF8ToShiftJ(byte[] uft8) {
String s = new String(utf8, StandardCharsets.UTF_8);
return s.getBytes( Charset.forName("SHIFT-JIS"));
}
A String can be converted to a byte[] later, by mystring.getBytes(encoding)
Please see The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) for more detail.
It seems you have a conceptual misunderstanding about String encodings.
See for example Byte Encodings and Strings.
Converting a String from one encoding to another encoding doesn't make sense,
because String is a thing independent of encoding.
However, a String can be represented by byte arrays in various encodings
(like for example UTF-8 or Shift-JIS).
Therefore, it would make sense to convert a UTF-8 encoded byte array
to a Shift-JIS encoded byte array.
private static byte[] convertUTF8ToShiftJ(byte[] utf8Bytes) throws IllegalCharsetNameException {
String s = new String(utf8Bytes, StandardCharsets.UTF_8);
byte[] shftJBytes = s.getBytes(Charset.forName("SHIFT-JIS"));
return shftJBytes;
}
I have this string
"=?UTF-8?B?VGLNBGNDQA==?="
to decode in a standard java String.
I wrote this quick and dirty main to get the String, but I'm having troubles
String s = "=?UTF-8?B?VGLNBGNDQA==?=";
s = s.split("=\\?UTF-8\\?B\\?")[1].split("\\?=")[0];
System.out.println(s);
byte[] decoded = Base64.getDecoder().decode(s);
String x = new String(decoded, "UTF8");
System.out.println(decoded);
System.out.println(x);
It is actually printing a strange string
"Tb�cC#"
I do not know what is the text behind the encoded string, but I can assume my program works, since I can convert without problems any other encoded string, for example
"=?UTF-8?B?SGlfR3V5cyE="
That is "Hi_Guys!".
Should I assume that string is malformed?
My application get the String representation of bytes. I need to convert it byte[] array. I am using below code but it is not working.
byte[] bytesArray = myString.getBytes();
Can anyone help what is the correct way to convert it to byte[].
EDIT:
hi all, My code is here http://pastebin.com/87jGprtD/. I have one base64 code. This base64 has content for text and imagedata both. I want to download/create an image from this code. When I decode I get the byte[] for both text and imagedata. I convert it string because I have to differentiate the each part. I used spilt with some delimiter now i have an array of string. This string contains the imagedata. I have to convert it back to bytes to create an image. please check code for the same. please
Here is the relevant code:
byte[] imageByteArray = Base64.decodeBase64(imageDataString);
System.out.println(new String(imageByteArray));
String[] contentArray = new String(imageByteArray).split("--1_520B30B0_E358708");
for (int i = 0; i < contentArray.length; i++) {
if (i == 2) {
String[] parts = contentArray[i].split("binary");
InputStream is = new ByteArrayInputStream((parts[1].trim()).getBytes());
ImageInputStream iis = ImageIO.createImageInputStream(is);
System.out.println(iis);
image = ImageIO.read(iis);
ImageIO.write(image, "JPG", new File("E:/test1.JPG"));
}
}
You are decoding Base64 data into byte[], then converting that to String. You can't do that -- "binary" data cannot be converted to String and back to "binary" without data loss.
My code:
private static String convertToBase64(String string)
{
final byte[] encodeBase64 =
org.apache.commons.codec.binary.Base64.encodeBase64(string
.getBytes());
System.out.println(Hex.encodeHexString(encodeBase64));
final byte[] data = string.getBytes();
final String encoded =
javax.xml.bind.DatatypeConverter.printBase64Binary(data);
System.out.println(encoded);
return encoded;
}
Now I'm calling it: convertToBase64("stackoverflow"); and get following result:
6333526859327476646d56795a6d787664773d3d
c3RhY2tvdmVyZmxvdw==
Why I get different results?
I think Hex.encodeHexString will encode your String to hexcode, and the second one is a normal String
From the API doc of Base64.encodeBase64():
byte[] containing Base64 characters in their UTF-8 representation.
So instead
System.out.println(Hex.encodeHexString(encodeBase64));
you should write
System.out.println(new String(encodeBase64, "UTF-8"));
BTW: You should never use the String.getBytes() version without explicit encoding, because the result depends on the default platform encoding (for Windows this is usually "Cp1252" and Linux "UTF-8").
For converting a string, I am converting it into a byte as follows:
byte[] nameByteArray = cityName.getBytes();
To convert back, I did: String retrievedString = new String(nameByteArray); which obviously doesn't work. How would I convert it back?
What characters are there in your original city name? Try UTF-8 version like this:
byte[] nameByteArray = cityName.getBytes("UTF-8");
String retrievedString = new String(nameByteArray, "UTF-8");
which obviously doesn't work.
Actually that's exactly how you do it. The only thing that can go wrong is that you're implicitly using the platform default encoding, which could differ between systems, and might not be able to represent all characters in the string.
The solution is to explicitly use an encoding that can represent all characts, such as UTF-8:
byte[] nameByteArray = cityName.getBytes("UTF-8");
String retrievedString = new String(nameByteArray, "UTF-8");