byte[] to string and back to byte[] - java

I have a problem interpreting a file. The file is built as follows:
"name"-#-"date"-#-"author"-#-"signature"
The signature is a byte array. When I read the file back in, I parse it to a String and split it:
myFileInpuStream.read(fileContent);
String[] data = new String(fileContent).split("-#-");
If I look at the variable fileContent, I see that the bytes are all correct.
But when I try to get the signature byte array:
byte[] signature= data[3].getBytes();
Sometimes I get wrong values of 63. I tried a few solutions with:
new String(fileContent, "UTF-8")
But no luck. Can someone help?
The signature does not have a fixed length, so I cannot hard-code it.
Some extra info:
Original signature:
[48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112,
75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8,
48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81,
51, 69]
filecontent(var after reading):
... 48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112,
75, -10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8,
48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81,
51, 69]
signature (after split and getBytes()):
[48, 45, 2, 21, 0, -123, -3, -5, 63, 84, -86, 26, -124, 63, 75,
-10, -1, -56, 40, 13, -46, 6, 120, -56, 100, 2, 20, 66, -92, -8, 48, -88, 101, 57, 56, 20, 125, -32, -49, -123, 73, 96, 76, -82, 81, 51, 69]

You can't access data[4] because you have 4 Strings in your array. So you can access data from 0 to 3.
data[0] = name
data[1] = date
data[2] = author
data[3] = signature
The solution :
byte[] signature = data[3].getBytes();

Edit: I think I finally understand what you are doing.
You have four parts: name, date, author, signature. The name and author are strings, the date is a date and the signature is a hashed or encrypted array of bytes. You want to store them as text in a file, separated by -#-. To do this, you first need to convert each to a valid string. Name and author are already strings. Converting a date to string is easy. Converting an array of bytes to string is not easy.
You can use base64 encoding to convert a byte array to a string. Use javax.xml.bind.DatatypeConverter printBase64Binary() for encoding and javax.xml.bind.DatatypeConverter parseBase64Binary() for decoding.
For example, if you have a name denBelg, date 2013-03-19, author Virtlink and this signature:
30 2D 02 15 00 85 FD FB 8D 54 AA 1A 84 90 4B F6 FF C8 28 0D D2 06 78 C8 64 02 14
42 A4 F8 30 A8 65 39 38 14 7D E0 CF 85 49 60 4C AE 51 33 45
Then, after concatenation and base64 encoding of the signature, the resulting string became, for example:
denBelg-#-20130319-#-Virtlink-#-MC0CFQCF/fuNVKoahJBL9v/IKA3SBnjIZAIUQqT4MKhlOTgUfeDPhUlgTK5RM0U=
Later, when you split the string on -#- you can decode the base64 signature part and get back an array of bytes.
Note that when the name or author can include -#- in their name, they can mess up your code. For example, if I set name as den-#-Belg then your code would fail.
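A minimal sketch of the Base64 approach described above. It assumes Java 8 or later, so java.util.Base64 stands in for DatatypeConverter; the class and method names (SignedRecord, join, signatureOf) are made up for illustration and the signature is shortened.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class SignedRecord {
    private static final String SEPARATOR = "-#-";

    // Build the text record; only the signature needs Base64, the other fields are already text.
    static String join(String name, String date, String author, byte[] signature) {
        return String.join(SEPARATOR, name, date, author,
                Base64.getEncoder().encodeToString(signature));
    }

    // Split the record again and decode the fourth field back into the original bytes.
    static byte[] signatureOf(String record) {
        String[] data = record.split(SEPARATOR);
        return Base64.getDecoder().decode(data[3]);
    }

    public static void main(String[] args) {
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84, -86, 26, -124, -112};
        String record = join("denBelg", "20130319", "Virtlink", signature);

        // With these sample fields the record is plain ASCII, so writing and reading it is lossless.
        byte[] fileContent = record.getBytes(StandardCharsets.US_ASCII);
        String readBack = new String(fileContent, StandardCharsets.US_ASCII);

        System.out.println(Arrays.equals(signature, signatureOf(readBack))); // true
    }
}
```

The standard Base64 alphabet contains neither - nor #, so the encoded signature itself can never contain the separator; the name and author fields still can.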
Original post:
Java's String.getBytes() uses the platform default encoding for the string. Encoding is the way string characters are mapped to byte values. So, depending on the platform, the resulting bytes may be different.
Fix the encoding to UTF-8 and read it with the same encoding, and your problems will go away.
byte[] signature = data[3].getBytes("UTF-8");
String sigdata = new String(signature, "UTF-8");
0-???����T�?��K���(
�?x�d??B��0�e98?}�υI`L�Q3E
Your example represents some garbled mess of characters (is it encrypted or something?), but the bytes you highlighted show the problem:
You start with a byte value of -115. The minus indicates it is a byte value above 0x7F, whose character representation highly depends on the encoding used. Let's assume extended US-ASCII, then your byte represents (according to this table) the character ì (with an accent). Now when you decode it the decoder (depending on the encoding you use) might not understand the byte value 0x8D and instead represents it with a question mark ?. Note that the question mark is US-ASCII character 63, and that's where your 63 came from.
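To see where a 63 can come from, here is a small sketch. The asker's actual platform charset is unknown, so US-ASCII is used here purely as an assumption to make the effect reproducible; ISO-8859-1 is shown for contrast because it maps every byte value to a character.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyRoundTrip {
    // Decode bytes to a String and encode them again with US-ASCII.
    static byte[] viaAscii(byte[] raw) {
        // Bytes above 0x7F have no ASCII meaning, so the decoder substitutes U+FFFD.
        String text = new String(raw, StandardCharsets.US_ASCII);
        // U+FFFD has no ASCII byte, so the encoder substitutes '?' (63).
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // ISO-8859-1 maps every byte value to exactly one character, so nothing is lost.
    static byte[] viaLatin1(byte[] raw) {
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] raw = {-5, -115, 84};
        System.out.println(Arrays.toString(viaAscii(raw)));  // [63, 63, 84]
        System.out.println(Arrays.toString(viaLatin1(raw))); // [-5, -115, 84]
    }
}
```

Even though ISO-8859-1 round-trips cleanly, that is a property of this one charset and not a reason to push binary data through a String.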
So make sure you use your encodings consistently and don't rely on the system's default.
Also, never use string encoding to decode byte arrays that do not represent strings (e.g. hashes or other cryptographic content).
According to your comment you are trying to read encrypted data (which are bytes) and converting them to a string using a decoder? That will never work in any way you expect it to. After you've encrypted something you have an array of bytes which you should store as-is. When you read them back, you have to put the bytes through a decrypter to regain the unencrypted bytes. Only if those decrypted bytes represent a string, then you can use an encoding to decode the string.
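One way to store the bytes as-is, sketched under the assumption that the file format may be changed: write the text fields with DataOutputStream and the signature as a length-prefixed block of raw bytes. The class name BinaryRecord, the field order, and the sample values are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class BinaryRecord {
    // Write the text fields as strings and the signature as length-prefixed raw bytes.
    static byte[] write(String name, String date, String author, byte[] signature) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name);
            out.writeUTF(date);
            out.writeUTF(author);
            out.writeInt(signature.length); // the length prefix replaces a text separator
            out.write(signature);           // raw bytes, no charset involved
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the fields in the same order and return the signature bytes untouched.
    static byte[] readSignature(byte[] fileContent) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(fileContent))) {
            in.readUTF(); // name
            in.readUTF(); // date
            in.readUTF(); // author
            byte[] signature = new byte[in.readInt()];
            in.readFully(signature);
            return signature;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] signature = {48, 45, 2, 21, 0, -123, -3, -5, -115, 84};
        byte[] fileContent = write("some name", "20130319", "some author", signature);
        System.out.println(Arrays.equals(signature, readSignature(fileContent))); // true
    }
}
```

The in-memory streams are only there to keep the sketch self-contained; a FileOutputStream and FileInputStream would slot into the same places.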

You're making extra work for yourself by converting these bytes into Strings by hand. Why aren't you doing it using the classes intended for this?
// get the file /logs/access.log
Path path = FileSystems.getDefault().getPath("/logs", "access.log");
// open it, decoding UTF-8
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
// read a line of text, properly decoded
String line = reader.readLine();
Or, if you're in Java 6:
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("/logs/access.log"), "UTF-8"));
String line = reader.readLine();
Links:
Files.newBufferedReader
InputStreamReader

Sounds like an encoding issue to me.
First you need to know what encoding your file is using, and use that when reading the file.
Secondly, you say your signature is a byte array, but Java strings are always Unicode. If you want a different encoding (I'm guessing you want ASCII), you need to call getBytes("US-ASCII").
Of course, if your input was ASCII, it would be strange that this could cause encoding issues.

Related

Retrieving bytes of String returns different results in ObjC than Java

I've got a string that I'm trying to convert to bytes in order to create an md5 hash in both ObjC and Java. For some reason, the bytes are different between the two languages.
Java
System.out.println(Arrays.toString(
("78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6").getBytes()));
Output:
[55, 56, 98, 52, 97, 48, 50, 102, 97, 49, 51, 57, 97, 50, 57, 52, 52, 102, 49, 55, 98, 52, 101, 100, 99, 50, 50, 102, 98, 49, 55, 53, 58, 56, 57, 48, 55, 102, 51, 99, 52, 56, 54, 49, 49, 52, 48, 97, 100, 56, 52, 101, 50, 48, 99, 56, 101, 57, 56, 55, 101, 101, 97, 101, 54]
ObjC
NSString *str = @"78b4a02fa139a2944f17b4edc22fb175:8907f3c4861140ad84e20c8e987eeae6";
NSData *bytes = [str dataUsingEncoding:NSISOLatin1StringEncoding allowLossyConversion:NO];
NSLog(@"%@", [bytes description]);
Output:
<37386234 61303266 61313339 61323934 34663137 62346564 63323266 62313735 3a383930 37663363 34383631 31343061 64383465 32306338 65393837 65656165 36>
I've tried using different charsets with no luck and can't think of any other reasons why the bytes would be different. Any ideas? I did notice that all of the byte values are different by some factor of 18 but am not sure what is causing it.
Actually, Java is printing in decimal, byte by byte. Obj C is printing in hex, integer by integer.
Referring this chart:
Dec Hex
55 37
56 38
98 62
...
You'll just have to find a way to output byte by byte in Obj C.
I don't know about Obj C, but if that NSLog function works similar to printf() in C, I'd start with that.
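Going the other way is also possible: print the Java bytes in hex so they can be compared directly with the NSData description. A small sketch, with HexDump and toHex as made-up names and only the first eight characters of the string used for brevity:

```java
import java.nio.charset.StandardCharsets;

public class HexDump {
    // Format each byte as two lowercase hex digits; the mask avoids sign extension for negative bytes.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "78b4a02f".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(toHex(bytes)); // 3738623461303266
    }
}
```

That matches the first two groups of the ObjC output (37386234 61303266), which confirms the bytes are the same and only the notation differs.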
A code snippet from Apple
unsigned char aBuffer[20];
NSString *myString = @"Test string.";
const char *utfString = [myString UTF8String];
NSData *myData = [NSData dataWithBytes: utfString length: strlen(utfString)];
[myData getBytes:aBuffer length:20];
The change in bytes can be due to Hex representation. The above code shows how to convert the string to bytes and store the result in a buffer.

Java get string from byte array

I am modding a Java program, and in it a handler receives 2 byte arrays.
When I print those arrays using a line of code like this:
java.util.Arrays.toString(this.part1)
I get an output like this
[43, 83, 123, 97, 104, -10, -4, 124, -113, -56, 118, -23, -25, -13, -9, -85, 58, -66, -34, 38, -55, -28, -40, 125, 22, -83, -72, -93, 73, -117, -59, 72, 105, -17, 3, -53, 121, -21, -19, 103, 101, -71, 54, 37...
I know these byte arrays contain a string. How might I get that string from them?
Here is the code
public void readPacketData(PacketBuffer data) throws IOException
{
this.field_149302_a = data.readByteArray();
this.field_149301_b = data.readByteArray();
String packet1 = (java.util.Arrays.toString(this.field_149302_a));
String packet2 = (java.util.Arrays.toString(this.field_149301_b));
}
In order to convert Byte array into String format correctly, we have to explicitly create a String object and assign the Byte array to it. You can try this:
String str = new String(this.part1, "UTF-8"); //for UTF-8 encoding
System.out.println(str);
Please note that the byte array contains characters in a special encoding (that you must know).
String has a constructor from byte[], so you could just call new String(this.part1), or, if the bytes do not represent a string in the platform's default charset, use the overloaded flavor and pass the charset too.
Actually, to convert bytes to a String you need the encoding name. Change UTF-8 in the first answer to the correct encoding name to avoid wrong output; try UTF-16 or one of the encodings listed at https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html (try choosing by your locale).
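If you are not sure the bytes are text at all, a strict decoder can tell you whether they are even valid in a candidate charset. A sketch, with StrictDecode and decodeOrNull as made-up names; returning null on failure is just a convenience for the example:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {
    // Returns the decoded text, or null if the bytes are not valid in the given charset.
    static String decodeOrNull(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] binary = {43, 83, 123, 97, 104, -10, -4, 124}; // start of the array from the question
        System.out.println(decodeOrNull(text, StandardCharsets.UTF_8));   // hello
        System.out.println(decodeOrNull(binary, StandardCharsets.UTF_8)); // null
    }
}
```

A null result for every plausible charset is a strong hint that the array holds binary data (compressed or encrypted, for instance) and not a string.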

Java AES CBC Decryption First Block

I've got a big problem with AES cryptography between Java and C++ (CryptoPP to be specific). I was expecting it to be way easier than asymmetric cryptography, which I managed to solve earlier.
When I'm decrypting 48 bytes and the result is byte[] array of 38 bytes (size + code + hashOfCode), the last 22 bytes are decrypted properly and the first 16 are wrong.
try {
cipher = Cipher.getInstance("AES/CBC/PKCS5Padding", "BC");
byte[] key = { 107, -39, 87, -65, -1, -28, -85, -94, 105, 76, -94,
110, 48, 116, -115, 86 };
byte[] vector = { -94, 112, -23, 93, -112, -58, 18, 78, 1, 69, -92,
102, 33, -96, -94, 59 };
SecretKey aesKey = new SecretKeySpec(key, "AES");
byte[] message = { 32, -26, -72, 25, 63, 114, -58, -5, 4, 90, 54,
88, -28, 3, -72, 25, -54, -60, 17, -53, -27, -91, 34, -101,
-93, -3, -47, 47, -12, -35, -118, -122, -77, -7, -9, -123,
7, -66, 10, -93, -29, 4, -60, -102, 16, -57, -118, 94 };
IvParameterSpec aesVector = new IvParameterSpec(vector);
cipher.init(Cipher.DECRYPT_MODE, aesKey, aesVector);
byte[] wynik = cipher.doFinal(message);
Log.d("Solution here", "Solution");
for (byte i : wynik)
Log.d("Solution", "" + i);
} catch (Exception e) {
Log.d("ERROR", "TU");
e.printStackTrace();
}
Decrypted message, that I'm expecting to get is:
0 0 0 32 10 0 16 43 81 -71 118 90 86 -93 -24 -103 -9 -49 14 -29 -114 82 81 -7 -59 3 -77 87 -77 48 -92 -111 -125 -21 123 21 86 4
But what I'm getting is
28 127 -111 92 -75 26 18 103 79 13 -51 -60 -60 -44 18 126 -9 49 14 -29 -114 82 81 -7 -59 3 -77 87 -77 48 -92 -111 -125 -21 123 21 86 4
As you can see only last 22 bytes are the same.
I know that AES works with blocks and so I was thinking that maybe something with initialization vector is wrong (because only the first block is broken), but as you can see I'm setting vector in the way I think is OK.
And I have no idea why is it working that way. Any help will be really appreciated, cause I'm running out of time.
[EDIT]
I add the Cipher initialization. As you wrote, it is AES/CBC/PKCS5Padding.
On the CryptoPP/C++ side (that is in fact not my code, so I'd provide the least piece of information that I can find useful) there is:
CryptoPP::CBC_Mode< CryptoPP::AES>::Encryption m_aesEncryption;
CryptoPP::CBC_Mode< CryptoPP::AES>::Decryption m_aesDecryption;
QByteArray AESAlgorithmCBCMode::encrypt(const QByteArray& plain)
{
std::string encrypted;
try {
StringSource(reinterpret_cast<const byte*>(plain.data()), plain.length(), true,
new StreamTransformationFilter(m_aesEncryption,
new StringSink(encrypted)));
} catch (const CryptoPP::Exception& e) {
throw SymmetricAlgorithmException(e.what());
}
return QByteArray(encrypted.c_str(), encrypted.length());
}
QByteArray AESAlgorithmCBCMode::decrypt(const QByteArray& encrypted)
{
std::string plain;
try {
StringSource(reinterpret_cast<const byte*>(encrypted.data()), encrypted.length(), true,
new StreamTransformationFilter(m_aesDecryption,
new StringSink(plain)));
} catch (const CryptoPP::Exception& e) {
throw SymmetricAlgorithmException(e.what());
}
return QByteArray(plain.c_str(), plain.length());
}
Key and initialization vector are exactly the same (I checked).
The fun part is that is a part of a bigger communication protocol, and the previous message was encrypted and decrypted perfectly fine. And there were also zeros at the beginning.
The answer was provided in the question; that didn't change even after a clear comment that it should be posted as an answer.
This is said answer:
The point is that every time doFinal() is invoked, it resets the state of cipher. What you should do is store last block of message (encrypted for Decryptor and decrypted for Encryptor) that will be used next time as a new InitializationVector. Then init() with this new IV should be invoked. Naturally, different instances of Cipher for Encryption and Decryption should be provided.
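A sketch of that IV chaining on the Java side. It follows the usual CBC definition, where the chaining value is the last ciphertext block of the previous message on both the encrypting and decrypting side; whether the Crypto++ peer behaves exactly like this is an assumption. The all-zero key and IV, the default provider, and the names CbcChaining, crypt and nextIv are for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CbcChaining {
    private static final int BLOCK = 16; // AES block size in bytes

    // One-shot AES/CBC operation with an explicit IV.
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The IV for the next message is the last ciphertext block of the previous one.
    static byte[] nextIv(byte[] ciphertext) {
        return Arrays.copyOfRange(ciphertext, ciphertext.length - BLOCK, ciphertext.length);
    }

    public static void main(String[] args) {
        byte[] key = new byte[BLOCK]; // demo key, never use a fixed key like this
        byte[] iv = new byte[BLOCK];  // demo initial IV shared by both sides

        // Sender: the second message continues the chain started by the first.
        byte[] first = crypt(Cipher.ENCRYPT_MODE, key, iv, "first message".getBytes(StandardCharsets.UTF_8));
        byte[] second = crypt(Cipher.ENCRYPT_MODE, key, nextIv(first), "second message".getBytes(StandardCharsets.UTF_8));

        // Receiver: derives the same IV from the ciphertext it received earlier.
        byte[] plain1 = crypt(Cipher.DECRYPT_MODE, key, iv, first);
        byte[] plain2 = crypt(Cipher.DECRYPT_MODE, key, nextIv(first), second);
        System.out.println(new String(plain1, StandardCharsets.UTF_8));
        System.out.println(new String(plain2, StandardCharsets.UTF_8));
    }
}
```

Decrypting the second message with the original IV instead would corrupt only its first 16 bytes, which is the symptom described in the question.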

Different results on Oracle JRE and Dalvik JVM

I'm stuck while creating a licence manager for an Android app where licence key is generated on desktop server, and verification code runs on android devices. The verification code when executed on desktop produces desired results, but the same code produces a different result on Android.
I debugged the problem and reached the point where the results were getting snapped!
here is a code snippet to demonstrate the difference:
byte[] bytes = {-88, 50, -29, 114, 51, 88, 38, -52, 114, 91, -23, -55, 124, 37, -90, -49, 36, -110, -67, -59, -33, -75, 85, -72, -109, 25, -54, 89, 6, 35, -50, -11, -87, -22, 33, -2, 55, -30, 75, -36, -40, -29, -103, 110, 46, -100, -68, 101, -105, 62, 53, -20, -20, -21, -118, -72, -27, 32, 59, 127, 15, -117, 6, 102};
System.out.println(new String(bytes, "UTF-8").hashCode());
on oracle jdk the result comes out to be
-24892055
but on android phone the result is:
-186036018
Any help will be appreciated.
When you call getBytes() you need to specify an encoding there as well, otherwise you'll get the default encoding from the OS, which could be anything, e.g. showBytes(new String(bytes, "UTF-8").getBytes("UTF-8"));
It's a difference in how Android and Java handle malformed UTF-8. Given the four byte sequence 0xf5 0xa9 0xea 0x21, Android returns two Unicode replacement characters (0xfffd). Oracle's class library returns three Unicode replacement characters.
Here's a simpler example that demonstrates the problem.
byte[] bytes = { (byte) 0xf5, (byte) 0xa9, (byte) 0xea, (byte) 0x21 };
String decoded = new String(bytes, "UTF-8");
for (int i = 0; i < decoded.length(); i++) {
System.out.print(Integer.toHexString(decoded.charAt(i)) + " ");
}
Oracle's JVM prints
fffd fffd fffd
Android's dalvikvm prints
fffd fffd
Your best bet is to avoid decoding byte sequences using UTF-8 unless you know that they are in fact UTF-8. I've reported this inconsistency to the Dalvik team to investigate: Android bug 23831.
If you use CharsetDecoder, Android uses icu4c to do the conversion. That returns U+fffd U+fffd U+0021, which also seems correct by my reading of the UTF-8 spec. In future releases, Android's String will match Android's CharsetDecoder.
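If the goal is just a value that comes out the same on both platforms, one option is to skip the String step and fingerprint the raw bytes. A sketch using SHA-256; the choice of SHA-256 and the names RawBytesFingerprint and sha256Hex are assumptions for illustration, not something from the question.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RawBytesFingerprint {
    // Hash the bytes directly, so no charset decoder (and none of its quirks) is involved.
    static String sha256Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {-88, 50, -29, 114, 51, 88, 38, -52}; // first bytes of the array from the question
        System.out.println(sha256Hex(bytes));
    }
}
```

Because the digest is defined over bytes, it does not depend on how a particular runtime handles malformed UTF-8.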

Java Byte Array to String to Byte Array

I'm trying to understand a byte[] to string, string representation of byte[] to byte[] conversion... I convert my byte[] to a string to send, I then expect my web service (written in python) to echo the data straight back to the client.
When I send the data from my Java application...
Arrays.toString(data.toByteArray())
Bytes to send..
[B@405217f8
Send (This is the result of Arrays.toString() which should be a string representation of my byte data, this data will be sent across the wire):
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
On the Python side, the Python server returns a string to the caller (which I can see is the same as the string I sent to the server):
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
The server should return this data to the client, where it can be verified.
The response my client receives (as a string) looks like
[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]
I can't seem to figure out how to get the received string back into a
byte[]
Whatever I seem to try I end up getting a byte array which looks as follows...
[91, 45, 52, 55, 44, 32, 49, 44, 32, 49, 54, 44, 32, 56, 52, 44, 32, 50, 44, 32, 49, 48, 49, 44, 32, 49, 49, 48, 44, 32, 56, 51, 44, 32, 49, 49, 49, 44, 32, 49, 48, 57, 44, 32, 49, 48, 49, 44, 32, 51, 50, 44, 32, 55, 56, 44, 32, 55, 48, 44, 32, 54, 55, 44, 32, 51, 50, 44, 32, 54, 56, 44, 32, 57, 55, 44, 32, 49, 49, 54, 44, 32, 57, 55, 93]
or I can get a byte representation which is as follows:
B@2a80d889
Both of these are different from my sent data... I'm sure I'm missing something truly simple....
Any help?!
You can't just take the returned string and construct a string from it... it's not a byte[] data type anymore, it's already a string; you need to parse it. For example :
String response = "[-47, 1, 16, 84, 2, 101, 110, 83, 111, 109, 101, 32, 78, 70, 67, 32, 68, 97, 116, 97]"; // response from the Python script
String[] byteValues = response.substring(1, response.length() - 1).split(",");
byte[] bytes = new byte[byteValues.length];
for (int i=0, len=bytes.length; i<len; i++) {
bytes[i] = Byte.parseByte(byteValues[i].trim());
}
String str = new String(bytes);
** EDIT **
You get a hint of your problem in your question, where you say "Whatever I seem to try I end up getting a byte array which looks as follows... [91, 45, ...", because 91 is the byte value for [, so [91, 45, ... is the byte array of the string "[-47, 1, 16, ...".
The method Arrays.toString() will return a String representation of the specified array; meaning that the returned value will not be a array anymore. For example :
byte[] b1 = new byte[] {97, 98, 99};
String s1 = Arrays.toString(b1);
String s2 = new String(b1);
System.out.println(s1); // -> "[97, 98, 99]"
System.out.println(s2); // -> "abc";
As you can see, s1 holds the string representation of the array b1, while s2 holds the string representation of the bytes contained in b1.
Now, in your problem, your server returns a string similar to s1, therefore to get the array representation back, you need the opposite constructor method. If s2.getBytes() is the opposite of new String(b1), you need to find the opposite of Arrays.toString(b1), thus the code I pasted in the first snippet of this answer.
String coolString = "cool string";
byte[] byteArray = coolString.getBytes();
String reconstitutedString = new String(byteArray);
System.out.println(reconstitutedString);
That outputs "cool string" to the console.
It's pretty darn easy.
What I did:
return to clients:
byte[] result = ****encrypted data****;
String str = Base64.encodeBase64String(result);
return str;
receive from clients:
byte[] bytes = Base64.decodeBase64(str);
your data will be transferred in this format:
OpfyN9paAouZ2Pw+gDgGsDWzjIphmaZbUyFx5oRIN1kkQ1tDbgoi84dRfklf1OZVdpAV7TonlTDHBOr93EXIEBoY1vuQnKXaG+CJyIfrCWbEENJ0gOVBr9W3OlFcGsZW5Cf9uirSmx/JLLxTrejZzbgq3lpToYc3vkyPy5Y/oFWYljy/3OcC/S458uZFOc/FfDqWGtT9pTUdxLDOwQ6EMe0oJBlMXm8J2tGnRja4F/aVHfQddha2nUMi6zlvAm8i9KnsWmQG//ok25EHDbrFBP2Ia/6Bx/SGS4skk/0couKwcPVXtTq8qpNh/aYK1mclg7TBKHfF+DHppwd30VULpA==
What Arrays.toString() does is create a string representation of each individual byte in your byteArray.
Please check the API documentation
Arrays API
To convert your response string back to the original byte array, you have to use split(",") or something and convert it into a collection and then convert each individual item in there to a byte to recreate your byte array.
It's simple to convert a byte array to a string and a string back to a byte array in Java. We just need to know when to use 'new' in the right way.
It can be done as follows:
byte array to string conversion:
byte[] bytes = initializeByteArray();
String str = new String(bytes);
String to byte array conversion:
String str = "Hello";
byte[] bytes = str.getBytes();
For more details, look at:
http://evverythingatonce.blogspot.in/2014/01/tech-talkbyte-array-and-string.html
The kind of output you are seeing from your byte array ([B@405217f8) is also an output for a zero length byte array (ie new byte[0]). It looks like this string is a reference to the array rather than a description of the contents of the array like we might expect from a regular collection's toString() method.
As with other respondents, I would point you to the String constructors that accept a byte[] parameter to construct a string from the contents of a byte array. You should be able to read raw bytes from a socket's InputStream if you want to obtain bytes from a TCP connection.
If you have already read those bytes as a String (using an InputStreamReader), then, the string can be converted to bytes using the getBytes() function. Be sure to pass in your desired character set to both the String constructor and getBytes() functions, and this will only work if the byte data can be converted to characters by the InputStreamReader.
If you want to deal with raw bytes you should really avoid using this stream reader layer.
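A minimal sketch of reading raw bytes without any Reader in between. The helper name readAll is made up, and a ByteArrayInputStream stands in for the socket's InputStream so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class RawBytes {
    // Copy everything from the stream into a byte array; no character decoding takes place.
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sent = {-47, 1, 16, 84, 2, 101, 110};
        byte[] received = readAll(new ByteArrayInputStream(sent));
        System.out.println(Arrays.toString(received)); // [-47, 1, 16, 84, 2, 101, 110]
    }
}
```

Negative values such as -47 survive here because nothing ever tries to interpret them as characters.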
Can you not just send the bytes as bytes, or convert each byte to a character and send as a string? Doing it like you are will take up a minimum of 85 characters in the string, when you only have 11 bytes to send. You could create a string representation of the bytes, so it'd be "[B@405217f8", which can easily be converted to a bytes or bytearray object in Python. Failing that, you could represent them as a series of hexadecimal digits ("5b42403430353231376638") taking up 22 characters, which could be easily decoded on the Python side using binascii.unhexlify().
[JDK8]
import java.util.Base64;
To string:
String str = Base64.getEncoder().encodeToString(new byte[]{ -47, 1, 16, ... });
To byte array:
byte[] bytes = Base64.getDecoder().decode("JVBERi0xLjQKMyAwIG9iago8P...");
If you want to convert the string back into a byte array you will need to use String.getBytes() (or equivalent Python function) and this will allow you print out the original byte array.
Use the below code API to convert bytecode as string to Byte array.
byte[] byteArray = DatatypeConverter.parseBase64Binary("JVBERi0xLjQKMyAwIG9iago8P...");
[JAVA 8]
import java.util.Base64;
String dummy= "dummy string";
byte[] byteArray = dummy.getBytes();
byte[] salt = new byte[]{ -47, 1, 16, ... };
String encoded = Base64.getEncoder().encodeToString(salt);
You can do the following to convert byte array to string and then convert that string to byte array:
// 1. convert byte array to string and then string to byte array
// convert byte array to string
byte[] by_original = {0, 1, -2, 3, -4, -5, 6};
String str1 = Arrays.toString(by_original);
System.out.println(str1); // output: [0, 1, -2, 3, -4, -5, 6]
// convert string to byte array
String newString = str1.substring(1, str1.length()-1);
String[] stringArray = newString.split(", ");
byte[] by_new = new byte[stringArray.length];
for(int i=0; i<stringArray.length; i++) {
by_new[i] = (byte) Integer.parseInt(stringArray[i]);
}
System.out.println(Arrays.toString(by_new)); // output: [0, 1, -2, 3, -4, -5, 6]
But to convert the string to byte array and then convert that byte array to string, below approach can be used:
// 2. convert string to byte array and then byte array to string
// convert string to byte array
String str2 = "[0, 1, -2, 3, -4, -5, 6]";
byte[] byteStr2 = str2.getBytes(StandardCharsets.UTF_8);
// Now byteStr2 is [91, 48, 44, 32, 49, 44, 32, 45, 50, 44, 32, 51, 44, 32, 45, 52, 44, 32, 45, 53, 44, 32, 54, 93]
// convert byte array to string
System.out.println(new String(byteStr2, StandardCharsets.UTF_8)); // output: [0, 1, -2, 3, -4, -5, 6]
I have also answered the same in the following question:
https://stackoverflow.com/a/70486387/17364272
