Java equivalent of Ruby's unpack('m0') and unpack('H*')

I'm trying to understand this line of ruby code:
token.unpack('m0').first.unpack('H*').first
which is converting
R1YKdH//cZKubZlA09ZIVZQ5/cxInvmokIACnl3MKJ0=
to
47560a747fff7192ae6d9940d3d648559439fdcc489ef9a89080029e5dcc289d
As far as I understand it, this is a base64-to-hex conversion, but when I try to do the same thing myself, the result does not match the converted value above.
I need to implement the same functionality in Java.

So I'm going to break this down. The first step is token.unpack('m0'). According to Idiosyncratic Ruby, unpack('m0') decodes base64, similarly to the built-in Base64 library's Base64.decode64(string) function (the 0 makes the decoding strict, like Base64.strict_decode64). But unpack returns an array here, with only one element: the decoded data. So we use token.unpack('m0').first to get the first (and in this case the only) element of the array returned by token.unpack('m0'). If this were all, then you'd be correct to say that it's just base64. But the decoded data is unpacked again, this time with 'H*', which converts every byte to two hex digits, high nibble first. And finally, because that also returns an array, you use first again to get a single string.
So in summary, what is happening is that first your string is being decoded from base64 to a string, then it's being converted to hex.
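Since the question asks for the same thing in Java, here is a minimal sketch of the two steps, assuming Java 8 or later so that java.util.Base64 is available. The class and method names (Base64ToHex, base64ToHex) are made up for illustration.

```java
import java.util.Base64;

class Base64ToHex {
    // Decode a Base64 string, then render each byte as two lowercase hex
    // digits (high nibble first), mirroring unpack('m0') then unpack('H*').
    static String base64ToHex(String token) {
        byte[] raw = Base64.getDecoder().decode(token);
        StringBuilder hex = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            // Mask with 0xff so negative bytes are not sign-extended.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(base64ToHex("R1YKdH//cZKubZlA09ZIVZQ5/cxInvmokIACnl3MKJ0="));
    }
}
```

A common reason for a mismatch is hex-encoding the characters of the Base64 text itself instead of the decoded bytes, or forgetting the & 0xff mask.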

Related

Getting UTF8 strings directly from the web service call without conversion to String

I'm using CXF to implement a web services server.
Since I'm low on memory, I don't want the web service call parameters to be translated into Java strings, which are UTF-16. I would rather access the original UTF-8 buffers, which are usually half the size in my case.
So if I have a web method:
void addBook(String bookText)
How can I get the bookText without CXF translating it to java string?
The XML parsers used in Java (StAX parsers for CXF) only allow getting the XML contents as either a String or char[]. Thus, it wouldn't be possible to get the raw bytes.
If you have a String object in java, there is no such thing as whether it is UTF-8 or UTF-16 string. The encoding comes in when you convert a String to or from a byte array.
A String in java is a character array. If you already have a String object in java (for example passed as a parameter to your addBook() method), it has already been interpreted properly and converted to a character array.
If you want to avoid character encoding conversions, the only way to do that is to define your method to receive a byte array instead of a String:
void addBook(byte[] bookTextUtf8);
But keep in mind that in this way you have to "remember" the encoding in which the byte array is valid (adding it to the name is one way).
If you need a java.lang.String object, then there is nothing you can do. A String is a character array, with each character being a 16-bit value. This is internal to String, and there is no way to change the internal representation. Either accept this or don't use java.lang.String to represent your strings.
An alternative could be to create your own class, Text for example, which holds the UTF-8 encoded byte array. As long as you don't need the String representation, keep the data as a byte array and store it if you want to. Only create the java.lang.String instance when you do need the String.
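A rough sketch of such a wrapper is shown below. The class name Utf8Text and its methods are hypothetical, and it does not address how to get CXF to hand over raw bytes in the first place; it only illustrates keeping the UTF-8 bytes and decoding lazily.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical wrapper: keeps text as UTF-8 bytes and only builds a
// java.lang.String when one is explicitly requested.
final class Utf8Text {
    private final byte[] utf8;

    private Utf8Text(byte[] utf8) {
        this.utf8 = utf8;
    }

    // The caller promises the bytes are valid UTF-8; they are copied defensively.
    static Utf8Text ofUtf8Bytes(byte[] bytes) {
        return new Utf8Text(Arrays.copyOf(bytes, bytes.length));
    }

    int byteLength() {
        return utf8.length;
    }

    byte[] toBytes() {
        return Arrays.copyOf(utf8, utf8.length);
    }

    // Decoding happens here, and only here.
    @Override
    public String toString() {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Utf8Text text = Utf8Text.ofUtf8Bytes("Moby-Dick".getBytes(StandardCharsets.UTF_8));
        System.out.println(text.byteLength() + " bytes: " + text);
    }
}
```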

Convert from unrecognised character to normal form

I have an error with my file: all the characters look like "Giá»âºi tính". I want to use Java to write a program that converts those characters to normal ones. I have tried converting them to bytes and then back to a String, but the text remained the same.
You need to know the encoding of the file in order to do this. Java internally represents all Strings as UTF-16, so the bytes of the file must be decoded with the file's actual encoding when you read it: http://goo.gl/PoBgo (Java API Docs)
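To make the effect concrete, here is a small sketch. It assumes the data is really UTF-8 and was decoded once as ISO-8859-1; the actual encodings in the question are not known, so treat the charsets as placeholders. The class MojibakeDemo and its methods are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode raw file bytes with an explicitly chosen charset.
    static String decode(byte[] fileBytes, Charset charset) {
        return new String(fileBytes, charset);
    }

    // Undo one round of "UTF-8 bytes decoded as ISO-8859-1".
    // ISO-8859-1 maps every byte to exactly one char, so this step is lossless.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String correct = "Gi\u1edbi t\u00ednh"; // the Vietnamese text "Giới tính"
        byte[] fileBytes = correct.getBytes(StandardCharsets.UTF_8);

        String wrong = decode(fileBytes, StandardCharsets.ISO_8859_1); // garbled
        String right = decode(fileBytes, StandardCharsets.UTF_8);      // intact

        System.out.println(wrong.equals(correct));         // prints "false"
        System.out.println(right.equals(correct));         // prints "true"
        System.out.println(repair(wrong).equals(correct)); // prints "true"
    }
}
```

If the text was garbled more than once, or through a charset such as Windows-1252 that leaves some bytes undefined, a repair like this may not be possible and the file has to be re-read from its original bytes.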

How to get 'original' bytes of a Java String when read from DataOutputStream.writeUTF()?

Currently I'm transferring a String across the network, using DataInputStream and DataOutputStream. The String I am transferring needs to be converted into a byte array so that it can be decrypted.
However, since the string was written using DataOutputStream.writeUTF("foobar"), its byte array contains data encoded as Java modified UTF-8, which breaks the decryption process.
How can I get the original bytes from the Java modified UTF-8 String?
Unicode has several variants, where s-with-^ can either be one character or two: s plus combining-^. Java has a Normalizer class to convert to one specific variant.
See http://docs.oracle.com/javase/tutorial/i18n/text/normalizerapi.html
or look immediately at the API.
This requires that the original string adheres to one variant. One cannot take arbitrary bytes and then interpret them as UTF-8, because some byte sequences are illegal. This was done to prevent recognizing a wrong byte/character when starting in the middle of a byte sequence.
String normalizedString = Normalizer.normalize(s, Normalizer.Form.NFD);
What if you write your string as byte[] and read it as byte[] using http://docs.oracle.com/javase/1.4.2/docs/api/java/io/DataOutputStream.html#write(byte[], int, int)
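One way to do that is sketched below: send the raw bytes with a length prefix instead of going through writeUTF. The length-prefix framing and the names RawBytesTransfer, writeBytes, readBytes and roundTrip are assumptions made for this example, not something from the question's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RawBytesTransfer {
    // Write the length first so the reader knows how many bytes follow.
    static void writeBytes(DataOutputStream out, byte[] data) throws IOException {
        out.writeInt(data.length);
        out.write(data, 0, data.length);
    }

    // Read the length, then exactly that many bytes.
    static byte[] readBytes(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] data = new byte[length];
        in.readFully(data); // blocks until all bytes have arrived
        return data;
    }

    // In-memory round trip standing in for the network connection.
    static byte[] roundTrip(byte[] data) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeBytes(new DataOutputStream(buffer), data);
            return readBytes(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] encrypted = {0x00, (byte) 0xC0, (byte) 0x80, (byte) 0xFF};
        System.out.println(java.util.Arrays.equals(encrypted, roundTrip(encrypted))); // prints "true"
    }
}
```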

Why does the Blowfish output in Java and PHP differ by only 2 chars?

I have a Blowfish encryption script in PHP and a matching one in Java that were working fine until today, when I came across a problem.
The same content is encrypted differently in Java and PHP, and the outputs differ by only 2 chars, which is really weird.
PHP
wTHzxfxLHdMm/JMFnoh0hciS/JADvFFg
Java
wTHzxfxLHdMm/JMFnoh0hciS/D8DvFFg
-------------------------^^
As you can see, those two positions do not match. Unfortunately the value is a real email address and I can't share it. Also, I was not able to reproduce the problem with the few other values I've tested. I've tried changing the Base64 encoder class on the Java side, and that did not help either.
The source code for PHP is here, and for Java is here.
What could I do to resolve this problem?
Let's have a look at your Java code:
String c = new String(Test.encrypt((new String("thevalue")).getBytes(),
(new String("mykey")).getBytes()));
...
System.out.println("Base64 encoded String:" +
new sun.misc.BASE64Encoder().encode(c.getBytes()));
What you are doing here is:
1. convert the plaintext string to bytes, using the system's default encoding
2. convert the key to bytes, using the system's default encoding
3. encrypt the bytes
4. convert the encrypted bytes back to a string, using the system's default encoding
5. convert the encrypted string back to bytes, using the system's default encoding
6. encode these encrypted bytes using Base64.
The problem is in step 4. It assumes that an arbitrary byte array represents a string in your system's default encoding, and that encoding this string back gives the same byte[]. This holds for some encodings (the ISO-8859 series, for example), but not for others. In Java, when a byte (or byte sequence) is not valid in the given encoding, it is decoded to a replacement character, which is then mapped to byte 63 (ASCII ?) when the string is encoded again. Actually, the documentation even says:
The behavior of this constructor when the given bytes are not valid in the default charset is unspecified.
In your case, there is no reason to do this at all - simply use the bytes which your encrypt method outputs directly to convert them to Base64.
byte[] encrypted = Test.encrypt("thevalue".getBytes(),
"mykey".getBytes());
System.out.println("Base64 encoded String:"+ new sun.misc.BASE64Encoder().encode(encrypted));
(Also note that I removed the superfluous new String("...") constructor calls here, though this does not relate to your problem.)
The point to remember: Never ever convert an arbitrary byte[], which did not come from encoding a string, to a string. Output of an encryption algorithm (and most other cryptographic algorithms, except decryption) certainly belongs to the category of data which should not be converted to a string.
And never ever use the System's default encoding, if you want portable programs.
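The effect can be reproduced in isolation. In the sketch below, the three bytes fc 90 03 are what the PHP block /JAD decodes to, while the Java block /D8D decodes to fc 3f 03, where 3f is exactly byte 63. UTF-8 is used here only as an example of a charset in which these bytes are not valid text (the question's actual default charset is unknown), and java.util.Base64 (Java 8+) stands in for sun.misc.BASE64Encoder. The class name CiphertextRoundTrip is made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

class CiphertextRoundTrip {
    // Steps 4 and 5 from the list above: bytes -> String -> bytes.
    static byte[] throughString(byte[] bytes) {
        String asText = new String(bytes, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // The safe path: Base64-encode the raw bytes directly.
    static String toBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Stand-in for three bytes of cipher text; not valid UTF-8.
        byte[] cipher = {(byte) 0xfc, (byte) 0x90, 0x03};

        System.out.println(toBase64(cipher));                           // prints "/JAD"
        System.out.println(Arrays.equals(cipher, throughString(cipher))); // prints "false"
    }
}
```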
Your code seems right to me.
It looks like you have a trailing white space in the input to one of these programs, and it is only one. I'll tell you why:
Each of these 4-char blocks represents 3 characters in the encrypted string. The differing part (JA and D8 in the 7th block) actually comes from a single different character.
wTHz xfxL HdMm /JMF noh0 hciS /JAD vFFg
wTHz xfxL HdMm /JMF noh0 hciS /D8D vFFg
If I have got it right your email address is 19 characters long. The 20th character in one of your input strings is a white space.
Question: Have you tried the associated PHP decryption library to decrypt the PHP generated encrypted text? Have you tried the associated JAVA decryption library to decrypt the JAVA encrypted text?
If both produce differing outputs, then one MUST fail decrypting.
Is that one PHP, or Java?
Whichever one it is -- I would try to duplicate another such failure with a publicly shareable string... give that string as a unit test -- to the developer or developers that created the encrypt/decrypt code in the language that the round-trip encrypt/decrypt fails in.
Then... wait for them to fix it.
Not sure of any faster solutions -- except maybe change encryption/decryption library providers... or roll your own...

Need help removing strange characters from string

Currently when I make a signature using java.security.Signature, it passes back a string.
I can't seem to use this string since there are special characters that can only be seen when I copy the string into Notepad++; if I remove these special characters there, I can use the rest of the string in my program.
In Notepad++ they look like black boxes with the words ACK GS STX SI SUB ETB BS VT
I don't really understand what they are, so it's hard to tell how to get rid of them.
Is there a function that I can run to remove these and potentially similar characters?
When I use the Base64 class supplied in the posts, I can't get back to a signature:
System.out.println(signature);
String base64 = Base64.encodeBytes(sig);
System.out.println(base64);
String sig2 = new String (Base64.decode(base64));
System.out.println(sig2);
gives the output
”zÌý¥y]žd”xKmËY³ÕN´Ìå}ÏBÊNÈ›`Αrp~jÖüñ0…Rõ…•éh?ÞÀ_û_¥ÂçªsÂk{6H7œÉ/”âtTK±Ï…Ã/Ùê²
lHrM/aV5XZ5klHhLbctZs9VOtMzlfc9Cyk7Im2DOkXJwfmoG1vzxMIVS9YWV6Wg/HQLewF/7X6XC56pzwmt7DzZIN5zJL5TidFRLsc+Fwy/Z6rIaNA2uVlCh3XYkWcu882tKt2RySSkn1heWhG0IeNNfopAvbmHDlgszaWaXYzY=
[B@15356d5
The odd characters are there because cryptographic signatures produce bytes rather than strings. Consequently if you want a printable representation you should Base64 encode it (here's a public domain implementation for Java).
Stripping the non-printing characters from a cryptographic signature will render it useless as you will be unable to use it for verification.
Update:
[B@15356d5
This is the result of toString called on a byte array. "[" means array, "B" means byte and "15356d5" is the array's identity hash code in hex (it says nothing about the contents). You should be passing the array you get out of decode to [Signature.verify](http://java.sun.com/j2se/1.4.2/docs/api/java/security/Signature.html#verify(byte[])).
Something like:
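Signature sig = Signature.getInstance("DSA"); // Signature has no public constructor
sig.initVerify(key); // key is the signer's PublicKey
sig.update(data); // data is the byte[] that was originally signed
boolean valid = sig.verify(Base64.decode(base64)); // <-- bytes go here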
How are you "making" the signature? If you use the sign method, you get back a byte array, not a string. That's not a binary representation of some text, it's just arbitrary binary data. That's what you should use, and if you need to convert it into a string you should use a base64 conversion to avoid data corruption.
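As a rough end-to-end sketch of that advice: sign, Base64-encode the signature bytes for transport, decode them again, and verify. The RSA key pair, the SHA256withRSA algorithm and the class name SignatureBase64Demo are assumptions chosen for the example (the thread itself mentions DSA), and java.util.Base64 (Java 8+) replaces the third-party Base64 class used in the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;
import java.util.Base64;

class SignatureBase64Demo {
    // Returns true if a signature survives a Base64 round trip and still verifies.
    static boolean signAndVerify(byte[] message) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair pair = generator.generateKeyPair();

            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(pair.getPrivate());
            signer.update(message);
            byte[] signatureBytes = signer.sign(); // arbitrary binary data, not text

            // Printable form for logs, files or text protocols.
            String printable = Base64.getEncoder().encodeToString(signatureBytes);
            // Back to the exact same bytes; never go through new String(bytes).
            byte[] decoded = Base64.getDecoder().decode(printable);

            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(pair.getPublic());
            verifier.update(message);
            return verifier.verify(decoded);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(signAndVerify("hello".getBytes(StandardCharsets.UTF_8)));
    }
}
```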
If I understand your problem correctly, you need to get rid of characters with code below 32, except maybe char 9 (tab), char 10 (new line) and char 13 (return).
Edit: I agree with the others as handling a crypto output like this is not what you usually want.
