I'm loading large UTF-8 text from a SocketChannel, and need to extract some values. Pattern matching with java.util.regex is great for this, but decoding to Java's UTF-16 with CharBuffer cb = UTF_8.decode(buffer); copies this buffer, using double the space.
Is there a way to create a CharBuffer 'view' in UTF-8, or otherwise pattern match with a charset?
You can create a lightweight CharSequence that wraps the ByteBuffer and converts each byte to a char directly, without proper UTF-8 decoding.
As long as your regex contains only Latin-1 characters, it works even on this "naively" converted sequence (ASCII-only patterns are the safest, since bytes 0x80 to 0xFF are parts of multi-byte UTF-8 sequences).
Only the ranges matched by the regex need to be properly decoded from UTF-8.
The code below illustrates this approach.
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.junit.Test;
import org.junit.Assert;
public class RegExSnippet {
private static Charset UTF8 = Charset.forName("UTF8");
@Test
public void testByteBufferRegEx() throws UnsupportedEncodingException {
// this is the UTF-8 byte encoding of the test string
byte[] bytes = ("lkfmd;wmf;qmfqv amwfqwmf;c "
+ "<tag>This is some non ASCII text 'кирилицеский текст'</tag>"
+ "kjnfdlwncdlka-lksnflanvf ").getBytes(UTF8);
ByteBuffer bb = ByteBuffer.wrap(bytes);
ByteSeqWrapper bsw = new ByteSeqWrapper(bb);
// pattern should contain only LATIN1 characters
Matcher m = Pattern.compile("<tag>(.*)</tag>").matcher(bsw);
Assert.assertTrue(m.find());
String body = m.group(1);
// extracted part is properly decoded as UTF8
Assert.assertEquals("This is some non ASCII text 'кирилицеский текст'", body);
}
public static class ByteSeqWrapper implements CharSequence {
final ByteBuffer buffer;
public ByteSeqWrapper(ByteBuffer buf) {
this.buffer = buf;
}
@Override
public int length() {
return buffer.remaining();
}
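@Override
public char charAt(int index) {
// index is relative to the buffer's position, so that
// sub-sequences created by subSequence() read the right bytes
return (char) (0xFF & buffer.get(buffer.position() + index));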
}
@Override
public CharSequence subSequence(int start, int end) {
ByteBuffer bb = buffer.duplicate();
bb.position(bb.position() + start);
bb.limit(bb.position() + (end - start));
return new ByteSeqWrapper(bb);
}
@Override
public String toString() {
// a little hack to apply proper encoding
// to the parts extracted by the matcher;
// decode a duplicate so this buffer's position is left untouched
CharBuffer cb = UTF8.decode(buffer.duplicate());
return cb.toString();
}
}
}
I'm working on a Windows app and need to reuse an auth function from some existing Java code. I have access to the Java source but still can't seem to get it right, probably because of my limited knowledge of cryptography.
The Java functions I need to convert are :
public String getHMACHash(String SharedSecretKey, String TextToHash) {
return base64EncodedStringFromBytes(hmacMD5(SharedSecretKey, TextToHash));
}
private String base64EncodedStringFromBytes(byte[] bArr) {
return Base64.encodeToString(bArr, 2);
}
public byte[] hmacMD5(String SharedSecretKey, String TextToHash) {
byte[] bArr = null;
try {
Mac instance = Mac.getInstance("HmacMD5");
instance.init(new SecretKeySpec(SharedSecretKey.getBytes(), "HmacMD5"));
bArr = instance.doFinal(TextToHash.getBytes());
} catch (NoSuchAlgorithmException e) {
Log.m8401e(TAG, e.getLocalizedMessage());
} catch (InvalidKeyException e2) {
Log.m8401e(TAG, e2.getLocalizedMessage());
}
return bArr;
}
so when inputting the values :
SharedSecretKey = "497n9x98jK06gf7S3T7wJ2k455Qm192Q"
TextToHash = "1502322764327/customerservice.svc/buybackcartPOST8e802a045c1e60e"
the Hash generated is :
pOZNkg077OdvhyeMMPIX2w==
Try as I might I can't get near to the hash key using the same values in VB6. I have tried a few different methods to create the hash :
Private Function hash_HMACMD5(ByVal sTextToHash As String, ByVal sSharedSecretKey As String)
Dim asc As Object, enc As Object
Dim TextToHash() As Byte
Dim SharedSecretKey() As Byte
Set asc = CreateObject("System.Text.UTF8Encoding")
Set enc = CreateObject("System.Security.Cryptography.HMACMD5")
TextToHash = asc.Getbytes_4(sTextToHash)
SharedSecretKey = asc.Getbytes_4(sSharedSecretKey)
enc.Key = SharedSecretKey
Dim bytes() As Byte
bytes = enc.ComputeHash_2((TextToHash))
hash_HMACMD5 = Base64Encode(bytes)
Set asc = Nothing
Set enc = Nothing
End Function
So, I was hoping someone out there might be able to point me in the right direction ?
Thanks In advance for any help.
Potman100
I've traced all the code through, and I can't see anything that would indicate something different is going on. As mentioned below, there is an import line
import android.util.Base64;
The call to create the hash is :
String hMACHash = new MASecurity().getHMACHash(str, str2);
MASecurity Class is :
import android.util.Base64;
import java.io.UnsupportedEncodingException;
import java.security.InvalidKeyException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
public class MASecurity {
private static final String TAG = "MASecurity";
public String getHMACHash(String str, String str2) {
return base64EncodedStringFromBytes(hmacMD5(str, str2));
}
private String base64EncodedStringFromBytes(byte[] bArr) {
return Base64.encodeToString(bArr, 2);
}
public byte[] hmacMD5(String str, String str2) {
byte[] bArr = null;
try {
Mac instance = Mac.getInstance("HmacMD5");
instance.init(new SecretKeySpec(str.getBytes(), "HmacMD5"));
bArr = instance.doFinal(str2.getBytes());
} catch (NoSuchAlgorithmException e) {
MALog.m8401e(TAG, e.getLocalizedMessage());
} catch (InvalidKeyException e2) {
MALog.m8401e(TAG, e2.getLocalizedMessage());
}
return bArr;
}
}
The input values are correct, as they are logged whilst the app is running.
Hope this helps ??
Thanks Alex K. It seems the Java code was adding more data to one of the params, which the debugging I did missed. Once I added the extra data, it creates a valid hash.
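For producing reference values on a desktop JVM to compare against the VB6 port, the Android code above can be approximated as below. This is only a sketch under two assumptions not stated in the question: that the strings are meant to be encoded as UTF-8 (the default charset on Android), and that flag 2 is android.util.Base64.NO_WRAP, for which java.util.Base64.getEncoder() is used as a stand-in. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class HmacMd5Reference {

    // Same steps as getHMACHash in the question, but with an explicit charset
    // (assumption: UTF-8) and java.util.Base64 in place of android.util.Base64.
    public static String hmacMd5Base64(String sharedSecretKey, String textToHash) {
        try {
            Mac mac = Mac.getInstance("HmacMD5");
            byte[] keyBytes = sharedSecretKey.getBytes(StandardCharsets.UTF_8);
            mac.init(new SecretKeySpec(keyBytes, "HmacMD5"));
            byte[] digest = mac.doFinal(textToHash.getBytes(StandardCharsets.UTF_8));
            // The basic encoder emits a single line without line breaks
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            // Surface the failure instead of returning null as the original does
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Key and message taken from the RFC 2202 HMAC-MD5 test cases
        System.out.println(hmacMd5Base64("Jefe", "what do ya want for nothing?"));
    }
}
```

Feeding both implementations the exact bytes that are logged at the call site, rather than the strings one expects to be passed, makes it easier to spot extra data being appended to a parameter.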
I need to encode some data in the Base64 encoding in Java. How do I do that? What is the name of the class that provides a Base64 encoder?
I tried to use the sun.misc.BASE64Encoder class, without success. I have the following line of Java 7 code:
wr.write(new sun.misc.BASE64Encoder().encode(buf));
I'm using Eclipse. Eclipse marks this line as an error. I imported the required libraries:
import sun.misc.BASE64Encoder;
import sun.misc.BASE64Decoder;
But again, both of them are shown as errors. I found a similar post here.
I used Apache Commons as the solution suggested by including:
import org.apache.commons.*;
and importing the JAR files downloaded from: http://commons.apache.org/codec/
But the problem still exists. Eclipse still shows the errors previously mentioned. What should I do?
You need to change the import of your class:
import org.apache.commons.codec.binary.Base64;
And then change your class to use the Base64 class.
Here's some example code:
byte[] encodedBytes = Base64.encodeBase64("Test".getBytes());
System.out.println("encodedBytes " + new String(encodedBytes));
byte[] decodedBytes = Base64.decodeBase64(encodedBytes);
System.out.println("decodedBytes " + new String(decodedBytes));
Then read why you shouldn't use sun.* packages.
Update (2016-12-16)
You can now use java.util.Base64 with Java 8. First, import it as you normally do:
import java.util.Base64;
Then use the Base64 static methods as follows:
byte[] encodedBytes = Base64.getEncoder().encode("Test".getBytes());
System.out.println("encodedBytes " + new String(encodedBytes));
byte[] decodedBytes = Base64.getDecoder().decode(encodedBytes);
System.out.println("decodedBytes " + new String(decodedBytes));
If you directly want to encode string and get the result as encoded string, you can use this:
String encodeBytes = Base64.getEncoder().encodeToString((userName + ":" + password).getBytes());
See Java documentation for Base64 for more.
Use Java 8's never-too-late-to-join-in-the-fun class: java.util.Base64
new String(Base64.getEncoder().encode(bytes));
In Java 8 it can be done as:
Base64.getEncoder().encodeToString(string.getBytes(StandardCharsets.UTF_8))
Here is a short, self-contained complete example:
import java.nio.charset.StandardCharsets;
import java.util.Base64;
public class Temp {
public static void main(String... args) throws Exception {
final String s = "old crow medicine show";
final byte[] authBytes = s.getBytes(StandardCharsets.UTF_8);
final String encoded = Base64.getEncoder().encodeToString(authBytes);
System.out.println(s + " => " + encoded);
}
}
Output:
old crow medicine show => b2xkIGNyb3cgbWVkaWNpbmUgc2hvdw==
You can also convert using Base64 encoding. To do this, you can use the javax.xml.bind.DatatypeConverter#printBase64Binary method.
For example:
byte[] salt = new byte[] { 50, 111, 8, 53, 86, 35, -19, -47 };
System.out.println(DatatypeConverter.printBase64Binary(salt));
With Guava
pom.xml:
<dependency>
<artifactId>guava</artifactId>
<groupId>com.google.guava</groupId>
<type>jar</type>
<version>14.0.1</version>
</dependency>
Sample code:
// encode
String s = "Hello Việt Nam";
String base64 = BaseEncoding.base64().encode(s.getBytes("UTF-8"));
// decode
System.out.println("Base64:" + base64); // SGVsbG8gVmnhu4d0IE5hbQ==
byte[] bytes = BaseEncoding.base64().decode(base64);
System.out.println("Decoded: " + new String(bytes, "UTF-8")); // Hello Việt Nam
Eclipse gives you an error/warning because you are trying to use internal classes that are specific to a JDK vendor and not part of the public API. Jakarta Commons provides its own implementation of Base64 codecs, which of course reside in a different package. Delete those imports and let Eclipse import the proper Commons classes for you.
For Java 6-7, the best option is to borrow code from the Android repository. It has no dependencies.
https://github.com/android/platform_frameworks_base/blob/master/core/java/android/util/Base64.java
Java 8 does contain its own implementation of Base64. However, I found one slightly disturbing difference. To illustrate, I will provide a code example:
My codec wrapper:
public interface MyCodec
{
static String apacheDecode(String encodedStr)
{
return new String(Base64.decodeBase64(encodedStr), Charset.forName("UTF-8"));
}
static String apacheEncode(String decodedStr)
{
byte[] decodedByteArr = decodedStr.getBytes(Charset.forName("UTF-8"));
return Base64.encodeBase64String(decodedByteArr);
}
static String javaDecode(String encodedStr)
{
return new String(java.util.Base64.getDecoder().decode(encodedStr), Charset.forName("UTF-8"));
}
static String javaEncode(String decodedStr)
{
byte[] decodedByteArr = decodedStr.getBytes(Charset.forName("UTF-8"));
return java.util.Base64.getEncoder().encodeToString(decodedByteArr);
}
}
Test Class:
public class CodecDemo
{
public static void main(String[] args)
{
String decodedText = "Hello World!";
String encodedApacheText = MyCodec.apacheEncode(decodedText);
String encodedJavaText = MyCodec.javaEncode(decodedText);
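// Note: these two lines encode the already-encoded text a second time,
// so the first two output lines show Base64 of the Base64 text
System.out.println("Apache encoded text: " + MyCodec.apacheEncode(encodedApacheText));
System.out.println("Java encoded text: " + MyCodec.javaEncode(encodedJavaText));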
System.out.println("Encoded results equal: " + encodedApacheText.equals(encodedJavaText));
System.out.println("Apache decode Java: " + MyCodec.apacheDecode(encodedJavaText));
System.out.println("Java decode Java: " + MyCodec.javaDecode(encodedJavaText));
System.out.println("Apache decode Apache: " + MyCodec.apacheDecode(encodedApacheText));
System.out.println("Java decode Apache: " + MyCodec.javaDecode(encodedApacheText));
}
}
OUTPUT:
Apache encoded text: U0dWc2JHOGdWMjl5YkdRaA0K
Java encoded text: U0dWc2JHOGdWMjl5YkdRaA==
Encoded results equal: false
Apache decode Java: Hello World!
Java decode Java: Hello World!
Apache decode Apache: Hello World!
Exception in thread "main" java.lang.IllegalArgumentException: Illegal base64 character d
at java.util.Base64$Decoder.decode0(Base64.java:714)
at java.util.Base64$Decoder.decode(Base64.java:526)
at java.util.Base64$Decoder.decode(Base64.java:549)
Notice that the Apache-encoded text contains an additional line break (CR LF) at the end. (This is the chunking behaviour of encodeBase64String in Commons Codec 1.4; from 1.5 on the method returns a single unchunked line.) Therefore, in order for my codec to yield the same result regardless of Base64 implementation, I had to call trim() on the Apache-encoded text. In my case, I simply added that call to my codec's apacheEncode() as follows:
return Base64.encodeBase64String(decodedByteArr).trim();
Once this change was made, the results are what I expected to begin with:
Apache encoded text: U0dWc2JHOGdWMjl5YkdRaA==
Java encoded text: U0dWc2JHOGdWMjl5YkdRaA==
Encoded results equal: true
Apache decode Java: Hello World!
Java decode Java: Hello World!
Apache decode Apache: Hello World!
Java decode Apache: Hello World!
CONCLUSION: If you want to switch from Apache Base64 to Java, you must:
Decode encoded text with your Apache decoder.
Encode resulting (plain) text with Java.
If you switch without following these steps, most likely you will run into problems. That is how I made this discovery.
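If the stored text cannot be re-encoded, another option on the Java side is the MIME decoder, which skips characters outside the Base64 alphabet such as CR and LF. The following is a minimal sketch of that idea, not part of the codec above; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class LenientBase64Demo {

    // Decodes Base64 text that may contain line breaks (as produced by chunking encoders)
    public static String decodeLenient(String encoded) {
        byte[] raw = Base64.getMimeDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Returns true if the strict (basic) decoder refuses the input
    public static boolean basicDecoderRejects(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "Hello World!" encoded, followed by the CR LF that a chunking encoder appends
        String chunked = "SGVsbG8gV29ybGQh\r\n";
        System.out.println("Basic decoder rejects input: " + basicDecoderRejects(chunked));
        System.out.println("MIME decoder result: " + decodeLenient(chunked));
    }
}
```

The trade-off is that the MIME decoder silently ignores any stray character, so it is less suitable where malformed input should be reported.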
To convert this, you need an encoder & decoder which you will get from Base64Coder - an open-source Base64 encoder/decoder in Java. It is file Base64Coder.java you will need.
Now to access this class as per your requirement you will need the class below:
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
public class Base64 {
public static void main(String args[]) throws IOException {
/*
* if (args.length != 2) {
* System.out.println(
* "Command line parameters: inputFileName outputFileName");
* System.exit(9);
* } encodeFile(args[0], args[1]);
*/
File sourceImage = new File("back3.png");
File sourceImage64 = new File("back3.txt");
File destImage = new File("back4.png");
encodeFile(sourceImage, sourceImage64);
decodeFile(sourceImage64, destImage);
}
private static void encodeFile(File inputFile, File outputFile) throws IOException {
BufferedInputStream in = null;
BufferedWriter out = null;
try {
in = new BufferedInputStream(new FileInputStream(inputFile));
out = new BufferedWriter(new FileWriter(outputFile));
encodeStream(in, out);
out.flush();
}
finally {
if (in != null)
in.close();
if (out != null)
out.close();
}
}
private static void encodeStream(InputStream in, BufferedWriter out) throws IOException {
int lineLength = 72;
byte[] buf = new byte[lineLength / 4 * 3];
while (true) {
int len = in.read(buf);
if (len <= 0)
break;
out.write(Base64Coder.encode(buf, 0, len));
out.newLine();
}
}
static String encodeArray(byte[] in) throws IOException {
StringBuffer out = new StringBuffer();
out.append(Base64Coder.encode(in, 0, in.length));
return out.toString();
}
static byte[] decodeArray(String in) throws IOException {
byte[] buf = Base64Coder.decodeLines(in);
return buf;
}
private static void decodeFile(File inputFile, File outputFile) throws IOException {
BufferedReader in = null;
BufferedOutputStream out = null;
try {
in = new BufferedReader(new FileReader(inputFile));
out = new BufferedOutputStream(new FileOutputStream(outputFile));
decodeStream(in, out);
out.flush();
}
finally {
if (in != null)
in.close();
if (out != null)
out.close();
}
}
private static void decodeStream(BufferedReader in, OutputStream out) throws IOException {
while (true) {
String s = in.readLine();
if (s == null)
break;
byte[] buf = Base64Coder.decodeLines(s);
out.write(buf);
}
}
}
In Android you can convert your bitmap to Base64 for Uploading to a server or web service.
Bitmap bmImage = //Data
ByteArrayOutputStream baos = new ByteArrayOutputStream();
bmImage.compress(Bitmap.CompressFormat.JPEG, 100, baos);
byte[] imageData = baos.toByteArray();
String encodedImage = Base64.encodeArray(imageData);
This “encodedImage” is a text representation of your image. You can use it either for uploading or for displaying directly in an HTML page as below (reference):
<img alt="" src="data:image/png;base64,<?php echo $encodedImage; ?>" width="100px" />
<img alt="" src="...........1f/9k=" width="100px" />
Documentation: http://dwij.co.in/java-base64-image-encoder
On Android, use the static methods of the android.util.Base64 utility class. The referenced documentation says that the Base64 class was added in API level 8 (Android 2.2 (Froyo)).
import android.util.Base64;
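// android.util.Base64 methods take a flags argument; DEFAULT is the standard behaviour
byte[] encodedBytes = Base64.encode("Test".getBytes(), Base64.DEFAULT);
Log.d("tag", "encodedBytes " + new String(encodedBytes));
byte[] decodedBytes = Base64.decode(encodedBytes, Base64.DEFAULT);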
Log.d("tag", "decodedBytes " + new String(decodedBytes));
Apache Commons has a nice implementation of Base64. You can do this as simply as:
// Encode data on your side using Base64 (this is an encoding, not encryption)
byte[] bytesEncoded = Base64.encodeBase64(str.getBytes());
System.out.println("Encoded value is " + new String(bytesEncoded));
// Decode the data on the other side
byte[] valueDecoded = Base64.decodeBase64(bytesEncoded);
System.out.println("Decoded value is " + new String(valueDecoded));
You can find more details about base64 encoding at Base64 encoding using Java and JavaScript.
If you are using Spring Framework at least version 4.1, you can use the org.springframework.util.Base64Utils class:
byte[] raw = { 1, 2, 3 };
String encoded = Base64Utils.encodeToString(raw);
byte[] decoded = Base64Utils.decodeFromString(encoded);
It will delegate to Java 8's Base64, Apache Commons Codec, or JAXB DatatypeConverter, depending on what is available.
Simple example with Java 8:
import java.util.Base64;
String str = "your string";
String encodedStr = Base64.getEncoder().encodeToString(str.getBytes("utf-8"));
In Java 7 I coded this method
import javax.xml.bind.DatatypeConverter;
public static String toBase64(String data) {
return DatatypeConverter.printBase64Binary(data.getBytes());
}
If you are stuck on a version of Java earlier than 8 but are already using the AWS SDK for Java, you can use com.amazonaws.util.Base64.
I tried with the following code snippet. It worked well. :-)
// note: this is a JDK-internal class; encode() takes the bytes to encode
com.sun.org.apache.xml.internal.security.utils.Base64.encode("The string to encode goes here".getBytes());
public String convertImageToBase64(String filePath) {
byte[] fileContent = new byte[0];
String base64encoded = null;
try {
fileContent = FileUtils.readFileToByteArray(new File(filePath));
} catch (IOException e) {
log.error("Error reading file: {}", filePath);
}
try {
base64encoded = Base64.getEncoder().encodeToString(fileContent);
} catch (Exception e) {
log.error("Error encoding the image to base64", e);
}
return base64encoded;
}
GZIP + Base64
A Base64-encoded string is longer than the original: about 133% of its size on average. So it can make sense to compress the data with GZIP first and then encode it to Base64. In the example below the GZIP+Base64 result is about 77% of the original size; this pays off for strings longer than about 200 characters, and the gain depends on how compressible the input is. Example:
public static void main(String[] args) throws IOException {
byte[] original = randomString(100).getBytes(StandardCharsets.UTF_8);
byte[] base64 = encodeToBase64(original);
byte[] gzipToBase64 = encodeToBase64(encodeToGZIP(original));
byte[] fromBase64 = decodeFromBase64(base64);
byte[] fromBase64Gzip = decodeFromGZIP(decodeFromBase64(gzipToBase64));
// test
System.out.println("Original: " + original.length + " bytes, 100%");
System.out.println("Base64: " + base64.length + " bytes, "
+ (base64.length * 100 / original.length) + "%");
System.out.println("GZIP+Base64: " + gzipToBase64.length + " bytes, "
+ (gzipToBase64.length * 100 / original.length) + "%");
//Original: 3700 bytes, 100%
//Base64: 4936 bytes, 133%
//GZIP+Base64: 2868 bytes, 77%
System.out.println(Arrays.equals(original, fromBase64)); // true
System.out.println(Arrays.equals(original, fromBase64Gzip)); // true
}
public static byte[] decodeFromBase64(byte[] arr) {
return Base64.getDecoder().decode(arr);
}
public static byte[] encodeToBase64(byte[] arr) {
return Base64.getEncoder().encode(arr);
}
public static byte[] decodeFromGZIP(byte[] arr) throws IOException {
ByteArrayInputStream bais = new ByteArrayInputStream(arr);
GZIPInputStream gzip = new GZIPInputStream(bais);
return gzip.readAllBytes();
}
public static byte[] encodeToGZIP(byte[] arr) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(baos);
gzip.write(arr);
gzip.finish();
return baos.toByteArray();
}
public static String randomString(int count) {
StringBuilder str = new StringBuilder();
for (int i = 0; i < count; i++) {
str.append(" ").append(UUID.randomUUID().toString());
}
return str.toString();
}
See also: How to get the JAR file for sun.misc.BASE64Encoder class?
Add this library to your app-level dependencies (the Base64 class lives in Commons Codec, not Commons Collections):
implementation 'commons-codec:commons-codec:1.15'
I am currently trying to get Java to generate the same hash for a string as PHP's hash algorithm does.
I have come close enough:
hash('sha512', 'password');
outputs:
b109f3bbbc244eb82441917ed06d618b9008dd09b3befd1b5e07394c706a8bb980b1d7785e5976ec049b46df5f1326af5a2ea6d103fd07c95385ffab0cacbc86
Java code:
public static void main(String[] args) {
hash("password");
}
private static String hash(String salted) {
byte[] digest;
try {
MessageDigest mda = MessageDigest.getInstance("SHA-512");
digest = mda.digest(salted.getBytes("UTF-8"));
} catch (Exception e) {
digest = new byte[]{};
}
String str = "";
for (byte aDigest : digest) {
str += String.format("%02x", 0xFF & aDigest);
}
return str;
}
This outputs the same.
My problem is when I use the third argument within PHP's hash function. On PHP's site it's described as following:
raw_output
When set to TRUE, outputs raw binary data. FALSE outputs lowercase hexits.
I am not quite sure how to implement this extra parameter. I think mainly my question would be, how do I convert a String object into a binary String object? Currently, running it with PHP generates the following: http://sandbox.onlinephpfunctions.com/code/a1bd9b399b3ac0c4db611fe748998f18738d19e3
This should reproduce the outcome from your link:
String strBinary = null;
try {
strBinary = new String(digest, "UTF-8");
} catch (UnsupportedEncodingException e) {
}
and you'll need these imports at the top of your file:
import java.nio.charset.Charset;
import java.io.UnsupportedEncodingException;
I hope I understood your issue correctly.
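Put differently, the byte array returned by MessageDigest.digest() already is the raw binary output that PHP produces with raw_output set to TRUE; the hex string is just one rendering of those bytes. A minimal sketch of the two forms follows (the class and method names are illustrative, not from the question):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha512Demo {

    // Counterpart of PHP's hash('sha512', $s, true): the 64 raw digest bytes
    public static byte[] sha512Raw(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            return md.digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-512 is a standard algorithm name, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    // Counterpart of PHP's default output: lowercase hex digits
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = sha512Raw("password");
        System.out.println(raw.length + " raw bytes");
        System.out.println(toHex(raw));
    }
}
```

If the raw bytes have to travel inside a String, ISO-8859-1 maps every byte to exactly one char without loss, whereas decoding arbitrary digest bytes as UTF-8 replaces invalid sequences and cannot be reversed.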
Given the following code:
String tmp = new String("\\u0068\\u0065\\u006c\\u006c\\u006f\\u000a");
String result = convertToEffectiveString(tmp); // result contain now "hello\n"
Does the JDK already provide some classes for doing this ?
Is there a library that does this? (preferably available through Maven)
I have tried with ByteArrayOutputStream with no success.
This works, but only with ASCII. If you use Unicode characters outside of the ASCII range, you will have problems, because each escape is cast to a single byte while UTF-8 encodes any code point above \u007f as two or more bytes. The cast below is safe only because the input is guaranteed to be ASCII (as you mention in your comments).
package sample;
import java.io.UnsupportedEncodingException;
public class UnicodeSample {
public static final int HEXADECIMAL = 16;
public static void main(String[] args) {
try {
String str = "\\u0068\\u0065\\u006c\\u006c\\u006f\\u000a";
String arr[] = str.replaceAll("\\\\u"," ").trim().split(" ");
byte[] utf8 = new byte[arr.length];
int index=0;
for (String ch : arr) {
utf8[index++] = (byte)Integer.parseInt(ch,HEXADECIMAL);
}
String newStr = new String(utf8, "UTF-8");
System.out.println(newStr);
}
catch (UnsupportedEncodingException e) {
// handle the UTF-8 conversion exception
}
}
}
Here is another solution that is intended to lift the ASCII-only restriction by splitting each code into its individual bytes. Be aware that it hands the bytes of the UTF-16 code unit to the UTF-8 decoder unchanged, so escapes above \u007f (such as \u3fff in the sample) do not in general decode to the intended character. Thanks to deceze for the questions. You made me think more about the problem and solution.
package sample;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
public class UnicodeSample {
public static final int HEXADECIMAL = 16;
public static void main(String[] args) {
try {
String str = "\\u0068\\u0065\\u006c\\u006c\\u006f\\u000a\\u3fff\\uf34c";
ArrayList<Byte> arrList = new ArrayList<Byte>();
String codes[] = str.replaceAll("\\\\u"," ").trim().split(" ");
for (String c : codes) {
int code = Integer.parseInt(c,HEXADECIMAL);
byte[] bytes = intToByteArray(code);
for (byte b : bytes) {
if (b != 0) arrList.add(b);
}
}
byte[] utf8 = new byte[arrList.size()];
for (int i=0; i<arrList.size(); i++) utf8[i] = arrList.get(i);
str = new String(utf8, "UTF-8");
System.out.println(str);
}
catch (UnsupportedEncodingException e) {
// handle the UTF-8 conversion exception
}
}
// Takes a 4-byte integer and extracts each byte
public static final byte[] intToByteArray(int value) {
return new byte[] {
(byte) (value >>> 24),
(byte) (value >>> 16),
(byte) (value >>> 8),
(byte) (value)
};
}
}
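Since a Java char is itself a UTF-16 code unit, the byte round-trip can be avoided altogether by converting each escape directly to a char. The following is a sketch of that alternative, not the code above; the class name, method name and pattern are chosen for illustration, and only \uXXXX escapes with exactly four hex digits are handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescape {

    // A literal backslash, the letter u, then exactly four hex digits
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    public static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Each escape denotes one UTF-16 code unit, which is what a Java char holds
            char unit = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against '\' and '$' in the replacement text
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(unit)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String tmp = "\\u0068\\u0065\\u006c\\u006c\\u006f\\u000a";
        System.out.print(unescape(tmp));
    }
}
```

Text that is not part of an escape is copied through unchanged, and a surrogate pair written as two consecutive escapes ends up as two chars that together form one code point.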
Firstly, are you just trying to parse a string literal, or is tmp going to be some user-entered data?
If this is going to be a string literal (i.e. hard-coded string), it can be encoded using Unicode escapes. In your case, this just means using single backslashes instead of double backslashes:
String result = "\u0068\u0065\u006c\u006c\u006f\u000a";
If, however, you need to use Java's string parsing rules to parse user input, a good starting point might be Apache Commons Lang's StringEscapeUtils.unescapeJava() method.
I'm sure there must be a better way, but using just the JDK:
public static String handleEscapes(final String s)
{
final java.util.Properties props = new java.util.Properties();
props.setProperty("foo", s);
final java.io.ByteArrayOutputStream baos = new java.io.ByteArrayOutputStream();
try
{
props.store(baos, null);
final String tmp = baos.toString().replace("\\\\", "\\");
props.load(new java.io.StringReader(tmp));
}
catch(final java.io.IOException ioe) // shouldn't happen
{ throw new RuntimeException(ioe); }
return props.getProperty("foo");
}
This uses java.util.Properties.load(java.io.Reader) to process the backslash-escapes (after first using java.util.Properties.store(java.io.OutputStream, java.lang.String) to backslash-escape anything that would cause problems in a properties file, and then using replace("\\\\", "\\") to reverse the backslash-escaping of the original backslashes).
(Disclaimer: even though I tested all the cases I could think of, there are still probably some that I didn't think of.)