Convert String to bytes, and back again - java

I have a string cityName which I encoded into bytes as follows:
byte[] cityBytes = cityName.getBytes("UTF-8");
...and stored the bytes somewhere. When I retrieve those bytes, how can I decode them back into a string?

Use the String(byte[], Charset) or String(byte[], String) constructor.
byte[] rawBytes = /* whatevs */;
// best, if you're using Java 7 or later (thanks to @ColinD); throws no checked exception:
String decoded = new String(rawBytes, StandardCharsets.UTF_8);
// or, equivalently:
// String decoded = new String(rawBytes, Charset.forName("UTF-8"));
try
{
// or, by charset name, which can throw the checked UnsupportedEncodingException:
String decodedByName = new String(rawBytes, "UTF-8");
}
catch (UnsupportedEncodingException e)
{
// see http://stackoverflow.com/a/6030187/139010
throw new AssertionError("UTF-8 not supported");
}

The String class has a few constructors that accept an array of bytes, including one that takes an array of bytes and a String naming a charset, and another that takes a Charset object. There are also constructors that take an offset and a length into the byte array, for when the string occupies only a small section of the array.
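A minimal sketch of the offset/length constructor mentioned above. The class name, the helper decodeSlice, and the sample data are made up for illustration; only the String(byte[], int, int, Charset) constructor itself is from the JDK.

```java
import java.nio.charset.StandardCharsets;

final class SliceDecodeExample {
    // Decodes only the bytes in [offset, offset + length) as UTF-8.
    // Offset and length count bytes in the array, not characters in the result.
    static String decodeSlice(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 3 header bytes, then "Zürich" (7 bytes, because ü takes 2 bytes in UTF-8), then 3 trailer bytes.
        byte[] packet = "HDRZ\u00fcrichEND".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeSlice(packet, 3, 7)); // Zürich
    }
}
```

The offset and length must line up with character boundaries in the encoded data, otherwise the result will contain replacement characters.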

Like this:
String cityName = new String(cityBytes, "UTF-8");

String s = new String(cityBytes, "UTF-8");

Try this: http://docs.oracle.com/javase/6/docs/api/java/lang/String.html
String(byte[] bytes, String charsetName)

Related

Converting byte array to String Java

I wish to convert a byte array to a String, but when I do, the String has 00 before every digit taken from the array.
I should have got the following result: 49443a3c3532333437342e313533373936313835323237382e303e
But I have the following:
Please help me: how can I get rid of the nulls?
I have tried the following way to convert (xxxxId is the byte array):
String xxxIdString = new String(Hex.encodeHex(xxxxId));
Thank you!
Try something like this:
String s = new String(bytes);
s = s.replace("\0", "");
It's also possible that the string will end after the first '\0' received. If that's the case, first iterate through the array and replace '\0' with something like '\n', and then do this:
String s = new String(bytes);
s = s.replace("\n", "");
EDIT:
Use this for a byte array:
String s = new String(bytes, StandardCharsets.UTF_8);
Use this for a char array:
String s = new String(bytes);
Try below code:
byte[] bytes = {...}
String str = new String(bytes, "UTF-8"); // for UTF-8 encoding
Please have a look here: How to convert byte array to string and vice versa?
To convert a byte array into a String correctly, we have to explicitly create a String object and pass the byte array to its constructor.
String example = "This is an example";
byte[] bytes = example.getBytes();
String s = new String(bytes);
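For the hex output the question expects, a dependency-free sketch is shown below. It assumes the goal is two lowercase hex digits per byte; the class name and the toHex helper are hypothetical, and the sample bytes are just the first four bytes of the expected result quoted in the question.

```java
final class HexExample {
    // Renders each byte as exactly two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask with 0xff so negative bytes are not sign-extended to eight digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] id = {0x49, 0x44, 0x3a, 0x3c};
        System.out.println(toHex(id)); // 49443a3c
    }
}
```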

Letter with trema being shown as percentage sign

In my program I convert a byte stream I get as input to a String. But when the bytestream contains words with a ë, this letter is converted to a %. How do I fix this?
Thx
To encode these characters, convert the String object to UTF-8: invoke the getBytes method and specify the appropriate encoding identifier as a parameter. The getBytes method returns an array of bytes in UTF-8 format. To create a String object from an array of non-Unicode bytes, invoke the String constructor with the encoding parameter. For example:
try {
byte[] utf8Bytes = original.getBytes("UTF8");
byte[] defaultBytes = original.getBytes();
String roundTrip = new String(utf8Bytes, "UTF8");
System.out.println("roundTrip = " + roundTrip);
System.out.println();
printBytes(utf8Bytes, "utf8Bytes");
System.out.println();
printBytes(defaultBytes, "defaultBytes");
}
catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
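To see why the charset has to match on both sides, here is a small sketch that decodes the same UTF-8 bytes once with the right charset and once with ISO-8859-1. The class name, the decode helper, and the sample word are illustrative assumptions. Which wrong character actually appears (a %, a ?, or something else) depends on which charsets are mismatched in your program, so this only shows the general effect.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class TremaExample {
    // Decodes bytes with an explicitly chosen charset instead of the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "Citro\u00ebn"; // "Citroën", written with an escape to avoid source-encoding issues
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8); // ë becomes the two bytes C3 AB

        // Same charset on both sides: the ë survives.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original)); // true

        // Wrong charset: each of the two bytes is decoded as a separate character.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(original)); // false
    }
}
```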

Convert a part of ByteBuffer back to String

I have a big String that was once converted to a ByteBuffer. When it is read later, often only a portion of the String (an overview of the text) needs to be presented, so I want to convert only part of the ByteBuffer to a String.
Is it possible to convert only part of a ByteBuffer to a String, rather than converting the entire ByteBuffer to a String and then using substring()?
try {
ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(yourstr));
bbuf.position(0);
bbuf.limit(200);
CharBuffer cbuf = decoder.decode(bbuf);
String s = cbuf.toString();
System.out.println(s);
} catch (CharacterCodingException e) {
}
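This should return the chars decoded from the byte buffer, starting at byte 0 and ending just before byte 200. Note that if byte 200 falls in the middle of a multi-byte character, decode will throw a CharacterCodingException unless the decoder is configured to replace malformed input.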
Or rather:
ByteBuffer bbuf = ByteBuffer.wrap(yourstr.getBytes());
bbuf.position(0);
bbuf.limit(200);
byte[] bytearr = new byte[bbuf.remaining()];
bbuf.get(bytearr);
String s = new String(bytearr);
This does the same, but without explicit character decoding/encoding.
Decoding does of course still happen in the String constructor, and it uses the platform default charset, so watch out.
// convert all of byteBuffer to a string (requires an array-backed buffer)
String fullByteBuffer = new String(byteBuffer.array());
// convert part of byteBuffer to a string
byte[] partOfByteBuffer = new byte[PART_LENGTH];
// copy from the buffer's backing array, not from the String built above
System.arraycopy(byteBuffer.array(), 0, partOfByteBuffer, 0, partOfByteBuffer.length);
String partOfByteBufferString = new String(partOfByteBuffer);
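The snippets above cut the buffer at a fixed byte count, which can land in the middle of a multi-byte character. Below is a hedged sketch of one way around that, assuming the text was encoded as UTF-8 and starts at index 0 of the buffer. The class name and the preview helper are hypothetical. It uses the three-argument CharsetDecoder.decode with endOfInput set to false, so an incomplete trailing sequence is left undecoded instead of being reported as an error.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class PreviewExample {
    // Decodes at most maxBytes bytes from the start of buf as UTF-8.
    // A character cut in half by the byte limit is dropped from the result.
    static String preview(ByteBuffer buf, int maxBytes) {
        ByteBuffer view = buf.duplicate(); // shares content, but has its own position and limit
        view.position(0);
        view.limit(Math.min(maxBytes, buf.limit()));

        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // UTF-8 never yields more chars than it has bytes, so this capacity is enough.
        CharBuffer out = CharBuffer.allocate(view.remaining());
        decoder.decode(view, out, false); // false: more input may follow, so a partial tail is not an error
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("na\u00efve".getBytes(StandardCharsets.UTF_8));
        System.out.println(preview(buf, 3).length()); // 2: the third byte is only half of the ï
        System.out.println(preview(buf, 4).length()); // 3
    }
}
```

Working on a duplicate() leaves the position and limit of the original buffer untouched, so the same buffer can be previewed repeatedly.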

GZIP decompress string and byte conversion

I have a problem in code:
private static String compress(String str)
{
String str1 = null;
ByteArrayOutputStream bos = null;
try
{
bos = new ByteArrayOutputStream();
BufferedOutputStream dest = null;
byte b[] = str.getBytes();
GZIPOutputStream gz = new GZIPOutputStream(bos,b.length);
gz.write(b,0,b.length);
bos.close();
gz.close();
}
catch(Exception e) {
System.out.println(e);
e.printStackTrace();
}
byte b1[] = bos.toByteArray();
return new String(b1);
}
private static String deCompress(String str)
{
String s1 = null;
try
{
byte b[] = str.getBytes();
InputStream bais = new ByteArrayInputStream(b);
GZIPInputStream gs = new GZIPInputStream(bais);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
int numBytesRead = 0;
byte [] tempBytes = new byte[6000];
try
{
while ((numBytesRead = gs.read(tempBytes, 0, tempBytes.length)) != -1)
{
baos.write(tempBytes, 0, numBytesRead);
}
s1 = new String(baos.toByteArray());
s1= baos.toString();
}
catch(ZipException e)
{
e.printStackTrace();
}
}
catch(Exception e) {
e.printStackTrace();
}
return s1;
}
public String test() throws Exception
{
String str = "teststring";
String cmpr = compress(str);
String dcmpr = deCompress(cmpr);
}
This code throws java.io.IOException: unknown format (magic number ef1f) at this line:
GZIPInputStream gs = new GZIPInputStream(bais);
It turns out that the bytes are corrupted by the round trip through new String(b1) and byte b[] = str.getBytes(): the string yields more bytes than were put in. If you avoid the conversion to a string and work with the bytes directly, everything works. Sorry for my English.
public String unZip(String zipped) throws DataFormatException, IOException {
byte[] bytes = zipped.getBytes("WINDOWS-1251");
Inflater decompressed = new Inflater();
decompressed.setInput(bytes);
byte[] result = new byte[100];
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int count;
// write only the bytes actually inflated in each pass, not the whole 100-byte buffer
while ((count = decompressed.inflate(result)) != 0)
buffer.write(result, 0, count);
decompressed.end();
// charset is not defined in this snippet; it must name the encoding of the decompressed text
return new String(buffer.toByteArray(), charset);
}
I use this function to decompress the server response. Thanks for the help.
You have two problems:
You're using the default character encoding to convert the original string into bytes. That will vary by platform. It's better to specify an encoding - UTF-8 is usually a good idea.
You're trying to represent the opaque binary data of the result of the compression as a string by just calling the String(byte[]) constructor. That constructor is only meant for data which is encoded text... which this isn't. You should use base64 for this. There's a public domain base64 library which makes this easy. (Alternatively, don't convert the compressed data to text at all - just return a byte array.)
Fundamentally, you need to understand how different text and binary data are - when you want to convert between the two, you should do so carefully. If you want to represent "non text" binary data (i.e. bytes which aren't the direct result of encoding text) in a string you should use something like base64 or hex. When you want to encode a string as binary data (e.g. to write some text to disk) you should carefully consider which encoding to use. If another program is going to read your data, you need to work out what encoding it expects - if you have full control over it yourself, I'd usually go for UTF-8.
Additionally, the exception handling in your code is poor:
You should almost never catch Exception; catch more specific exceptions
You shouldn't just catch an exception and continue as if it had never happened. If you can't really handle the exception and still complete your method successfully, you should let the exception bubble up the stack (or possibly catch it and wrap it in a more appropriate exception type for your abstraction)
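To make the text-versus-binary point concrete, here is a sketch that pushes the first four bytes of a gzip stream through a String and through Base64. It uses java.util.Base64, which ships with Java 8 and later, rather than the third-party library mentioned above. The class name and the toText/fromText helpers are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class BinaryAsTextExample {
    // Represents arbitrary bytes as ASCII text without losing information.
    static String toText(byte[] binary) {
        return Base64.getEncoder().encodeToString(binary);
    }

    static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] binary = {(byte) 0x1f, (byte) 0x8b, 0x08, 0x00}; // start of a gzip header

        // Treating the bytes as encoded text: 0x8b is not valid UTF-8 here, so it is replaced.
        byte[] viaString = new String(binary, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(binary, viaString)); // false

        // Base64 round trip: the bytes come back unchanged.
        System.out.println(toText(binary));                              // H4sIAA==
        System.out.println(Arrays.equals(binary, fromText(toText(binary)))); // true
    }
}
```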
When you GZIP compress data, you always get binary data. This data cannot be converted into a string, as it is not valid character data (in any encoding).
So your compress method should return a byte array and your decompress method should take a byte array as its parameter.
Furthermore, I recommend you use an explicit encoding when you convert the string into a byte array before compression and when you turn the decompressed data into a string again.
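A sketch of what that could look like: compress returns a byte array, decompress takes one, and both use UTF-8 explicitly. The class name is made up, and wrapping IOException in UncheckedIOException is a choice made for this example (the streams are in memory, so failures should only come from corrupt input), not something the answers prescribe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipBytesExample {
    // Text in, binary out: the compressed data stays a byte array.
    static byte[] compress(String text) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed (and therefore finished) before the bytes are read.
        return bos.toByteArray();
    }

    // Binary in, text out: bytes become a String only after decompression.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decompress(compress("teststring"))); // teststring
    }
}
```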
"When you GZIP compress data, you always get binary data. This data cannot be converted into a string, as it is not valid character data (in any encoding)."
Codo is right, thanks a lot for enlightening me. I was trying to decompress a string (converted from the binary data). What I amended was using InflaterInputStream directly on the input stream returned by my http connection. (My app was retrieving a large JSON of strings)

Converting part of a ByteBuffer to a String

I have a ByteBuffer containing bytes that were derived by String.getBytes(charsetName), where "containing" means that the string comprises the entire sequence of bytes between the ByteBuffer's position() and limit().
What's the best way for me to get the string back (assuming I know the encoding charset)? Is there anything better than the following, which seems a little clunky?
byte[] ba = new byte[bbuf.remaining()];
bbuf.get(ba);
try {
String s = new String(ba, charsetName);
}
catch (UnsupportedEncodingException e) {
/* take appropriate action */
}
String s = Charset.forName(charsetName).decode(bbuf).toString();
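A short sketch of the one-liner above in context. Charset.decode reads from the buffer's position() to its limit() and advances the position as it goes, so this version decodes a duplicate() to leave the caller's buffer untouched. The class name, the decodeRemaining helper, and the sample data are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BufferDecodeExample {
    // Decodes the bytes between position() and limit() without moving buf's position.
    static String decodeRemaining(ByteBuffer buf, Charset charset) {
        return charset.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.wrap("xxhello".getBytes(StandardCharsets.UTF_8));
        buf.position(2); // pretend the first two bytes were already consumed

        System.out.println(decodeRemaining(buf, StandardCharsets.UTF_8)); // hello
        System.out.println(buf.position());                               // 2
    }
}
```

Unlike the array-copy version in the question, this also works for buffers that are not backed by an accessible array, and it replaces malformed input instead of throwing.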
