Problem putting a String into a ByteBuffer in Java

When I put a String into a ByteBuffer, some unknown characters are appended to it.
Here is my code:
String request="HELLO";
ByteBuffer buffer=ByteBuffer.allocate(1024);
buffer.clear();
buffer.put(request.getBytes());
buffer.flip();
When I convert it back to a String, I get the following result: HELLO��������
This is how I convert the ByteBuffer to a String:
new String(buffer.array())

When creating the string, you didn't take into account that only some of the bytes in the buffer hold valid data. The first 5 bytes contain "HELLO" in the platform's default encoding; the rest are zeros, and buffer.array() returns all 1024 bytes of the backing array.
To convert only the valid bytes (those between the buffer's position and limit) to a string, use the Charset class:
CharBuffer cb = Charset.defaultCharset().decode(buffer);
String str = cb.toString();
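A minimal sketch of the difference, assuming UTF-8 instead of the default charset so the result is the same on every platform. The class and method names (ByteBufferToString, decodeRemaining) are made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

final class ByteBufferToString {

    // Decodes only the bytes between position and limit, not the whole backing array.
    static String decodeRemaining(ByteBuffer buffer) {
        // duplicate() leaves the caller's position and limit untouched.
        CharBuffer chars = StandardCharsets.UTF_8.decode(buffer.duplicate());
        return chars.toString();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        buffer.put("HELLO".getBytes(StandardCharsets.UTF_8));
        buffer.flip(); // position = 0, limit = 5

        // Wrong: array() exposes all 1024 bytes, including the unused zeros.
        String padded = new String(buffer.array(), StandardCharsets.UTF_8);
        System.out.println(padded.length());         // 1024

        // Right: decode only the 5 valid bytes.
        System.out.println(decodeRemaining(buffer)); // HELLO
    }
}
```

If you need to stay with the String constructor, new String(buffer.array(), 0, buffer.limit(), charset) also restricts the conversion to the valid range for an array-backed buffer that starts at offset 0.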

Related

Converting bytes with Charset results in diamonds at end of string?

I am currently storing a String as an array of bytes. However, when I try to use the following code to convert the bytes back to a String using Charset, I have diamonds at the end:
byte[] testbytes = "abc123".getBytes(); // tried getBytes("UTF-8"/StandardCharsets.UTF_8) too
Charset charset = Charset.forName("UTF-8"); // ISO-8859-1 has no diamonds
CharBuffer charBuffer = charset.decode( ByteBuffer.wrap( Arrays.copyOfRange(testbytes,0,testbytes.length) ) );
System.out.println("converted = " + String.valueOf(charBuffer.array()) );
// returns this - abc123����������
If I set the encoding to ISO-8859-1 instead, it converts fine. I thought it might be the encoding of the source code file but opening that in Notepad++ suggests it is also in UTF-8.
Am I missing something or is this just a problem with Android Studio's Logcat window?
- Edit 1 -
Further testing shows that 3-character strings do not have this padding problem. With longer strings, Charset.decode seems to pad the char array with \u0000 values according to the break point.
String.valueOf ends up printing the padding characters as diamonds, while creating a new String object removes the padding. However, I would like to avoid String entirely when converting a byte array to a char array, because the values are sensitive.
- Edit 2 -
It appears the above happens if you call charset.decode() again, so I'm guessing there's a buffer that's being appended to, but I'm not sure at what point. I tried clearing it with charBuffer.clear(), but the second block of code's output appears to be the same, i.e. 3 chars + 2 spaces + 6 new chars.
String test1 = "123";
byte[] test1b = test1.getBytes();
char[] expected1 = test1.toCharArray();
CharBuffer charBuffer = charset.decode( ByteBuffer.wrap( test1b ) );
char[] actual1 = charBuffer.array(); // size 3, correct
String test2 = "123456";
byte[] test2b = test2.getBytes();
char[] expected2 = test2.toCharArray();
CharBuffer charBuffer2 = charset.decode( ByteBuffer.wrap( test2b ) );
char[] actual2 = charBuffer2.array(); // size 11, padded with '\u0000' 0
Have you tried the String constructor that takes a byte array and a charset?
For example:
byte[] testbytes = "abc123".getBytes(StandardCharsets.UTF_8);
String stringDecoded = new String(testbytes, StandardCharsets.UTF_8);
Maybe it can solve your problem.
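If a String really must be avoided, here is a sketch of another option. CharBuffer.array() returns the whole backing array, which can be longer than the decoded content, so the idea is to copy only the chars between position and limit. The names DecodeToCharArray and decodeUtf8 are hypothetical, and wiping the backing array afterwards is an assumption about what "sensitive values" calls for.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class DecodeToCharArray {

    // Decodes UTF-8 bytes to a char[] of exactly the decoded length, without creating a String.
    static char[] decodeUtf8(byte[] bytes) {
        CharBuffer decoded = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
        char[] result = new char[decoded.remaining()];
        decoded.get(result); // copies only the chars between position and limit
        // Wipe the decoder's backing array, which may be larger than the result.
        if (decoded.hasArray()) {
            Arrays.fill(decoded.array(), '\0');
        }
        return result;
    }

    public static void main(String[] args) {
        char[] chars = decodeUtf8("123456".getBytes(StandardCharsets.UTF_8));
        System.out.println(chars.length); // 6
    }
}
```

The caller is still responsible for clearing the returned char[] once it is no longer needed.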

decode base64 utf-8 string java

I have this string
"=?UTF-8?B?VGLNBGNDQA==?="
that I need to decode into a standard Java String.
I wrote this quick and dirty main to get the String, but I'm having trouble:
String s = "=?UTF-8?B?VGLNBGNDQA==?=";
s = s.split("=\\?UTF-8\\?B\\?")[1].split("\\?=")[0];
System.out.println(s);
byte[] decoded = Base64.getDecoder().decode(s);
String x = new String(decoded, "UTF8");
System.out.println(decoded);
System.out.println(x);
It actually prints a strange string:
"Tb�cC#"
I do not know the text behind the encoded string, but I assume my program works, since I can convert any other encoded string without problems, for example
"=?UTF-8?B?SGlfR3V5cyE="
That is "Hi_Guys!".
Should I assume that string is malformed?
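One way to check this is a strict decode. new String(bytes, charset) silently replaces malformed input with U+FFFD (the "�" in the output), whereas a CharsetDecoder configured with CodingErrorAction.REPORT throws instead. The sketch below uses made-up names (StrictUtf8Check, isValidUtf8) and assumes the Base64 payload has already been cut out of the =?UTF-8?B?...?= wrapper, as in the question's code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class StrictUtf8Check {

    // Returns true only if the Base64 payload decodes to well-formed UTF-8.
    static boolean isValidUtf8(String base64Payload) {
        byte[] decoded = Base64.getDecoder().decode(base64Payload);
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(decoded));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8("VGLNBGNDQA==")); // false: payload from the question
        System.out.println(isValidUtf8("SGlfR3V5cyE=")); // true: "Hi_Guys!"
    }
}
```

For the first payload the decoded bytes are 54 62 CD 04 63 43 40. The byte CD starts a two-byte UTF-8 sequence, but 04 is not a valid continuation byte, so the payload is not well-formed UTF-8, whatever the sender intended.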

How to read/write extended ASCII characters as a string into ANSI coded text file in java

This is my encryption program, primarily used to encrypt (text) files.
This part of the program converts the List<Integer> elements into a byte[] and writes it to a text file. Unfortunately, I cannot provide the algorithm.
void printit(List<Integer> prnt, File outputFile) throws IOException
{
    StringBuilder building = new StringBuilder(prnt.size());
    for (Integer element : prnt)
    {
        int elmnt = element;
        //building.append(getascii(elmnt));
        building.append((char)elmnt);
    }
    String encryptdtxt = building.toString();
    //System.out.println(encryptdtxt);
    byte[] outputBytes = encryptdtxt.getBytes(); // uses the platform default charset
    FileOutputStream outputStream = new FileOutputStream(outputFile);
    outputStream.write(outputBytes);
    outputStream.close();
}
This is the decryption part, which reads its input from a .enc file:
void getfyle(File inputFile) throws IOException
{
    FileInputStream inputStream = new FileInputStream(inputFile);
    byte[] inputBytes = new byte[(int)inputFile.length()];
    inputStream.read(inputBytes);
    inputStream.close();
    String fylenters = new String(inputBytes);
    for (char a : fylenters.toCharArray())
    {
        usertext.add((int)a);
    }
    for (Integer bk : usertext)
    {
        System.out.println(bk);
    }
}
Because the methods in my algorithm require a List<Integer>, the byte[] is first converted to a String and then to a List<Integer>, and vice versa.
The elements written to the file during encryption do not match the elements read back from the .enc file.
Is my method of converting List<Integer> to byte[] correct, or is something else wrong?
I know that Java can't print extended ASCII characters, so I used this approach, but even this failed: it gives a lot of ?s.
Is there a solution? Also, how would I do this for other formats (.png, .mp3, etc.)?
The format of the encrypted file can be anything (it needn't be .enc).
Thanks
There are many different 'extended ASCII' encodings and Java supports about a hundred of them, but you have to tell it which Charset to use; otherwise it uses the platform default, which often causes data corruption.
Representing arbitrary "binary" bytes in hex or Base64 is common and often necessary. But if the bytes will be stored and/or transmitted in ways that preserve all 256 values (often called "8-bit clean"), and File{Input,Output}Stream does that, you can use "ISO-8859-1", which maps Java char codes 0-255 to and from bytes 0-255 without loss, because the first 256 Unicode code points are identical to ISO-8859-1.
On input, read into a byte[] and then call new String(bytes, charset), where charset is either the name "ISO-8859-1" or the java.nio.charset.Charset object for that name, available as java.nio.charset.StandardCharsets.ISO_8859_1;
or create an InputStreamReader on a stream reading the bytes from a buffer or directly from the file, using that charset name or object, and read chars and/or a String from the Reader.
On output, use String.getBytes(charset), where charset is that charset name or object, and write the byte[];
or create an OutputStreamWriter on a stream writing the bytes to a buffer or the file, using that charset name or object, and write chars and/or a String to the Writer.
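The lossless mapping described above can be checked with a short round trip over all 256 byte values. This is only a sketch; the class and method names are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Latin1RoundTrip {

    // Maps each byte 0-255 to the char with the same code.
    static String bytesToString(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Inverse mapping; only lossless for chars in the range 0-255.
    static byte[] stringToBytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        byte[] back = stringToBytes(bytesToString(all));
        System.out.println(Arrays.equals(all, back)); // true
    }
}
```

The same loop with the no-argument getBytes() and new String(bytes) depends on the platform default charset, which is where the question's ?s come from.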
But you don't actually need char and String and Charset at all. You actually want to write a series of Integers as bytes, and read a series of bytes as Integers. So just do that:
import java.io.*;
import java.util.*;
void printit(List<Integer> prnt, File outputFile) throws IOException
{
    byte[] outputBytes = new byte[prnt.size()];
    int i = 0;
    // an Integer cannot be cast to byte directly; byteValue() keeps the low 8 bits
    for (Integer element : prnt) outputBytes[i++] = element.byteValue();
    FileOutputStream outputStream = new FileOutputStream(outputFile);
    outputStream.write(outputBytes);
    outputStream.close();
    // or replace the previous three lines by one:
    // java.nio.file.Files.write(outputFile.toPath(), outputBytes);
}
void getfyle(File inputFile) throws IOException
{
    FileInputStream inputStream = new FileInputStream(inputFile);
    byte[] inputBytes = new byte[(int)inputFile.length()];
    inputStream.read(inputBytes);
    inputStream.close();
    // or replace those four lines with:
    // byte[] inputBytes = java.nio.file.Files.readAllBytes(inputFile.toPath());
    for (byte b : inputBytes) System.out.println(b & 0xFF);
    // or, if you really wanted a list and not just a printout:
    ArrayList<Integer> list = new ArrayList<Integer>(inputBytes.length);
    for (byte b : inputBytes) list.add(b & 0xFF);
    // return list or store it or whatever
}
Not every sequence of arbitrary data bytes is valid in a given character encoding, and encryption produces data bytes covering all values 0 - 255.
If you must convert the encrypted data to a string format, the standard methods are Base64 and hexadecimal.
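Both text forms can be sketched with the standard library alone. java.util.Base64 exists since Java 8; the hex helper is hand-written here, and all class and method names are just for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

final class BinaryAsText {

    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Two lowercase hex digits per byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0x00, (byte) 0xFF, (byte) 0x80, 0x7F};
        String text = toBase64(data);
        System.out.println(text);                                  // AP+Afw==
        System.out.println(toHex(data));                           // 00ff807f
        System.out.println(Arrays.equals(data, fromBase64(text))); // true
    }
}
```

Base64 grows the data by about a third and hex doubles it, so Base64 is usually the better choice for large files.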
In the encryption part:
`for (Integer element : prnt)
{
    int elmnt = element;
    //building.append(getascii(elmnt));
    char b = Integer.toString(elmnt).charAt(0);
    building.append(b);
}`
--> This converts the int to the character of its first decimal digit, e.g. 1 to '1' and 5 to '5'. Note that for values of 10 or more only the leading digit is kept.

Convert String representation of bytes to byte[] in java

My application gets the String representation of some bytes. I need to convert it to a byte[] array. I am using the code below, but it is not working.
byte[] bytesArray = myString.getBytes();
Can anyone tell me the correct way to convert it to byte[]?
EDIT:
Hi all, my code is here: http://pastebin.com/87jGprtD/. I have one Base64 string that contains both text and image data, and I want to create an image from it. When I decode it, I get a byte[] holding both the text and the image data. I convert it to a String because I have to separate the parts. I split it on a delimiter, so now I have an array of Strings, one of which contains the image data. I have to convert that one back to bytes to create the image. Please check the code.
Here is the relevant code:
byte[] imageByteArray = Base64.decodeBase64(imageDataString);
System.out.println(new String(imageByteArray));
String[] contentArray = new String(imageByteArray).split("--1_520B30B0_E358708");
for (int i = 0; i < contentArray.length; i++) {
    if (i == 2) {
        String[] parts = contentArray[i].split("binary");
        InputStream is = new ByteArrayInputStream((parts[1].trim()).getBytes());
        ImageInputStream iis = ImageIO.createImageInputStream(is);
        System.out.println(iis);
        image = ImageIO.read(iis);
        ImageIO.write(image, "JPG", new File("E:/test1.JPG"));
    }
}
You are decoding Base64 data into byte[], then converting that to String. You can't do that -- "binary" data cannot be converted to String and back to "binary" without data loss.
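A small sketch of that data loss, plus one possible alternative: look for the delimiter in the byte[] itself and slice out the image bytes with Arrays.copyOfRange, so the image data never passes through a String. The helper names (roundTrip, indexOf) are hypothetical, and the four sample bytes are simply a typical JPEG header, not taken from the question's data.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinaryThroughString {

    // Round-trips bytes through a String using the given charset.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Returns the index of the first occurrence of pattern in data, or -1.
    static int indexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        byte[] viaString = roundTrip(jpegStart, StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(jpegStart, viaString)); // false: bytes were replaced

        // Search the raw bytes for a text delimiter instead of splitting a String.
        byte[] message = "header--X".getBytes(StandardCharsets.US_ASCII);
        byte[] delimiter = "--X".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOf(message, delimiter)); // 6
    }
}
```

With the delimiter's position known, Arrays.copyOfRange(data, start, end) yields the image bytes, which can go straight into a ByteArrayInputStream.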

Remove Non-Ansi Chars from a UTF String and Keep Others

We have a Java library that accepts a UTF-8 string as input. If the input contains any non-ANSI character, the library may crash, so we want to remove all non-ANSI characters from the string. How can we do that in Java?
Thanks,
Try this. I pulled it from elsewhere, so I haven't tested it:
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.*;
// Create an encoder and a decoder for the character encoding
Charset charset = Charset.forName("US-ASCII");
CharsetDecoder decoder = charset.newDecoder();
CharsetEncoder encoder = charset.newEncoder();
// This line is the key to removing "unmappable" characters.
encoder.onUnmappableCharacter(CodingErrorAction.IGNORE);
String result = inString;
try {
    // Convert a string to bytes in a ByteBuffer
    ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(inString));
    // Convert the bytes in the ByteBuffer to a CharBuffer and then to a string.
    CharBuffer cbuf = decoder.decode(bbuf);
    result = cbuf.toString();
} catch (CharacterCodingException cce) {
    String errorMessage = "Exception during character encoding/decoding: " + cce.getMessage();
    cce.printStackTrace();
}
Take a look at String.codePointAt(index). That can give you the Unicode code point for a given character, and from there you could remove those outside your range.
How you handle the fact that a character has been removed is on your end, but keep in mind that the string you'll be sending to the library isn't necessarily the same as that provided by the client. This may or may not cause problems.
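A rough sketch of the codePointAt approach, assuming "your range" means plain ASCII (code points 0-127). Adjust the condition if you actually mean something else by ANSI. The class and method names are invented for the example.

```java
final class AsciiFilter {

    // Keeps only code points in the assumed target range 0-127.
    static String keepAscii(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            if (cp < 128) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // step over a surrogate pair as one unit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "caf" + e-acute + space + an emoji (surrogate pair) + " ok"
        System.out.println(keepAscii("caf\u00e9 \uD83D\uDE00 ok")); // prints "caf  ok" with two spaces
    }
}
```

Stepping by Character.charCount keeps the two halves of a surrogate pair together, so a supplementary character is removed as a whole.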
I'm not sure what you mean by ANSI here. Do you mean the Windows-1252 character encoding that people typically call ANSI? That's not ASCII and it's also not ISO-8859-1, so make sure you get your code pages correct.
