Response has 2 bytes per character in Netty

In Netty, I create a response by feeding a String in body:
DefaultFullHttpResponse res = new DefaultFullHttpResponse(HTTP_1_1, httpResponse.getHttpResponseStatus());
if (body != null) {
ByteBuf buf = Unpooled.copiedBuffer(body, CharsetUtil.UTF_8);
res.content().writeBytes(buf);
buf.release();
res.headers().set(HttpHeaderNames.CONTENT_LENGTH, res.content().readableBytes());
}
When I look at the response, I see a content-length that is twice the number of characters in the String. I understand that a Java String uses 2 bytes per character internally, but I can't figure out how to prevent this in Netty when returning the response.
When I look at Cloudflare responses, these contain one byte per character. So there must be a way to change this. Ideas?

As @Chris O'Toole shows in "How to convert a netty ByteBuf to a String and vice versa", we must
first convert the String to a byte array using the desired charset (UTF-8 works fine) with String.getBytes(Charset),
then call Unpooled.wrappedBuffer(byte[]) with that byte array instead of the
String.
This gives one byte per character for most characters, as @rossum stated.

Use US_ASCII charset instead of UTF-8. Haven't tested, try.
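
To check how many bytes each charset actually produces, here is a minimal sketch that uses only the JDK (no Netty involved). The class and method names are made up for illustration. It suggests that UTF-8 needs one byte per ASCII character, that two bytes per character is what a UTF-16 encoding would give, and that US_ASCII silently replaces anything outside ASCII with '?'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Netty: counts the bytes a String encodes to.
class EncodedLengthDemo {
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "hello";
        String accented = "h\u00e9llo"; // "hello" with an accented e

        // Pure ASCII text: UTF-8 uses one byte per character, UTF-16 uses two.
        System.out.println(encodedLength(ascii, StandardCharsets.UTF_8));
        System.out.println(encodedLength(ascii, StandardCharsets.UTF_16BE));

        // The accented character takes two bytes in UTF-8.
        System.out.println(encodedLength(accented, StandardCharsets.UTF_8));

        // US_ASCII cannot represent it and substitutes '?'.
        byte[] lossy = accented.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(lossy, StandardCharsets.US_ASCII));
    }
}
```

If the content-length really is double the character count for plain ASCII text, it may be worth checking whether the body is being encoded as UTF-16 somewhere else in the pipeline.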

Related

Decoding String (from header) encoded by Base64 and RFC2047 in Java

I'm working on a function to decode a string (from a header) that is encoded in both Base64 and RFC2047 in Java.
Given this header:
SGVhZGVyOiBoZWFkZXJ2YWx1ZQ0KQmFkOiBOYW1lOiBiYWRuYW1ldmFsdWUNClVuaWNvZGU6ID0/VVRGLTg/Qj81YmV4NXF5eTU2dUw2SUNNNTZ1TDVMcTY3N3lNNWJleDVxeXk2WUdVNklDTTZZR1U/PSA9P1VURi04P0I/NUxxNjc3eU01YmV4NW9tQTVMaU41cXl5Nzd5TTVZdS81cGE5NXBhODVMcTY0NENDPz0NCg0K
My expected output is:
Header: headervalue Bad: Name: badnamevalue Unicode:
己欲立而立人，己欲達而達人，己所不欲，勿施於人。
The only relevant function that I have found and tried was Base64.decodeBase64(headers), which produces this when printed out:
Header: headervalue Bad: Name: badnamevalue Unicode:
=?UTF-8?B?5bex5qyy56uL6ICM56uL5Lq677yM5bex5qyy6YGU6ICM6YGU?= =?UTF-8?B?5Lq677yM5bex5omA5LiN5qyy77yM5Yu/5pa95pa85Lq644CC?=
To solve this, I've been trying MimeUtility.decode() by converting the byte array returned from Base64.decodeBase64(headers) to an InputStream, but the result was identical to the above.
InputStream headerStream = new ByteArrayInputStream(Base64.decodeBase64(headers));
InputStream result = MimeUtility.decode(headerStream, "quoted-printable");
I have been searching around the internet but have yet to find a solution. Does anyone know a way to decode MIME headers from the resulting byte arrays?
Any help is appreciated! It's also my first stack overflow post, apologies if I'm missing anything but please let me know if there's more information that I can provide!

The base64 you have there actually is what you pasted. Including the bizarre =?UTF-8?B? weirdness.
The stuff that follows is again base64.
There's base64-encoded data inside your base-64 encoded data. As Xzibit would say: I put some Base64 in your base64 so you can base64 while you base64. Why do I feel old all of a sudden?
In other words, the base64 input you get is a crazy, extremely inefficient format invented by a crazy person.
My advice is that you tell them to come up with something less insane.
Failing that:
Search the decoded string for the =?UTF-8?B?...?= pattern with a regex, then apply base64 decoding again to the part in the middle of each match.
Also, you're using some third party base64 decoder, probably apache. Apache libraries tend to suck. Base64 is baked into java, there is no reason to use worse libraries here. I've fixed that; the Base64 in this snippet is java.util.Base64. Its API is slightly different.
String sourceB64 = "SGV..."; // that input base64 you have.
byte[] sourceBytes = Base64.getDecoder().decode(sourceB64);
String source = new String(sourceBytes, StandardCharsets.UTF_8);
Pattern p = Pattern.compile("=\\?UTF-8\\?B\\?(.*?)\\?=");
Matcher m = p.matcher(source);
StringBuilder out = new StringBuilder();
int curPos = 0;
while (m.find()) {
out.append(source.substring(curPos, m.start()));
curPos = m.end();
String content = new String(Base64.getDecoder().decode(m.group(1)), StandardCharsets.UTF_8);
out.append(content);
}
out.append(source.substring(curPos));
System.out.println(out.toString());
If I run that, I get:
Header: headervalue
Bad: Name: badnamevalue
Unicode: 己欲立而立人，己欲達而達 人，己所不欲，勿施於人。
Which looks exactly like what you want.
Explanation of that code:
It first base64-decodes the input, and turns that into a string. (Your idea of using InputStream is a red herring. That doesn't help at all here. You just want to turn bytes into a string, you do it as per line 3 of that snippet. Pass the byte array and the encoding those bytes are in, that's all you need to do).
It then goes on the hunt for =?UTF-8?B?--base64here--?= inside your base64. The base64-in-the-base64.
It then decodes that base64, turns it into a string in the same fashion, and replaces it.
It just adds everything besides those =?UTF-8?B?...?= segments verbatim.
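
The =?UTF-8?B?...?= segments are RFC 2047 encoded-words, and that RFC says whitespace between two adjacent encoded-words is dropped when decoding, which is why the asker's expected output has no space in the middle of the Chinese text. Here is a minimal sketch of the same regex approach with that rule added. The class name is made up, and like the snippet above it only handles the UTF-8 charset with B (base64) encoding. If JavaMail is already on the classpath, MimeUtility.decodeText(String) is probably the more complete option.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: handles only =?UTF-8?B?...?= encoded-words.
class EncodedWordDecoder {
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?UTF-8\\?B\\?(.*?)\\?=");

    static String decode(String outerBase64) {
        // Outer layer: plain base64 around the whole header block.
        String source = new String(
                Base64.getDecoder().decode(outerBase64), StandardCharsets.UTF_8);

        Matcher m = ENCODED_WORD.matcher(source);
        StringBuilder out = new StringBuilder();
        int curPos = 0;
        boolean seenEncodedWord = false;
        while (m.find()) {
            String gap = source.substring(curPos, m.start());
            // Drop whitespace that only separates two adjacent encoded-words.
            boolean separatorOnly = seenEncodedWord && gap.trim().isEmpty();
            if (!separatorOnly) {
                out.append(gap);
            }
            // Inner layer: base64 inside the encoded-word.
            out.append(new String(
                    Base64.getDecoder().decode(m.group(1)), StandardCharsets.UTF_8));
            curPos = m.end();
            seenEncodedWord = true;
        }
        out.append(source.substring(curPos));
        return out.toString();
    }

    public static void main(String[] args) {
        // Pass the base64 header block from the question as the first argument.
        System.out.println(decode(args[0]));
    }
}
```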

How to get rid of incorrect symbols during Java NIO decoding?

I need to read text from a file and, for instance, print it to the console. The file is in UTF-8. It seems that I'm doing something wrong because some Russian symbols are printed incorrectly. What's wrong with my code?
StringBuilder content = new StringBuilder();
try (FileChannel fChan = (FileChannel) Files.newByteChannel(Paths.get("D:/test.txt")) ) {
ByteBuffer byteBuf = ByteBuffer.allocate(16);
Charset charset = Charset.forName("UTF-8");
while(fChan.read(byteBuf) != -1) {
byteBuf.flip();
content.append(new String(byteBuf.array(), charset));
byteBuf.clear();
}
System.out.println(content);
}
The result:
Здравствуйте, как поживае��е?
Это п��имер текста на русском яз��ке.ом яз�
The actual text:
Здравствуйте, как поживаете?
Это пример текста на русском языке.

UTF-8 uses a variable number of bytes per character. This gives you a boundary error: You have mixed buffer-based code with byte-array based code and you can't do that here; it is possible for you to read enough bytes to be stuck halfway into a character, you then turn your input into a byte array, and convert it, which will fail, because you can't convert half a character.
What you really want is either to first read ALL the data and then convert the entire input, or, to keep any half-characters in the bytebuffer when you flip back, or better yet, ditch all this stuff and use code that is written to read actual characters. In general, using the channel API complicates matters a ton; it's flexible, but complicated - that's how it goes.
Unless you can explain why you need it, don't use it. Do this instead:
Path target = Paths.get("D:/test.txt");
try (var reader = Files.newBufferedReader(target)) {
// read a line at a time here. Yes, it will be UTF-8 decoded.
}
or better yet, as you apparently want to read the whole thing in one go:
Path target = Paths.get("D:/test.txt");
var content = Files.readString(target);
NB: Unlike most java methods that convert bytes to chars or vice versa, the Files API defaults to UTF-8 (instead of the useless and dangerous, untestable-bug-causing 'platform default encoding' that most java API does). That's why this last incredibly simple code is nevertheless correct.
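
If the channel API really is required, the "keep any half-characters in the bytebuffer" option mentioned above could look roughly like this. It is a sketch only: the class and method names are made up, and it assumes a buffer size of at least 4 bytes so that a complete UTF-8 sequence always fits. A CharsetDecoder leaves the bytes of an incomplete character unconsumed, and compact() carries them over to the next read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: decodes UTF-8 from a channel in small chunks.
class ChunkedUtf8Reader {
    static String readAll(ReadableByteChannel channel, int bufferSize) throws IOException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(bufferSize); // assumed >= 4
        CharBuffer chars = CharBuffer.allocate(bufferSize);
        StringBuilder content = new StringBuilder();

        boolean endOfInput = false;
        while (!endOfInput) {
            endOfInput = channel.read(bytes) == -1;
            bytes.flip();
            CoderResult result;
            do {
                // Decodes only complete characters; a trailing partial
                // character stays in 'bytes' unless this is the end of input.
                result = decoder.decode(bytes, chars, endOfInput);
                if (result.isError()) {
                    result.throwException();
                }
                chars.flip();
                content.append(chars);
                chars.clear();
            } while (result.isOverflow());
            // Move any leftover bytes to the front for the next read.
            bytes.compact();
        }
        decoder.flush(chars);
        chars.flip();
        content.append(chars);
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        try (ReadableByteChannel channel = Files.newByteChannel(Paths.get(args[0]))) {
            System.out.println(readAll(channel, 16));
        }
    }
}
```

Note that the original loop also passed byteBuf.array() to new String, which decodes the whole 16-byte array rather than just the bytes read in the last pass. That would explain the stray repeated text at the end of the output.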

Java UTF-8 not working properly for JSON

I am using spring rest template for getting rest API.
When I try to print the output, I get unwanted Characters.
Here is my code:
RestTemplate restTemplate = new RestTemplate();
ResponseEntity<String> apiResponse = restTemplate.getForEntity(url,String.class);
return apiResponse.getBody();
Output is:
ï»¿{"status":"FAILURE","error_code":"ITI","message":"Invalid Transaction Id","time":"30-03-2017 11:47:32"}
After getting this error I added UTF-8 character encoding in the REST client:
public static String exicute(String url) {
RestTemplate restTemplate = new RestTemplate();
restTemplate.getMessageConverters().add(0, new StringHttpMessageConverter(Charset.forName("utf-8")));
ResponseEntity<String> apiResponse = restTemplate.getForEntity(url,String.class);
return apiResponse.getBody();
}
After that the output changed, but now there is a ? in front of the result.
?{"status":"FAILURE","error_code":"ITI","message":"Invalid Transaction Id","time":"30-03-2017 11:49:34"}
How can I solve this issue?

The ï»¿ in front of the message is there because the input stream has a byte order mark (BOM) at its beginning. The byte order mark is the Unicode character U+FEFF, which is often placed at the start of a byte sequence.
In UTF-8 that character is encoded as the bytes 0xEF, 0xBB, 0xBF, and it is displayed as ï»¿ when those bytes are read with a single-byte encoding such as ISO-8859-1.
Its only use in UTF-8 is to signal at the start that the text stream is encoded in UTF-8.
That character is actually not part of the contents itself; instead it is merely a piece of metadata.
How to fix it?
The creator of the byte sequence (often a file, but it can also be some byte stream over the network) should remove it, in my opinion.
But on the other hand, you can easily remove it by replacing the character by an empty string.
string = string.replace("\uFEFF", "");
code piece copied from this post
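
A slightly narrower variant, as a sketch: strip the BOM only when it is the first character, so that a U+FEFF elsewhere in the body is left alone. The class and method names are made up for illustration. The second method shows why the same three bytes appear as ï»¿ when they are decoded as ISO-8859-1 instead of UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for dealing with a leading UTF-8 byte order mark.
class BomDemo {
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Removes a single leading U+FEFF, if present.
    static String stripLeadingBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    // Decodes the three BOM bytes with the given charset.
    static String decodeBom(Charset charset) {
        return new String(UTF8_BOM, charset);
    }

    public static void main(String[] args) {
        String body = "\uFEFF{\"status\":\"FAILURE\"}";
        System.out.println(stripLeadingBom(body));

        // As UTF-8 the bytes form one character (U+FEFF);
        // as ISO-8859-1 they form three visible characters.
        System.out.println(decodeBom(StandardCharsets.UTF_8).length());
        System.out.println(decodeBom(StandardCharsets.ISO_8859_1).length());
    }
}
```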

Input byte array has incorrect ending byte at 40

I have a string that is base64 encoded. It looks like this:
eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=
Any online tool can decode this to the proper string which is {"bla1":"bla1","bla2":"bla2"}. However, my Java implementation fails:
import java.util.Base64;
System.out.println("payload = " + payload);
String json = new String(Base64.getDecoder().decode(payload));
I'm getting the following error:
payload = eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=
java.lang.IllegalArgumentException: Input byte array has incorrect ending byte at 40
What is wrong with my code?

Okay, I found out. The original String is encoded on an Android device using android.util.Base64 by Base64.encodeToString(json.getBytes("UTF-8"), Base64.DEFAULT);. It uses android.util.Base64.DEFAULT encoding scheme.
Then on the server side when using java.util.Base64 this has to be decoded with Base64.getMimeDecoder().decode(payload) not with Base64.getDecoder().decode(payload)
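
A minimal sketch of the difference, using only java.util.Base64 (the class and method names are made up for illustration). It assumes the payload arrives with a trailing newline, as the Android default flags would produce. The basic decoder rejects the extra character after the padding, while the MIME decoder ignores characters outside the Base64 alphabet; trimming the input first also works with the basic decoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper comparing the basic and MIME decoders.
class Base64NewlineDemo {
    // True if the strict (basic) decoder accepts the payload as-is.
    static boolean basicDecoderAccepts(String payload) {
        try {
            Base64.getDecoder().decode(payload);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // The MIME decoder skips line separators and other non-alphabet characters.
    static String decodeLenient(String payload) {
        return new String(Base64.getMimeDecoder().decode(payload), StandardCharsets.UTF_8);
    }

    // Alternative: strip surrounding whitespace, then decode strictly.
    static String decodeTrimmed(String payload) {
        return new String(Base64.getDecoder().decode(payload.trim()), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String payload = "eyJibGExIjoiYmxhMSIsImJsYTIiOiJibGEyIn0=\n";
        System.out.println(basicDecoderAccepts(payload));
        System.out.println(decodeLenient(payload));
        System.out.println(decodeTrimmed(payload));
    }
}
```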

I was trying to use the strings from the args. I found that if I use arg[0].trim() that it made it work. eg
Base64.getDecoder().decode(arg[0].trim());
I guess there's some sort of whitespace that gets it messed up.

Maybe too late, but I also had this problem.
By default, the Android Base64 util adds a newline character to the end of the encoded string.
The Base64.NO_WRAP flag tells the util to create the encoded string without the newline character.
Your android app should encode src something like this:
String encode = Base64.encodeToString(src.getBytes(), Base64.NO_WRAP);

Why am I not able to send a header containing a character like / (forward slash)?

I am getting a cookie value in jstring. The server sends it as a base64-encoded UTF-8 string. I compared the string from the server with the one on my end, and they are exactly the same.
Now I need to decorate this value with n= as a prefix and ; as a suffix (which I am doing in line no. 2 of the code).
If I do not use line no. 1, the string arrives as null on the Java server. Otherwise the server gets the value.
jstring = [jstring stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
NSString *cookieVal=[NSString stringWithFormat:@"n=%@%@",jstring,@";"];
[self.requestSerializer setValue:cookieVal forHTTPHeaderField:@"Cookie"];
We are using AFNetworking in iOS for requests and responses. We have observed a very strange pattern:
if the string contains / (forward slash), we get a padding error on the Java server; if the string doesn't contain /, it goes through as required.
As you can see in line no. 3, we are sending this value as a header of the http/https request.
I have tried many things, like this (I tried the very last code with my string). I also tried different encodings, but the problem still persists.

That URL conversion does not escape all the special characters available on the iOS device keypad.
We have to escape the value with the function below instead. Use it as a category on NSString.
- (NSString *) URLEncodedString_ch {
return (NSString *)CFBridgingRelease(CFURLCreateStringByAddingPercentEscapes(NULL, (CFStringRef)self, NULL, (CFStringRef)@"!*'\"();:@&=+$,/?%#[]%~_. ", CFStringConvertNSStringEncodingToEncoding(NSUTF8StringEncoding)));
}
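
For the Java side, here is a rough sketch of what full percent-escaping does to a Base64 value. It assumes the server percent-decodes the cookie value before Base64-decoding it, which the question does not confirm, and the class and method names are made up. The Charset overloads of URLEncoder.encode and URLDecoder.decode need Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper showing how Base64 characters survive percent-escaping.
class CookieValueEscaping {
    // Escapes every character that is special in a URL, including / + and =.
    static String escape(String base64Value) {
        return URLEncoder.encode(base64Value, StandardCharsets.UTF_8);
    }

    static String unescape(String escapedValue) {
        return URLDecoder.decode(escapedValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "ab/c+d==";

        // Fully escaped, the value round-trips unchanged.
        String escaped = escape(value);
        System.out.println(escaped);
        System.out.println(unescape(escaped));

        // Decoding an unescaped value turns '+' into a space,
        // which is no longer valid Base64.
        System.out.println(unescape(value));
    }
}
```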
