Java+HtmlUnit — problem with cyrillic urlencode - java

I am trying to send HTTP POST parameters to a web server, and one of the parameters contains Cyrillic characters. The problem appears when I use this code:
wc.getPage(requestSettings);
requestSettings.setHttpMethod(HttpMethod.POST);
requestSettings.setRequestParameters(new ArrayList());
requestSettings.getRequestParameters().add(new NameValuePair("username", "Друже бобер"));
wc.getPage(requestSettings);
The server receives the following URL-encoded parameter:
and this decodes to the wrong string instead of "Друже бобер".
So I think HtmlUnit URL-encodes parameters internally using ASCII rather than Unicode. How can I disable the URL encoding, or how can I fix this bug? If I encode the string myself before passing it to NameValuePair, HtmlUnit encodes all the percent characters again.

I think you need to set the charset using the setCharset method.
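To see why the charset matters, here is a minimal sketch that uses only the standard library, not HtmlUnit. It assumes the server decodes the parameter as UTF-8; the class and helper names are hypothetical. Encoding the same value with a charset that cannot represent Cyrillic loses the text before it ever leaves the client.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class CharsetEncodingDemo {
    // Percent-encodes a form value with the named charset (hypothetical helper).
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decodes a percent-encoded value with the named charset (hypothetical helper).
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // "Друже бобер", written with escapes so the source file encoding does not matter
        String username = "\u0414\u0440\u0443\u0436\u0435 \u0431\u043e\u0431\u0435\u0440";

        String utf8 = encode(username, "UTF-8");
        String latin1 = encode(username, "ISO-8859-1");

        System.out.println(utf8);
        System.out.println(latin1);
        // UTF-8 round-trips; ISO-8859-1 cannot represent Cyrillic, so it does not
        System.out.println(decode(utf8, "UTF-8").equals(username));         // true
        System.out.println(decode(latin1, "ISO-8859-1").equals(username));  // false
    }
}
```

If the request is built with a charset that matches what the server expects, the encoded bytes decode back to the original text.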

Related

Spring Android: how to post form urlencoded to server using percent-encoding

How can I configure RestTemplate (Spring Framework) to encode using percent-encoding rather than the default form encoding? For example, I am posting these parameters to a server:
client_id=xxx
client_secret=xxx
grant_type=client_credentials
scope=public_read registration
but when posting, Spring sends them as:
client_id=xxx&client_secret=xxx&grant_type=client_credentials&scope=public_read+registration
and I want it to look like this:
client_id=xxx&client_secret=xxx&grant_type=client_credentials&scope=public_read%20registration
It converts spaces to + and I want them to be %20.
Thanks.
You can use this:
String formattedUrlString = URLEncoder.encode(unformattedUrlString, "UTF-8").replace("+", "%20");
Also see: URLEncoder not able to translate space character
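As a rough sketch of that approach applied to the form body from the question, the code below encodes each name and value separately and joins them with & and =. The class and method names are hypothetical, and wiring the resulting string into RestTemplate is left out. The replace call is safe because URLEncoder already turns a literal + into %2B, so any + left in its output stands for a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBodyPercentEncoding {
    // Encodes one component as UTF-8, then swaps '+' (form-style space) for "%20".
    static String percentEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Joins the encoded name/value pairs into a single request body.
    static String buildBody(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(percentEncode(entry.getKey()))
                .append('=')
                .append(percentEncode(entry.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("client_id", "xxx");
        params.put("client_secret", "xxx");
        params.put("grant_type", "client_credentials");
        params.put("scope", "public_read registration");
        // client_id=xxx&client_secret=xxx&grant_type=client_credentials&scope=public_read%20registration
        System.out.println(buildBody(params));
    }
}
```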

java servlet: request parameter contains plus

The request parameter is like decrypt?param=5FHjiSJ6NOTmi7/+2tnnkQ==.
In the servlet, when I read the parameter with String param = request.getParameter("param"); I get 5FHjiSJ6NOTmi7/ 2tnnkQ==. The + character is turned into a space. How can I keep the original parameter, or how should I handle the + character properly?
Also, which other characters do I need to handle?
You have two choices:
URL-encode the parameter
If you have control over how the URL is generated, you should choose this. If not...
Manually retrieve the parameter
If you can't change how the URL is generated (above), you can read the raw query string yourself. Some methods decode parameters for you, and getParameter is one of them. getQueryString, on the other hand, does not decode the string. If you have only a few parameters, it shouldn't be difficult to parse the value yourself.
String rawQuery = request.getQueryString();
// param=5FHjiSJ6NOTmi7/+2tnnkQ== (not decoded; the leading '?' is not included)
If you want to use the '+' character in a URL, you need to encode it when the URL is generated. The correct encoding for '+' is %2B.
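Both options can be sketched without a servlet container. In the code below, rawValue is a hypothetical helper that pulls an undecoded value out of a raw query string such as the one returned by getQueryString, and the main method shows what proper encoding on the generating side looks like. Treat it as an illustration under those assumptions rather than a complete query parser.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class RawQueryParam {
    // Returns the undecoded value of `name` from a raw query string, or null if absent.
    static String rawValue(String rawQuery, String name) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');  // split on the first '=' only
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                return eq < 0 ? "" : pair.substring(eq + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String rawQuery = "param=5FHjiSJ6NOTmi7/+2tnnkQ==";

        // Option 2: read the raw value, so '+' is kept as-is
        System.out.println(rawValue(rawQuery, "param"));  // 5FHjiSJ6NOTmi7/+2tnnkQ==

        // What form decoding does to the same value: '+' becomes a space
        System.out.println(URLDecoder.decode("5FHjiSJ6NOTmi7/+2tnnkQ==", "UTF-8"));

        // Option 1: encode when generating the URL, so '+' travels as %2B
        System.out.println(URLEncoder.encode("5FHjiSJ6NOTmi7/+2tnnkQ==", "UTF-8"));
    }
}
```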
Use the static methods of URLEncoder and URLDecoder to encode and decode URL parameters.
For example:
Encode the URL parameter using
URLEncoder.encode(url,"UTF-8")
On the server side, decode the parameter using
URLDecoder.decode(url,"UTF-8")
The decode method returns the decoded value as a String.
Although the question is some years old, I'd like to write down how I fixed the problem in my case: the download link to a file is created in a GWT page where
com.google.gwt.http.client.URL.encode(finalurl)
is used to encode the URL.
The problem was that the "+" sign one of our customers had in a file name wasn't encoded/escaped. So I had to remove URL.encode(finalurl) and encode each parameter in the URL with
URL.encodePathSegment(fileName)
I know my answer is specific to GWT, but it seems that URLEncoder.encode(string, encoding) should likewise be applied to the parameter only.

Unable to decode Russian string with encodeURIComponent and java.net.URLDecoder.decode

I have a Russian string, "этикетка", that I need to send to a web service. Before sending it, I encode the string with encodeURIComponent, which gives me:
'%D1%8D%D1%82%D0%B8%D0%BA%D0%B5%D1%82%D0%BA%D0%B0'
On the web service side, I receive the string and decode it using the following code:
String strLbl = java.net.URLDecoder.decode(label);
but I don't get the original string back. It comes out garbled and I get ѿтикетка.
Can you please suggest how I can overcome this, or what the ideal way to send a Russian string is?
Thanks and regards
As explained in the link given by NULL, decode(string) is now deprecated in favour of decode(string, encoding).
I would guess that the encoding and decoding steps are not using the same character encoding.
Did you try forcing UTF-8 on both sides?
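A small sketch of the two-argument form, assuming the client sent UTF-8 (which is what encodeURIComponent produces). The class and helper names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeLabel {
    // Decodes a percent-encoded value, always interpreting the bytes as UTF-8.
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String label = "%D1%8D%D1%82%D0%B8%D0%BA%D0%B5%D1%82%D0%BA%D0%B0";
        String decoded = decodeUtf8(label);
        // Compare against "этикетка" written with escapes, independent of console encoding
        System.out.println(decoded.equals("\u044d\u0442\u0438\u043a\u0435\u0442\u043a\u0430"));  // true
    }
}
```

The one-argument decode(String) relies on the platform default encoding, so its result can differ from machine to machine.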
I misunderstood your question because of its formatting.
Use decodeURIComponent to decode URL-encoded strings in JavaScript:
> decodeURIComponent('%D1%8D%D1%82%D0%B8%D0%BA%D0%B5%D1%82%D0%BA%D0%B0')
"этикетка";

url encoding skipping the fqdn

I have a question regarding URL encoding. I am trying to encode a URL and cannot get it working. I tried java.net.URLEncoder.
I have the URL http://msnbcmedia4.msn.com/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg and I need to encode it. From online forums, my understanding is that I should encode only the query parameters and the URL path, excluding the FQDN (http://msnbcmedia4.msn.com). Should I encode the / in the URL path and the ? and & in the parameters, or skip encoding these? I am trying to download the content from this location using Java. Any info would be appreciated.
URLEncoder is the right choice. You need to encode only the individual query string parameter names and values, not the entire URL. If you encode the whole URL, the scheme (http://) and other URL parts get encoded as well, which we don't want.
Check out this awesome answer: https://stackoverflow.com/a/10786112/2093375
Regards,
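For the space in the path itself, one alternative worth noting (not the URLEncoder approach described above) is the multi-argument java.net.URI constructor, which quotes characters that are illegal in each component while leaving the / separators alone. This is a sketch under that assumption, with hypothetical class and method names.

```java
import java.net.URI;
import java.net.URISyntaxException;

class PathEncoding {
    // Builds a URL from its parts; the URI constructor percent-encodes
    // characters that are not legal in the path, such as spaces.
    static String encodePath(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String url = encodePath(
                "http",
                "msnbcmedia4.msn.com",
                "/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg");
        // http://msnbcmedia4.msn.com/i/MSNBC/Components/Photo/_new/130409_luke%20hancock.jpg
        System.out.println(url);
    }
}
```

Characters that are legal in a path, such as + or &, are left untouched by this constructor, so query parameter values still need URLEncoder.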

response.sendredirect with url with foreign chars - how to encode?

I have a JSF app with international users, so form inputs can contain non-Western strings such as kanji and Chinese. If I hit my URL with ..?q=東日本大, the output on the page is correct and the q input in my form is populated fine. But if I enter that same string into my form and submit, my app redirects back to itself after constructing the URL with the populated parameters (this seems redundant, but it is due to a third-party integration), and the redirect does not encode the string properly. I have
url = new String(url.getBytes("ISO-8859-1"), "UTF-8");
response.sendRedirect(url);
But the redirect URL ends up as q=????. I've tried various encoding names in the String constructor (swapping ISO-8859-1 and UTF-8 just produced gibberish in the URL), but none of them gives me q=東日本大. Any ideas as to what I need to do to get q=東日本大 into the redirect properly? Thanks.
How are you making your URL? URIs can't contain non-ASCII characters directly; they have to be turned into bytes (using a particular encoding) and then %-encoded.
URLEncoder.encode should be given an encoding argument to ensure the right encoding is used. Otherwise you get the platform default encoding, which is probably wrong and always to be avoided.
String q= "\u6771\u65e5\u672c\u5927"; // 東日本大
String url= "http://example.com/query?q="+URLEncoder.encode(q, "utf-8");
// http://example.com/query?q=%E6%9D%B1%E6%97%A5%E6%9C%AC%E5%A4%A7
response.sendRedirect(url);
This URI will display as the IRI ‘http://example.com/query?q=東日本大’ in the browser address bar.
Make sure you're serving your pages as UTF-8 (using Content-Type header/meta) and interpreting query string input as UTF-8 (server-specific; see this faq for Tomcat.)
Try
response.setContentType("text/html; charset=UTF-16");
response.setCharacterEncoding("utf-16");
