I have a question regarding URL encoding. I am trying to encode a URL and could not get it working. I tried java.net.URLEncoder.
I have the URL http://msnbcmedia4.msn.com/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg and I need to encode it. From online forums, my understanding is that I should encode only the query parameters and the URL path, excluding the scheme and FQDN (http://msnbcmedia4.msn.com). Do I need to encode the / in the URL path and the ? and & in the parameters, or should I skip encoding these? I am trying to download the content from this specific location using Java. Any info would be appreciated.
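URLEncoder is the right choice for query string parameters. Encode only the individual query string parameter names and values, not the entire URL. If you encode the whole URL, it will also encode the scheme (http://) and the other structural parts, which we don't want. Note that URLEncoder produces application/x-www-form-urlencoded output, so it turns a space into +; in the path part of a URL (such as the file name above) a space has to become %20 instead.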
Check out this awesome answer >> https://stackoverflow.com/a/10786112/2093375
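For the URL in the question, the space sits in the path rather than in a query parameter. Here is a minimal sketch of one way to handle that, assuming the scheme, host and raw path are available as separate strings. It relies on the multi-argument java.net.URI constructor, which percent-encodes characters that are illegal in a path. The class and method names (PathEncodingSketch, encodePath) are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PathEncodingSketch {
    // Hypothetical helper: builds an ASCII-only URL string from unencoded parts.
    // The URI(scheme, host, path, fragment) constructor quotes illegal path
    // characters such as the space, while leaving the / separators alone.
    static String encodePath(String scheme, String host, String rawPath) {
        try {
            return new URI(scheme, host, rawPath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build URI from parts", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodePath("http", "msnbcmedia4.msn.com",
                "/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg"));
        // prints http://msnbcmedia4.msn.com/i/MSNBC/Components/Photo/_new/130409_luke%20hancock.jpg
    }
}
```

The resulting string can then be handed to whatever HTTP client does the download.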
I am using HttpClient, and it works fine for any URL without special characters.
But when I send a URL with special characters, the request fails.
I tried the URL API, but it is deprecated.
I also tried UTF-8, but that did not work either.
Can you suggest a simple way of making the HttpGet call for the URL below?
http://example.com/?status!~^(notdeleted|presesnt)$&env~check_test
String link = "http://example.com/?"
+ URLEncoder.encode("status!~^(notdeleted|presesnt)$&env~check_test", "UTF-8");
Maybe encode it in two parts around the &, if that is meant to separate the next URL parameter.
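The "two parts" variant could look roughly like this. It is only a sketch and assumes the & in the question really is a parameter separator; the class and method names are invented for the example. It uses the URLEncoder.encode(String, Charset) overload, which needs Java 10 or later; on older versions pass "UTF-8" as a string instead.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class TwoPartQuerySketch {
    // Hypothetical helper: encodes each part on its own and keeps the literal
    // & between them, so it still acts as the parameter separator.
    static String buildLink(String base, String firstPart, String secondPart) {
        return base + "?"
                + URLEncoder.encode(firstPart, StandardCharsets.UTF_8)
                + "&"
                + URLEncoder.encode(secondPart, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildLink("http://example.com/",
                "status!~^(notdeleted|presesnt)$", "env~check_test"));
        // prints http://example.com/?status%21%7E%5E%28notdeleted%7Cpresesnt%29%24&env%7Echeck_test
    }
}
```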
I have a URL-encoding problem with Diffbot.
I have a URL, and I pass it when I call the Diffbot API like this:
//JsonNode json= (JsonNode)client.analyze(DiffbotClient.ResponseType.Jackson,url);
But I got an error message about URL encoding. This is the error message I got:
{"errorCode":500,"error":"URL encoding"}
So I changed my code like this:
//JsonNode json= (JsonNode) client.analyze(DiffbotClient.ResponseType.Jackson,u.getHost()+u.getPath()+URLEncoder.encode("?"+u.getQuery(),"UTF-8"));
But it doesn't work, and Diffbot returns this:
{"errorCode":500,"error":"Error."}.
What kind of encoding format does the Diffbot API use?
You're supposed to encode only the URL whose contents you're processing with Diffbot, not the entire API string. For example, replace {{token}} below with your own and visit the URL in the browser. It will work.
Use this as inspiration to build your own URL for the API call:
http://api.diffbot.com/v3/article?token={{token}}&url=http%3A%2F%2Fwww.sitepoint.com%2Fdiffbot-crawling-visual-machine-learning%2F
As you can see, only the url query param is encoded, and it's no special encoding, it's just standard URL percent-encoding.
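In Java, building that kind of API URL might look like the sketch below. The endpoint and parameter names are taken from the example URL above; the helper name articleApiUrl is invented, and the token is assumed to contain only URL-safe characters. URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DiffbotUrlSketch {
    // Hypothetical helper: only the value of the url parameter is encoded;
    // the API endpoint, the ? and the & stay as they are.
    static String articleApiUrl(String token, String pageUrl) {
        return "http://api.diffbot.com/v3/article?token=" + token
                + "&url=" + URLEncoder.encode(pageUrl, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(articleApiUrl("{{token}}",
                "http://www.sitepoint.com/diffbot-crawling-visual-machine-learning/"));
    }
}
```

The string this produces could then be passed on to whatever does the request, instead of encoding the whole thing a second time.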
The request parameter is like decrypt?param=5FHjiSJ6NOTmi7/+2tnnkQ==.
In the servlet, when I try to print the parameter with String param = request.getParameter("param"); I get 5FHjiSJ6NOTmi7/ 2tnnkQ==. It turns the + character into a space. How can I keep the original parameter, or how can I properly handle the + character?
Also, what other characters should I handle?
You have two choices
URL encode the parameter
If you have control over the generation of the URL you should choose this. If not...
Manually retrieve the parameter
If you can't change how the URL is generated (above) then you can manually retrieve the raw URL. Certain methods decode parameters for you. getParameter is one of them. On the other hand, getQueryString does not decode the String. If you have only a few parameters it shouldn't be difficult to parse the value yourself.
request.getQueryString();
// param=5FHjiSJ6NOTmi7/+2tnnkQ== (the leading ? is not part of the returned string)
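Parsing the value yourself could be sketched like this. It works on the plain string returned by getQueryString(), so it has no servlet dependency; the class and method names are made up, and it assumes parameters are separated by & and that you want the value exactly as sent, with no decoding at all.

```java
final class RawQuerySketch {
    // Hypothetical helper: returns the undecoded value of the named parameter,
    // or null if the parameter is not present in the query string.
    static String rawParameter(String queryString, String name) {
        if (queryString == null) {
            return null;
        }
        for (String pair : queryString.split("&")) {
            // Split on the first = only, because the value itself may contain =
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            if (key.equals(name)) {
                return eq >= 0 ? pair.substring(eq + 1) : "";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(rawParameter("param=5FHjiSJ6NOTmi7/+2tnnkQ==", "param"));
        // prints 5FHjiSJ6NOTmi7/+2tnnkQ==
    }
}
```

In the servlet you would call it as rawParameter(request.getQueryString(), "param").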
If you want to use the '+' character in a URL parameter value, you need to percent-encode it when the URL is generated. For '+' the correct encoding is %2B.
Use the static methods of URLEncoder and URLDecoder for encoding and decoding URL parameters.
For example:
Encode the URL parameter using
URLEncoder.encode(url,"UTF-8")
Back on the server side, decode this parameter using
URLDecoder.decode(url,"UTF-8")
The decode method returns the decoded value as a String.
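A small round-trip sketch with the value from the question, to show what the encoding does to + and what happens when it is skipped. The class and method names are invented for the example, and the Charset overloads used here need Java 10 or later (use the "UTF-8" string overloads on older versions).

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PlusRoundTripSketch {
    // Encodes a parameter value so that +, / and = survive form decoding.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Decodes the way form decoding does: + becomes a space, %XX becomes a byte.
    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "5FHjiSJ6NOTmi7/+2tnnkQ==";
        String encoded = encode(original);
        System.out.println(encoded);           // 5FHjiSJ6NOTmi7%2F%2B2tnnkQ%3D%3D
        System.out.println(decode(encoded));   // 5FHjiSJ6NOTmi7/+2tnnkQ==
        System.out.println(decode(original));  // 5FHjiSJ6NOTmi7/ 2tnnkQ== (the + is lost)
    }
}
```

The last line reproduces the symptom from the question: decoding a value that was never encoded turns the + into a space.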
Although the question is some years old, I'd like to write down how I fixed the problem in my case: the download link to a file is created in a GWT page where
com.google.gwt.http.client.URL.encode(finalurl)
is used to encode the URL.
The problem was that the "+" sign one of our customers had in the filename wasn't encoded/escaped. So I had to remove the URL.encode(finalurl) and encode each parameter in the URL with
URL.encodePathSegment(fileName)
I know my answer is tied to GWT, but it seems URLEncoder.encode(string, encoding) should be applied to the parameter only as well.
I am trying to send some HTTP POST parameters to a web server, and one of the parameters contains Cyrillic characters. The problem is that if I use this code:
wc.getPage(requestSettings);
requestSettings.setHttpMethod(HttpMethod.POST);
requestSettings.setRequestParameters(new ArrayList());
requestSettings.getRequestParameters().add(new NameValuePair("username", "Друже бобер"));
wc.getPage(requestSettings);
The server will receive the following URL-encoded parameter:
And this is the wrongly decoded string "Друже бобер".
So I think that HtmlUnit encodes the URL internally using ASCII, not Unicode. How can I disable URL encoding, or how can I fix this bug? If I encode this string myself and set it in the NameValuePair, all the percent characters will be encoded by HtmlUnit too.
I think you need to set the charset using the setCharset method.
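To see why the charset matters, here is a plain-JDK sketch (it does not use HtmlUnit at all, and the class and method names are invented) that form-encodes the same value with two different charsets. With UTF-8 every Cyrillic letter becomes two percent-encoded bytes; with a charset that cannot represent Cyrillic, such as ISO-8859-1, the letters are replaced before encoding and the original text cannot be recovered. Requires Java 10 or later for the Charset overloads.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetEncodingSketch {
    // Hypothetical helper: form-encodes a value using the given charset.
    static String formEncode(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    public static void main(String[] args) {
        String name = "Друже бобер";
        String utf8 = formEncode(name, StandardCharsets.UTF_8);
        String latin1 = formEncode(name, StandardCharsets.ISO_8859_1);
        System.out.println(utf8);
        System.out.println(latin1);
        // Decoding with the same charset only restores the UTF-8 variant.
        System.out.println(URLDecoder.decode(utf8, StandardCharsets.UTF_8).equals(name));
        System.out.println(URLDecoder.decode(latin1, StandardCharsets.ISO_8859_1).equals(name));
    }
}
```

Whatever client you use, the charset it encodes with has to match the one the server decodes with.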
I have a JSF app with international users, so form inputs can contain non-Western strings such as kanji and Chinese. If I hit my URL with ..?q=東日本大, the output on the page is correct and the q input in my form is populated fine. But if I enter that same string into my form and submit, my app redirects back to itself after constructing the URL with the populated parameters (this seems redundant, but it is due to a third-party integration), and the redirect does not encode the string properly. I have
url = new String(url.getBytes("ISO-8859-1"), "UTF-8");
response.sendRedirect(url);
But the redirect URL ends up being q=????. I've played around with various encoding strings in the String constructor (switched around ISO and UTF-8 and just got a bunch of gibberish in the URL), but none seem to work to where I get q=東日本大. Any ideas as to what I need to do to get q=東日本大 populated in the redirect properly? Thanks.
How are you making your URL? URIs can't directly contain non-ASCII characters; they have to be turned into bytes (using a particular encoding) and then %-encoded.
URLEncoder.encode should be given an encoding argument, to ensure this is the right encoding. Otherwise you get the default encoding, which is probably wrong and always to be avoided.
String q= "\u6771\u65e5\u672c\u5927"; // 東日本大
String url= "http://example.com/query?q="+URLEncoder.encode(q, "utf-8");
// http://example.com/query?q=%E6%9D%B1%E6%97%A5%E6%9C%AC%E5%A4%A7
response.sendRedirect(url);
This URI will display as the IRI ‘http://example.com/query?q=東日本大’ in the browser address bar.
Make sure you're serving your pages as UTF-8 (using Content-Type header/meta) and interpreting query string input as UTF-8 (server-specific; see this faq for Tomcat.)
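One plausible explanation for the q=???? in the question, sketched with plain JDK calls: ISO-8859-1 has no mapping for these characters, so getBytes("ISO-8859-1") replaces each of them with a question mark before the String constructor ever sees them. The class and method names below are invented, and this assumes the url string held the correct characters before the conversion. The Charset overload of URLEncoder.encode needs Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class RedirectEncodingSketch {
    // Mirrors the conversion from the question: the text is lost at getBytes,
    // because ISO-8859-1 cannot represent the characters.
    static String lossyConversion(String text) {
        return new String(text.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Percent-encodes only the parameter value, as in the answer above.
    static String redirectUrl(String q) {
        return "http://example.com/query?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String q = "\u6771\u65e5\u672c\u5927"; // 東日本大
        System.out.println(lossyConversion(q)); // ????
        System.out.println(redirectUrl(q));
        // http://example.com/query?q=%E6%9D%B1%E6%97%A5%E6%9C%AC%E5%A4%A7
    }
}
```

So dropping the ISO-8859-1 conversion and encoding just the parameter value before sendRedirect is probably what you want.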
Try
response.setContentType("text/html; charset=UTF-16");
response.setCharacterEncoding("utf-16");