how to replace brackets in url with bracket encoding? - java

I need a regex pattern that finds brackets in URLs and replaces them with their percent-encoded form.
For example a base url like:
http://www.mysite.com/bla/blabla/abc[1].txt
will be turned to:
http://www.mysite.com/bla/blabla/abc%5B1%5D.txt
Can anyone help, please?
EDIT1:
I originally used commons-httpclient to access this kind of URL.
When I use the first URL, I get an "escaped absolute path not valid" exception.
I can't use URLEncoder because when I use it, I get a "host parameter is null" exception.

The following line should do the trick
String s = URLEncoder.encode("http://www.mysite.com/bla/blabla/abc[1].txt", "UTF-8");

Have you tried URLEncoder.encode?
It is in the java.net package (java.net.URLEncoder).
EDIT:
OK, I see: you cannot pass an entire URL to URLEncoder. URLEncoder is mostly used to encode query parameters.
Try this instead:
URI uri = new URI("http", "www.mysite.com", "/bla/blabla/abc[1].txt",null);
System.out.println(uri.toASCIIString());
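For the literal question (replace only the brackets), a plain string replacement is enough and no regex is needed. The following is a minimal sketch next to the URI approach above. It assumes brackets never need to stay literal anywhere in the URL (an IPv6 host such as http://[::1]/ would break this assumption), and the class and method names are made up for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class BracketEncoding {

    // Replaces every '[' and ']' with its percent-encoded form.
    // Assumption: brackets never appear in the host part (e.g. IPv6 literals).
    static String encodeBrackets(String url) {
        return url.replace("[", "%5B").replace("]", "%5D");
    }

    // Alternative: let java.net.URI quote the illegal characters in the path.
    static String encodeWithUri(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String url = "http://www.mysite.com/bla/blabla/abc[1].txt";
        System.out.println(encodeBrackets(url));
        System.out.println(encodeWithUri("http", "www.mysite.com", "/bla/blabla/abc[1].txt"));
    }
}
```

Both calls should print http://www.mysite.com/bla/blabla/abc%5B1%5D.txt. The URI variant also quotes spaces and other illegal path characters, so it is probably the safer choice when the path is not fully under your control.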

Related

How to use a Url having special characters in HttpGet(URL) in java

I am using HttpClient, and it works fine for any URL without special characters.
But when I send a URL containing special characters, the request fails.
I tried the URL API, but it is deprecated.
I also tried UTF-8, but that did not work either.
Can you suggest a simple way of making the HttpGet call for the URL below?
http://example.com/?status!~^(notdeleted|presesnt)$&env~check_test
String link = "http://example.com/?"
+ URLEncoder.encode("status!~^(notdeleted|presesnt)$&env~check_test", "UTF-8");
If & is meant to separate two URL parameters, encode the two parts around it separately and join them with a literal &.
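The "two parts" variant could look roughly like this. It is a sketch that assumes & really is a parameter separator in the URL from the question; whether the server expects the operators (!~, ^, |, $) in encoded form is not something the question tells us. The class and method names are hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryPartsExample {

    // Form-encodes one query part; UTF-8 is always available, so the
    // checked exception is wrapped instead of propagated.
    static String encode(String part) {
        try {
            return URLEncoder.encode(part, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encodes each part on its own and joins the results with a literal '&'.
    static String buildLink(String base, String... parts) {
        StringBuilder sb = new StringBuilder(base).append('?');
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append('&');
            }
            sb.append(encode(parts[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String link = buildLink("http://example.com/",
                "status!~^(notdeleted|presesnt)$", "env~check_test");
        System.out.println(link);
    }
}
```

The resulting string can then be passed to the HttpGet constructor in place of the raw URL.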

URL percent encoding query param Bing API Java

I'm trying to URL percent encode my query param value while using URIBuilder to make an HTTP request to Bing API.
The url looks like
"https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?$format=json&Query="
Where the Query String must be like
%27Test%20query%27
With URLEncoder.encode(string, code), a string such as "test query" gets turned into "test+query", which is unacceptable.
URIUtil.encodeQuery() returns "test%20query", which is almost acceptable, except that it needs the %27 at the beginning and end.
When I try to just concatenate the string to make it valid as such, and then load this into URIBuilder, URIBuilder ends up with
https://api.datamarket.azure.com/Data.ashx/Bing/SearchWeb/v1/Web?%24format=json&Query=%2527test%2520query%2527
which is again unacceptable.
How can I remedy this issue? It's driving me insane.
Thanks for any help.
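That is an encoded URI:
$ is encoded as %24
a blank (space) is encoded as %20
% itself is encoded as %25, which is why %27 became %2527
If you want the original URI, you need to decode it.
I think the decode method will work well for you.
Reference: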
http://hc.apache.org/httpclient-3.x/apidocs/org/apache/commons/httpclient/util/URIUtil.html
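The %2527 in the question is the sign of encoding twice: the value was already percent-encoded and was then encoded again. One way around it is to encode exactly once and not hand the result to anything that encodes again. A minimal sketch using only the JDK, with hypothetical class and method names; it relies on URLEncoder turning a literal + into %2B, so every + left in its output stands for a space:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PercentEncodeValue {

    // Form-encodes the value, then rewrites '+' (space) as "%20".
    static String percentEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The quotes are part of the raw value and are encoded together with it.
        String query = percentEncode("'Test query'");
        System.out.println(query);
    }
}
```

This should print %27Test%20query%27. If a URI builder is used instead, the raw value ('Test query') is probably what it should be given, so that the builder does the single round of encoding.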

url encoding skipping the fqdn

I have a question regarding URL encoding. I am trying to encode a URL and cannot get it working. I tried java.net.URLEncoder.
I have the URL http://msnbcmedia4.msn.com/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg and I need to encode it. From online forums, my understanding is that I should encode only the query parameters and the URL path, excluding the FQDN (http://msnbcmedia4.msn.com). Do I need to encode the / in the URL path and the ? and & in the parameters, or should I skip encoding these? I am trying to download the content from this specific location using Java. Any info would be appreciated.
URLEncoder is the right choice for query parameters. You need to encode only the individual query string parameter names and values, not the entire URL. If you encode the whole URL, it will also encode http:// and the other URL parts, which we don't want.
Check out this awesome answer >> https://stackoverflow.com/a/10786112/2093375
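To make the difference concrete, here is a small sketch that runs the URL from the question through both approaches. The class and method names are invented for the example; the URI constructor used is the four-argument one (scheme, host, path, fragment):

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class WholeUrlVsPath {

    // What happens when the entire URL goes through URLEncoder.
    static String encodeWholeUrl(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Keeps scheme and host as they are and quotes only illegal path characters.
    static String encodePathOnly(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/i/MSNBC/Components/Photo/_new/130409_luke hancock.jpg";
        // The scheme separator and every slash get encoded, so this is no longer a usable URL.
        System.out.println(encodeWholeUrl("http://msnbcmedia4.msn.com" + path));
        // Only the space is changed, to %20.
        System.out.println(encodePathOnly("http", "msnbcmedia4.msn.com", path));
    }
}
```

Note that URLEncoder would also turn the space into +, which is the form encoding for query values and not what a path segment needs.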

Search and replace "/" at end of url's using regular expressions in java

Below is my regular expression:
\\bhttps?://[-a-zA-Z0-9+&##/%?=~_|!:,.;]*[-a-zA-Z0-9+&##/%=~_|]\\b
When the request URL is of the form http://www.example.com/, the last character is not matched, so it is not replaced in my shortened URL and the / ends up appended at the end.
The regex is not able to find the last /.
Please help with this.
I think that / would be a word boundary, so maybe it works better if you add a ? to the end, so it reads:
\\bhttps?://[-a-zA-Z0-9+&##/%?=~_|!:,.;]*[-a-zA-Z0-9+&##/%=~_|]\\b?
What about:
if (url.endsWith("/"))
    url = url.substring(0, url.length() - 1);
or if you need to use regular expressions you can do something like this:
url = url.replaceAll("(\\bhttps?://[-a-zA-Z0-9+&##/%?=~_|!:,.;]*)/(\\b?)","$1$2");
If all you want is to remove the trailing / (which is what your question directly asks), and you already know the URL ends with one (otherwise substring cuts at whatever / comes last), you can simply do:
url = url.substring(0, url.lastIndexOf('/'));
Remember to KISS often.
You could simply use:
url = url.replaceAll("/+$", "");
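A quick way to compare the suggestions is to run them on URLs with and without a trailing slash. This is only a sketch with made-up class and method names; it shows why the lastIndexOf variant needs the URL to end with / before it is applied:

```java
public class TrailingSlash {

    // Removes one or more slashes at the very end of the string.
    static String stripTrailingSlashes(String url) {
        return url.replaceAll("/+$", "");
    }

    // The lastIndexOf variant: cuts at the last '/', wherever it is.
    static String cutAtLastSlash(String url) {
        return url.substring(0, url.lastIndexOf('/'));
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingSlashes("http://www.example.com/")); // http://www.example.com
        System.out.println(stripTrailingSlashes("http://www.example.com"));  // http://www.example.com
        System.out.println(cutAtLastSlash("http://www.example.com/"));       // http://www.example.com
        System.out.println(cutAtLastSlash("http://www.example.com"));        // http:/
    }
}
```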

Encode URL query parameters

How can I encode URL query parameter values? I need to replace spaces with %20, accents, non-ASCII characters etc.
I tried to use URLEncoder but it also encodes / character and if I give a string encoded with URLEncoder to the URL constructor I get a MalformedURLException (no protocol).
URLEncoder has a very misleading name. According to the Javadocs, it is used to encode form parameters using the MIME type application/x-www-form-urlencoded.
That said, it can be used to encode, e.g., query parameters. For instance, if a parameter looks like &/?#, its encoded equivalent can be used as:
String url = "http://host.com/?key=" + URLEncoder.encode("&/?#", "UTF-8");
Unless you have those special needs, the URL Javadocs suggest using new URI(..).toURL, which performs URI encoding according to RFC 2396:
The recommended way to manage the encoding and decoding of URLs is to use URI.
The following sample
new URI("http", "host.com", "/path/", "key=| ?/#ä", "fragment").toURL();
produces the result http://host.com/path/?key=%7C%20?/%23ä#fragment. Note how characters such as ?&/ are not encoded.
For further information, see the posts HTTP URL Address Encoding in Java or how to encode URL to avoid special characters in java.
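The two mechanisms described above can be put side by side in a short sketch. The class and method names are hypothetical, and the inputs are simplified (ASCII only) so the results are easy to check by hand:

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class EncoderVsUri {

    // Form encoding: every reserved character in the value is escaped.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // URI encoding: only characters that are illegal in the component are quoted.
    static String buildUri(String authority, String path, String query, String fragment) {
        try {
            return new URI("http", authority, path, query, fragment).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("http://host.com/?key=" + formEncode("&/?#"));
        System.out.println(buildUri("host.com", "/path/", "key=a b", "frag"));
    }
}
```

The first line should come out as http://host.com/?key=%26%2F%3F%23 and the second as http://host.com/path/?key=a%20b#frag, which matches the point that the URI constructor leaves characters such as = and / in the query alone.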
EDIT
Since your input is a string URL, using one of the parameterized constructors of URI will not help you. Neither can you use new URI(strUrl) directly, since it doesn't quote URL parameters.
So at this stage we must use a trick to get what you want:
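// Needs java.net.URL and java.net.URI.
public URL parseUrl(String s) throws Exception {
    // java.net.URL splits the string into components without validating their characters.
    URL u = new URL(s);
    // The multi-argument URI constructor quotes illegal characters in each component.
    // Note: it always quotes '%', so input that is already percent-encoded gets encoded twice.
    return new URI(
            u.getProtocol(),
            u.getAuthority(),
            u.getPath(),
            u.getQuery(),
            u.getRef())
            .toURL();
}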
Before you can use this routine, you have to sanitize your string to ensure it represents an absolute URL. I see two approaches to this:
1. Guessing: prepend http:// to the string unless it's already present.
2. Construct the URL from a context using new URL(URL context, String spec).
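The "guessing" approach combined with the parse-and-rebuild trick might look like the sketch below. It assumes http is an acceptable default scheme and that the input is not already percent-encoded; the class and method names are made up, and on recent JDKs new URL(String) compiles with a deprecation warning:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public class SanitizeAndParse {

    // Prepends "http://" unless the string already starts with http:// or https://.
    static String ensureAbsolute(String s) {
        return s.matches("(?i)^https?://.*") ? s : "http://" + s;
    }

    // Splits the string with java.net.URL, then rebuilds it through the
    // multi-argument URI constructor so illegal characters get quoted.
    static URL toEncodedUrl(String s) {
        try {
            URL u = new URL(ensureAbsolute(s));
            return new URI(u.getProtocol(), u.getAuthority(), u.getPath(),
                    u.getQuery(), u.getRef()).toURL();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toEncodedUrl("host.com/a b.txt"));
        System.out.println(toEncodedUrl("http://host.com/p?key=a b"));
    }
}
```

With these inputs the output should be http://host.com/a%20b.txt and http://host.com/p?key=a%20b.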
So what you're saying is that you want to encode part of your URL but not the whole thing. Sounds to me like you'll have to break it up into parts, pass the ones that you want encoded through the encoder, and re-assemble it to get your whole URL.
