bad encoding for xml - java

I have a string like this: "<person name="peter"></person>"
I URL-encode it with:
URLEncoder.encode(person.toString(),"UTF-8");
But the result is not what I expected: spaces become + instead of %20, and = is replaced with another value. Can you help me?

This is exactly as specified in the URLEncoder javaDoc. Space is converted to + and = is "unsafe" and thus encoded to %3D.
If you need a %20 instead of the +, just do some post processing:
URLEncoder.encode(person.toString(),"UTF-8").replace("+", "%20");

Considering your comment, I assume you want to decode the web service's answer.
// the answer you receive from the web service
String webserviceResponse = "%3Cperson+name%3D%22peter%22%3E%3C%2Fperson%3E";
// turn it into a "good" XML string
String person = URLDecoder.decode(webserviceResponse, "UTF-8");
will give you
<person name="peter"></person>
as the value of person.
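A minimal round-trip sketch of the encode and decode steps described above. It assumes Java 10 or later (for the Charset overloads of URLEncoder and URLDecoder); the class and method names are made up for illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PersonRoundTrip {
    // Form-encode, then swap '+' for "%20". This is safe because a literal
    // '+' in the input has already been encoded as "%2B" at this point.
    static String encodeWithPercent20(String xml) {
        return URLEncoder.encode(xml, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // URLDecoder accepts both '+' and "%20" as a space.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<person name=\"peter\"></person>";
        String encoded = encodeWithPercent20(xml);
        System.out.println(encoded); // %3Cperson%20name%3D%22peter%22%3E%3C%2Fperson%3E
        System.out.println(decode(encoded)); // <person name="peter"></person>
    }
}
```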

Related

encode some key's value of String URL for HTTP Get Request

I need to encode only the parameters of a URL string.
My URL string looks like this: http://127.0.0.1:8070/app/api/fetchData?channel=abc&param=status:new|addr:null|roomId:Default&group=iPh&reqtype=p1&serialNo=123890&codeId=A1_8uh&type=p
I want to encode the value of the param key. I am working on a Spring Boot project.
Please suggest a solution.
If you're making the request from Java, you can encode a string using base64 like this[1]:
String originalInput = "test input";
String encodedString = Base64.getEncoder().encodeToString(originalInput.getBytes());
I guess my first thought would be to encode the parameters that you want this way and then concatenate the whole thing together. What value are you trying to get out of encoding the parameters?
[1] https://www.baeldung.com/java-base64-encode-and-decode
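If the goal is percent-encoding rather than Base64, one possible approach is to run only the value of param through java.net.URLEncoder and concatenate the rest unchanged. This is a sketch under that assumption, not necessarily what the server expects; the class and method names are hypothetical and the Charset overload requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ParamEncoding {
    // Only the value of "param" is encoded; the other parts of the URL
    // from the question are left exactly as they were.
    static String buildUrl(String paramValue) {
        return "http://127.0.0.1:8070/app/api/fetchData?channel=abc&param="
                + URLEncoder.encode(paramValue, StandardCharsets.UTF_8)
                + "&group=iPh&reqtype=p1&serialNo=123890&codeId=A1_8uh&type=p";
    }

    public static void main(String[] args) {
        // ':' becomes %3A and '|' becomes %7C
        System.out.println(buildUrl("status:new|addr:null|roomId:Default"));
    }
}
```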

Encode URL with US-ASCII character set

I refer to the following web site:
http://coderstoolbox.net/string/#!encoding=xml&action=encode&charset=us_ascii
Choosing "URL", "Encode", and "US-ASCII", the input is converted to the desired output.
How do I produce the same output in Java?
Thanks in advance.
I used this and it seems to work fine.
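import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Characters that pass through unencoded
private static final Pattern DO_NOT_REPLACE = Pattern.compile("[a-zA-Z0-9]");

public static String encode(String input) {
    return input.chars().mapToObj(c -> {
        if (!DO_NOT_REPLACE.matcher(String.valueOf((char) c)).matches()) {
            // Two uppercase hex digits below 256, %uXXXX otherwise
            return c < 256 ? String.format("%%%02X", c) : String.format("%%u%04X", c);
        }
        return String.valueOf((char) c);
    }).collect(Collectors.joining());
}
PS: I use 256 as the cutoff so that only characters at or above 256 get the u prefix. Characters below 256 (ASCII and Latin-1) are written as plain two-digit hex escapes. The hex digits are formatted in upper case directly, so the unencoded letters of the input keep their original case.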
Alternate option:
There is a built-in Java class (java.net.URLEncoder) that does URL encoding, but it works a little differently. For example, it replaces the space character with + rather than %20, and it leaves ".", "-", "*" and "_" unencoded. See if it helps:
String encoded = URLEncoder.encode(input, "US-ASCII");
Hope this helps!
You can use ESAPI.encoder().encodeForURL(linkString) from OWASP ESAPI.
For background on percent-encoding, see https://en.wikipedia.org/wiki/Percent-encoding
Please comment if that does not satisfy your requirement or if you face any other issue.
Thanks

Need to replace spaces inside a string with the percent symbol in Java

I need to replace the spaces inside a string with the % symbol but I'm having some issues, what I tried is:
imageUrl = imageUrl.replace(' ', "%20");
But it gives me an error in the replace function.
Then:
imageUrl = imageUrl.replace(' ', "%%20");
But it still gives me an error in the replace function.
Then I tried with the Unicode symbol:
imageUrl = imageUrl.replace(' ', (char) U+0025 + "20");
But it still gives an error.
Is there an easy way to do it?
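String.replace(String, String) is the method you want. replace has two overloads, (char, char) and (CharSequence, CharSequence), so passing a char together with a String does not compile.
Replace
imageUrl.replace(' ', "%");
with
imageUrl.replace(" ", "%");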
System.out.println("This is working".replace(" ", "%"));
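A small sketch of the two replace overloads mentioned above; the class and method names are made up for illustration.

```java
public class ReplaceSpaces {
    // CharSequence overload: every literal space becomes "%20".
    static String withPercent20(String s) {
        return s.replace(" ", "%20");
    }

    public static void main(String[] args) {
        System.out.println(withPercent20("This is working")); // This%20is%20working
        // char overload: both arguments must be chars.
        System.out.println("This is working".replace(' ', '%')); // This%is%working
    }
}
```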
I suggest using a URL encoder for encoding strings in Java.
String searchQuery = "list of banks in the world";
String url = "http://mypage.com/pages?q=" + URLEncoder.encode(searchQuery, "UTF-8");
I've run into issues like this in the past with certain frameworks. I don't have enough of your code to know for sure, but your HTTP framework (Spring, in my case) may be encoding the URL a second time. I spent a few days on a similar problem, convinced that String.replace and Uri.Builder were broken. The real cause was that the framework took my already-encoded URL and encoded it again: wherever it saw "%20", it replaced the '%' character with its own percent-encoding, "%25", producing "%2520". The request then failed because %2520 does not decode to the space my server expected. It looked as if my encoding was not working, but the URL was actually being encoded too many times. Below is an example from working code in one of my projects.
//the Url of the server
String fullUrl = "http://myapiserver.com/path/";
//The parameter to append. contains a space that will need to be encoded
String param1 = "parameter 1";
//Use Uri.Builder to append parameter
Uri.Builder uriBuilder = Uri.parse(fullUrl).buildUpon();
uriBuilder.appendQueryParameter("parameter1",param1);
/* Below is where it is important to understand how your
HTTP framework handles unencoded URLs. In my case, which is the Spring
framework, URLs are encoded when performing requests.
The result is that a URL that is already encoded will be
encoded twice. For instance, if your URL is
"http://myapiserver.com/path?parameter1=param 1"
and it needs to be read by the server as
"http://myapiserver.com/path?parameter1=param%201"
it makes sense to encode the URL using Uri.Builder, or any valid
solution listed in other posts. However, if the framework is already
encoding your URL, you are likely to run into the issue where you
accidentally encode the URL twice: once when you are preparing the URL to be
sent, and once again when you are sending the message through the framework.
This results in sending a URL that looks like
"http://myapiserver.com/path?parameter1=param%25201"
where the '%' in "%20" was replaced with "%25", the percent-encoding of '%',
when what you wanted was
"http://myapiserver.com/path?parameter1=param%201"
This can be a difficult bug to squash because you can copy the URL in the
debugger prior to it being sent, paste it into a tool like Fiddler, and
have the Fiddler request work but the program request fail.
Since my HTTP framework was already encoding the URLs, I had to unencode the
URLs after appending the parameters so they would only be encoded once.
I'm not saying it's the most graceful solution, but the code works.
*/
String finalUrl = uriBuilder.build().toString().replace("%2F","/")
.replace("%3A", ":").replace("%20", " ");
//Call the server and ask for the menu. the Menu is saved to a string
//rest.GET() uses the Spring framework. The URL is encoded again as
//part of the framework.
menuStringFromIoms = rest.GET(finalUrl);
There is likely a more graceful way to keep a URL from being encoded twice. I hope this example helps point you in the right direction or eliminate a possibility. Good luck.
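The double-encoding effect described in this answer can be reproduced without any framework. The following is a minimal sketch using java.net.URLEncoder in place of whatever the HTTP framework does internally (an assumption for illustration); the class and method names are hypothetical, and the Charset overloads require Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DoubleEncoding {
    // One round of percent-encoding, with spaces written as "%20".
    static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(String[] args) {
        String once = encodeOnce("param 1");   // param%201
        String twice = encodeOnce(once);       // param%25201, the '%' itself got encoded
        System.out.println(once);
        System.out.println(twice);
        // A server that decodes once sees "param%201" instead of "param 1".
        System.out.println(URLDecoder.decode(twice, StandardCharsets.UTF_8));
    }
}
```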
Try this:
imageUrl = imageUrl.replaceAll(" ", "%20");
Replacing spaces is not enough. Try this (apply it to individual values rather than a complete URL, since it also escapes ':' and '/'):
url = java.net.URLEncoder.encode(url, "UTF-8");

Extract parameters from URL

I have problems with the character + (and maybe others). URIBuilder is supposed to take a decoded URL, but when I extract the query parameters the + is replaced:
String decodedUrl = "www.foo.com?sign=AZrhQaTRSiys5GZtlwZ+H3qUyIY=&more=boo";
URIBuilder builder = new URIBuilder(decodedUrl);
List<NameValuePair> params = builder.getQueryParams();
String sign = params.get(0).getValue();
The value of sign is AZrhQaTRSiys5GZtlwZ H3qUyIY= with a space instead of the +. How can I extract the correct value?
Another way is:
URI uri = new URI(decodedUrl);
String query = uri.getQuery();
The value of query is sign=AZrhQaTRSiys5GZtlwZ+H3qUyIY=&more=boo, which is correct in this case, but then I have to split it up myself. Is there another way to do that?
Use it differently:
String decodedUrl = "www.foo.com";
URIBuilder builder = new URIBuilder(decodedUrl);
builder.addParameter("sign", "AZrhQaTRSiys5GZtlwZ+H3qUyIY=");
builder.addParameter("more", "boo");
List<NameValuePair> params = builder.getQueryParams();
String sign = params.get(0).getValue();
The addParameter method is responsible for adding parameters to the builder. The constructor of the builder should receive the base URL only.
If this URL is given to you as is, then the + is already the encoded form of the space character. If you are the one who generates this URL, you probably skipped the URL encoding step (which can be done using the code snippet above).
Read a bit about URL encoding: http://en.wikipedia.org/wiki/Query_string#URL_encoding
That is because a space sent as a parameter in a URL is encoded as +. There are rules about which characters are valid in a URL; see the URL RFC:
It is necessary to encode any characters disallowed in a URL, including spaces and other binary data not in the allowed character set, using the standard convention of the "%" character followed by two hexadecimal digits.
If you want a literal + in the URL, you need to encode it as %2B. For example, 2+2 is encoded as 2%2B2, and "i am" is encoded as i+am. So in your case I believe the result you get is correct, because AZrhQaTRSiys5GZtlwZ+H3qUyIY decodes into AZrhQaTRSiys5GZtlwZ H3qUyIY.
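The behaviour described above can be checked with java.net.URLEncoder and java.net.URLDecoder. This is a sketch using the JDK classes rather than the URIBuilder from the question; the class and method names are made up for illustration, and the Charset overloads require Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class PlusSign {
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sign = "AZrhQaTRSiys5GZtlwZ+H3qUyIY=";
        // Decoding the raw value turns the unencoded '+' into a space.
        System.out.println(decode(sign));         // AZrhQaTRSiys5GZtlwZ H3qUyIY=
        // Encoding first protects the '+' (and the '=').
        System.out.println(encode(sign));         // AZrhQaTRSiys5GZtlwZ%2BH3qUyIY%3D
        System.out.println(decode(encode(sign))); // AZrhQaTRSiys5GZtlwZ+H3qUyIY=
        System.out.println(encode("2+2"));        // 2%2B2
        System.out.println(encode("i am"));       // i+am
    }
}
```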

How to represent a string URL for special character?

I am a newbie to this and couldn't find an exact answer. I have special characters in a URL, such as
"&", "#", "?", "<"
and they cause problems. (If someone can suggest how to deal with such a situation, that would be an additional help.) My main problem is: how can I represent a string literal in Java for the following kind of URL?
"x###y"
I learned that we need to use the hex code value of each character (prefixed with %). Can someone suggest the exact way to fix this URL problem?
You'll need to URL-encode the address.
See:
http://download.oracle.com/javase/1.5.0/docs/api/java/net/URLEncoder.html
java.net.URLEncoder.encode(YOUR_STRING, "UTF-8");
See also this question for a way to only encode the part of the url you need:
The answer depends on where the data is in the URL. There will be different encoding rules for different parts of the URL.
The exact form may also depend on what URI format the server is expecting.
Parameters in the query part can usually be encoded as application/x-www-form-urlencoded using the URLEncoder:
String query = URLEncoder.encode("key1", "UTF-8")
+ "="
+ URLEncoder.encode("value1", "UTF-8")
+ "&"
+ URLEncoder.encode("key2", "UTF-8")
+ "="
+ URLEncoder.encode("value2", "UTF-8");
If you need to encode in other parts of the URI (the path part, or the fragment part) read this.
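For the path part, one option in the JDK is the multi-argument java.net.URI constructor, which quotes characters that are illegal in the component it is given. The sketch below assumes a placeholder host, example.com, and hypothetical class and method names; it also shows the form-encoded version of the "x###y" string from the question for comparison (Charset overload, Java 10 or later).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UriParts {
    // Build a URI whose path is percent-encoded by the URI constructor itself.
    // "example.com" is a placeholder host, not taken from the question.
    static String withPath(String path) {
        try {
            return new URI("http", "example.com", path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // '#' is not legal inside a path, so each one is written as %23.
        System.out.println(withPath("/x###y"));
        // As a query value, URLEncoder gives the same escapes for '#'.
        System.out.println(URLEncoder.encode("x###y", StandardCharsets.UTF_8)); // x%23%23%23y
    }
}
```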
URLEncoder is not for encoding URLs; it is there to encode form data.
See the following link for more details:
HTTP URL Address Encoding in Java
