String Encoding in Java

I have a String: String a = "123+>jo"; I want to encode the String so that I can use it in a redirect to a URL. I initially tried URLEncoder, but when the value is decoded with URLDecoder the + (plus) is removed, so I lose data. What is the right way to encode so that I get the same string back after decoding?

URLEncoder works perfectly. The plus sign is successfully encoded as %2B.
Encoding: Works
Here is the IDEONE project:
http://ideone.com/zMDur
import java.net.URLEncoder;
// ...
public static void main (String[] args) throws java.lang.Exception
{
String str = "123+>jo";
String str2 = "http://1.com/23+>jo";
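// The one-argument encode(String) is deprecated and uses the platform default charset.
System.out.println(URLEncoder.encode(str, "UTF-8"));
System.out.println(URLEncoder.encode(str2, "UTF-8"));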
}
prints:
123%2B%3Ejo
http%3A%2F%2F1.com%2F23%2B%3Ejo
Encoding + Decoding: Works
The IDEONE project with decoding as well: http://ideone.com/Ypfv4
import java.net.URLEncoder;
import java.net.URLDecoder;
// ...
public static void main (String[] args) throws java.lang.Exception
{
String str = "123+>jo";
String str2 = "http://1.com/23+>jo";
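// Pass the charset explicitly; the one-argument overloads are deprecated.
System.out.println(URLDecoder.decode(URLEncoder.encode(str, "UTF-8"), "UTF-8"));
System.out.println(URLDecoder.decode(URLEncoder.encode(str2, "UTF-8"), "UTF-8"));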
}
Prints:
123+>jo
http://1.com/23+>jo
So everything works using the java.net.URLEncoder and java.net.URLDecoder.
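One way the plus sign can still go missing, even though the round trip above works, is when the value is decoded twice, for example once by the web container and once more by hand. Whether that is what happened in the question is an assumption; the sketch below only demonstrates the effect. The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative helper; the class and method names are not from the original post.
final class RoundTripDemo {

    // Encode with an explicit charset instead of the platform default.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Decode with the same charset that was used for encoding.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "123+>jo";
        String encoded = encode(original);
        String decodedOnce = decode(encoded);
        String decodedTwice = decode(decodedOnce); // second decode treats '+' as a space

        System.out.println(encoded);      // 123%2B%3Ejo
        System.out.println(decodedOnce);  // 123+>jo
        System.out.println(decodedTwice); // 123 >jo
    }
}
```

If the receiving side already hands you a decoded parameter, calling URLDecoder.decode on it again is what turns the literal + into a space.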

Related

How to handle URL-encoded characters in Jsoup

How do I handle URL-encoded characters like the colon (%3A) in the Jsoup connect function?
What you could basically do is encode the URL before you use it in Jsoup.
I believe what you are trying to do here is pass some parameters to the host in the URL itself.
To encode the URL, use the code below:
String url = "https://google.com?q=i wish to search something";
String encodeURL=URLEncoder.encode( url, "UTF8" );
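Note that URLEncoder.encode applied to a complete URL also escapes the scheme and path separators (:// becomes %3A%2F%2F), so the result is no longer usable as an address. A minimal sketch that encodes only the query value instead; the class and method names are hypothetical, and the base URL is simply the one from the snippet above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative helper; names are assumptions, not part of the original answer.
final class QueryEncodingDemo {

    // Appends a form-encoded value to a base URL that already ends in "name=".
    static String withQuery(String base, String value) {
        try {
            return base + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String url = withQuery("https://google.com?q=", "i wish to search something");
        System.out.println(url); // https://google.com?q=i+wish+to+search+something
    }
}
```

The scheme, host and the ?q= part stay untouched, and only the user-supplied value is escaped.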
Here's the answer to your comment:
package com.abk;
import java.io.IOException;
import java.net.URLDecoder;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
public class JsoupTest{
public static void main( String[] args ) throws IOException{
Document doc = Jsoup.connect(URLDecoder.decode("https://siccode.com/en/business-list/sic%3A2211%22","UTF8")).get();
String title = doc.title();
System.out.println("title is: " + title);
}
}
This should work like a charm :)
Use
String decodedString1 = URLDecoder.decode("siccode.com/en/business-list/sic%3A2211", "UTF-8");
Because the string is URL-encoded, you need to decode it before using it.
A sample in JavaScript:
var str = decodeURIComponent("siccode.com/en/business-list/sic%3A2211");
console.log(str);

Apache commons CSV: quoted input doesn't work

import java.io.IOException;
import java.io.StringReader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
I am trying to parse a simple CSV file with the Apache Commons CSV parser. It works fine as long as I don't use quotes. When I add a quoted value to the input, such as
"a";42
it gives me the error:
invalid char between encapsulated token and delimiter
Here is a simple, complete code:
public class Test {
public static void main(String[] args) throws IOException {
String DATA = "\"a\";12";
CSVParser csvParser =
CSVFormat.EXCEL
.withIgnoreEmptyLines()
.withIgnoreHeaderCase()
.withRecordSeparator('\n').withQuote('"')
.withEscape('\\').withRecordSeparator(';').withTrim()
.parse(new StringReader(DATA));
}
}
I simply can't find out what I've missed in the code.
The problem was so trivial I missed it.
I used withRecordSeparator instead of withDelimiter to set the field separator.
This works as I expected:
public class Test {
public static void main(String[] args) throws IOException {
String DATA = "\"a\";12";
CSVParser csvParser =
CSVFormat.EXCEL
.withIgnoreEmptyLines()
.withIgnoreHeaderCase()
.withRecordSeparator('\n').withQuote('"')
.withEscape('\\').withDelimiter(';').withTrim()
.parse(new StringReader(DATA));
}
}

WebDriver getCurrentUrl() returning malformed URI

I'm involved in writing a (Java/Groovy) browser-automation app with Selenium 2 and the Firefox driver.
Currently there is an issue with some URLs we find in the wild that are apparently using bad URI syntax. (specifically curly braces ({}), |'s and ^'s).
String url = driver.getCurrentUrl(); // http://example.com/foo?key=val|with^bad{char}acters
When trying to construct a java.net.URI from the string returned by driver.getCurrentUrl() a URISyntaxException is thrown.
new URI(url); // java.net.URISyntaxException: Illegal character in query at index ...
Encoding the whole URL before constructing the URI will not work (as I understand it).
The whole URL gets encoded, which doesn't preserve any pieces of it that I can parse in any normal fashion. For example, with this URI-safe string, URI can't tell the difference between a & used as the query-string parameter delimiter and %26 (its encoded value) inside the content of a single parameter.
String encoded = URLEncoder.encode(url, "UTF-8") // http%3A%2F%2Fexample.com%2Ffoo%3Fkey%3Dval%7Cwith%5Ebad%7Bchar%7Dacters
URI uri = new URI(encoded)
URLEncodedUtils.parse(uri, "UTF-8") // []
Currently the solution is, before constructing the URI, running the following (groovy) code:
["|", "^", "{", "}"].each {
url = url.replace(it, URLEncoder.encode(it, "UTF-8"))
}
But this seems dirty and wrong.
I guess my question is multi-part:
Why does FirefoxDriver return a String rather than a URI?
Why is this String malformed?
What is best practice for dealing with this kind of thing?
We can encode only the query string parameters, as discussed in the comments; that should work.
Another way is to use the galimatias library:
import io.mola.galimatias.GalimatiasParseException;
import io.mola.galimatias.URL;
import java.net.URI;
import java.net.URISyntaxException;
public class Main {
public static void main(String[] args) throws URISyntaxException {
String example1 = "http://example.com/foo?key=val-with-a-|-in-it";
String example2 = "http://example.com?foo={bar}";
try {
URL url1 = URL.parse(example1);
URI uri1 = url1.toJavaURI();
System.out.println(url1);
System.out.println(uri1);
URL url2 = URL.parse(example2);
URI uri2 = url2.toJavaURI();
System.out.println(url2);
System.out.println(uri2);
} catch (GalimatiasParseException ex) {
// Do something with non-recoverable parsing error
}
}
}
Output:
http://example.com/foo?key=val-with-a-|-in-it
http://example.com/foo?key=val-with-a-%7C-in-it
http://example.com/?foo={bar}
http://example.com/?foo=%7Bbar%7D
driver.getCurrentUrl() gets a string from the browser, and before turning it into a URL you should URL-encode the string.
See "Java URL encoding of query string parameters" for an example of this in Java.
Will this work for you?
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
public class Sample {
public static void main(String[] args) throws UnsupportedEncodingException {
String urlInString="http://example.com/foo?key=val-with-a-{-in-it";
String encodedURL=URLEncoder.encode(urlInString, "UTF-8");
URI encodedURI=URI.create(encodedURL);
System.out.println("Actual URL:"+urlInString);
System.out.println("Encoded URL:"+encodedURL);
System.out.println("Encoded URI:"+encodedURI);
}
}
Output:
Actual URL:http://example.com/foo?key=val-with-a-{-in-it
Encoded URL:http%3A%2F%2Fexample.com%2Ffoo%3Fkey%3Dval-with-a-%7B-in-it
Encoded URI:http%3A%2F%2Fexample.com%2Ffoo%3Fkey%3Dval-with-a-%7B-in-it
Another solution is to split the fetched URL into its parts and then build the URL you want from them. The multi-argument URI constructor quotes illegal characters in each part, and you still get all the features of the URL class.
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
public class Sample {
public static void main(String[] args) throws UnsupportedEncodingException,
URISyntaxException, MalformedURLException {
String uri1 = "http://example.com/foo?key=val-with-a-{-in-it";
String scheme=uri1.split(":")[0];
String authority=uri1.split("//")[1].split("/")[0];
String path=uri1.split("//")[1].split("/")[1].split("\\?")[0];
String query=uri1.split("\\?")[1];
URI uri = new URI(scheme, authority, "/"+path, query, null);
URL url = uri.toURL();
System.out.println("URI's Query:"+uri.getQuery());
System.out.println("URL's Query:"+url.getQuery());
}
}
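The chained split calls above break on URLs without a path or without a query. A somewhat more tolerant sketch of the same idea uses the component-splitting regular expression from RFC 3986 (Appendix B) and hands the parts to the five-argument java.net.URI constructor, which quotes characters that are illegal in each component. The class name LenientUri and its parse method are hypothetical names chosen for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative helper; LenientUri and parse are assumed names, not an existing API.
final class LenientUri {

    // Component-splitting regex from RFC 3986, Appendix B:
    // group 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    private static final Pattern PARTS = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    static URI parse(String raw) {
        Matcher m = PARTS.matcher(raw);
        if (!m.matches()) {
            throw new IllegalArgumentException("Cannot split: " + raw);
        }
        try {
            // The multi-argument constructor percent-encodes illegal characters per component.
            return new URI(m.group(2), m.group(4), m.group(5), m.group(7), m.group(9));
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = parse("http://example.com/foo?key=val|with^bad{char}acters");
        System.out.println(uri);            // http://example.com/foo?key=val%7Cwith%5Ebad%7Bchar%7Dacters
        System.out.println(uri.getQuery()); // key=val|with^bad{char}acters
    }
}
```

One caveat: these constructors always quote the percent sign, so a URL that is already partly percent-encoded would come out double-encoded. This sketch assumes the string from the browser is not encoded yet.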

Find the second duplicate word

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
public class Example {
public static void main(String[] args) throws IOException
{
File fis=new File("D:/Testcode/Test.txt");
BufferedReader br;
String input;
String var = null;
if(fis.isAbsolute())
{
br=new BufferedReader(new FileReader(fis.getAbsolutePath()));
while ((input=br.readLine())!=null) {
var=input;
}
}
//String var="Duminy to Warner, OUT, Duminy gets a wicket again. He has been breaking...
if(var!=null)
{
String splitstr[]=var.split(",");
if(splitstr[0].contains("to"))
{
String ss=splitstr[0];
String a[]=ss.split("\\s+");
int value=splitstr[0].indexOf("to");
System.out.println("Subject:"+splitstr[0].substring(0,value));
System.out.println("Object:"+splitstr[0].substring(value+2));
System.out.println("Event:"+splitstr[1]);
int count=var.indexOf(splitstr[2]);
System.out.println("Narrated Information:"+var.substring(count));
}
}
}
}
The above program shows the following output:
Subject:Duminy
Object: Warner
Event: OUT
Narrated Information: Duminy gets a wicket again. He has been breaking....
My question is: the text may contain a name that itself includes "to", for example "Dumto to Warner, OUT, Duminy gets a wicket again. He has been breaking...". In that case the program does not produce output like the above. How can I match the word "to" only when it is surrounded by spaces?
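Instead of:
if(splitstr[0].contains("to"))
Change it to:
if(splitstr[0].contains(" to "))
The indexOf("to") call needs the same change, with the offset in substring(value+2) adjusted to value+4, otherwise the subject is still cut at the first "to".
It should then work fine IMO.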
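As a sketch of the same idea without index arithmetic, the first comma-separated field can be split on the word "to" surrounded by whitespace. The class name CommentaryParser and the returned array layout are assumptions made for this example, not part of the question's code.

```java
// Illustrative helper; the class name and the return layout are assumptions.
final class CommentaryParser {

    // Returns {subject, object, event, narration} for a line such as
    // "Duminy to Warner, OUT, Duminy gets a wicket again."
    static String[] parse(String line) {
        // Limit 3 keeps any further commas inside the narration.
        String[] parts = line.split(",", 3);
        if (parts.length < 3) {
            throw new IllegalArgumentException("Expected three comma-separated fields: " + line);
        }
        // Split on "to" only when it stands alone between whitespace.
        String[] names = parts[0].trim().split("\\s+to\\s+", 2);
        if (names.length < 2) {
            throw new IllegalArgumentException("No ' to ' separator in: " + parts[0]);
        }
        return new String[] { names[0], names[1], parts[1].trim(), parts[2].trim() };
    }

    public static void main(String[] args) {
        String[] r = parse("Dumto to Warner, OUT, Duminy gets a wicket again.");
        System.out.println("Subject:" + r[0]);              // Subject:Dumto
        System.out.println("Object:" + r[1]);               // Object:Warner
        System.out.println("Event:" + r[2]);                // Event:OUT
        System.out.println("Narrated Information:" + r[3]); // Narrated Information:Duminy gets a wicket again.
    }
}
```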

Regarding encoded strings over HTTP (Apache)

I have a question regarding UTF-8 encoding when sending strings containing special characters using HttpServiceClient (Apache).
I have the small piece of code below, where the method takes a string and sends it via HTTP (the sending part is not complete in the code).
Although decoding the string seems to work without problems, I would like to know whether method.addParameter or httpClient.execute(method) encodes the string again. Our problem is that on the client side the strings seem to be doubly encoded!
e.g. strReq = äöü
import java.io.IOException;
import java.util.concurrent.CancellationException;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.EncoderException;
import org.apache.commons.codec.net.URLCodec;
import org.apache.commons.httpclient.methods.PostMethod;
public class Demo {
public static void Test(String strReq) throws CancellationException, IOException, DecoderException {
PostMethod method = null;
method = new PostMethod("www.example.com");
// Encode the XML document.
URLCodec codec = new URLCodec();
String requestEncoded = new String(strReq);
try {
requestEncoded = codec.encode(strReq);
} catch (EncoderException e) {
}
System.out.println("encoded req = "+requestEncoded);
method.addParameter(Constants.Hdr, requestEncoded);
String str2 = codec.decode(requestEncoded);
System.out.println("str2 ="+str2);
}
}
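To see what a doubly encoded value looks like, here is a small sketch using java.net.URLEncoder from the standard library in place of URLCodec, so that it runs without extra dependencies. It assumes, without confirming it from the library documentation, that the HTTP client form-encodes parameter values itself when it builds the request body; if so, encoding by hand beforehand yields the second result. The class name is made up for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative only; stands in for URLCodec plus a client that encodes parameters again.
final class DoubleEncodingDemo {

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // The three umlauts from the question, written as Unicode escapes.
        String original = "\u00e4\u00f6\u00fc";
        String once = encode(original); // what a single encoding pass produces
        String twice = encode(once);    // what arrives if the value is encoded again

        System.out.println(once);  // %C3%A4%C3%B6%C3%BC
        System.out.println(twice); // %25C3%25A4%25C3%25B6%25C3%25BC
    }
}
```

If the receiver sees %25C3%25A4 style sequences, each % of the first pass has been escaped again; under the assumption above, passing the raw string to addParameter and leaving the encoding to the client would avoid that.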
