How to escape the NÃO while reading the URL - java

URL url = new URL("://www.mydomain.com/?param1=NÃO&param2=NÃO");
HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
InputStream in = urlConnection.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder result = new StringBuilder();
String line;
while((line = reader.readLine()) != null) {
result.append(line);
}
System.out.println(result.toString());
But when I put the result into the StringBuilder, the special character Ã in NÃO gets escaped.
How can I read it without losing the character set?

I believe you want to use URLEncoder.encode(String, String) to encode your parameter, like this:
try {
String value = URLEncoder.encode("NÃO", "utf-8");
String url = "://www.mydomain.com/?param1=" + value + "&param2="
+ value;
System.out.println(url);
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
}
Output is
://www.mydomain.com/?param1=N%C3%83O&param2=N%C3%83O
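As a rough check of the answer above, the sketch below percent-encodes the value as UTF-8 and decodes it again to show that nothing is lost in the round trip. The class name QueryEncodingSketch and its two helper methods are hypothetical, introduced only for illustration; the string literal uses a Unicode escape so the result does not depend on the source file's encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper class, not part of the original question or answer.
class QueryEncodingSketch {

    // Percent-encodes a single query value using UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(): turns the percent-encoded value back into text.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "N\u00C3O";             // "NÃO" written with a Unicode escape
        String encoded = encode(original);
        System.out.println(encoded);              // prints N%C3%83O
        System.out.println(decode(encoded).equals(original)); // prints true
    }
}
```

Only the parameter values are encoded here, not the whole URL, so the ? and & separators stay intact.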

Related

Dealing with Korean text that comes out broken (like ???)

I'm using an API to fetch XML.
English text and numbers come through fine, but Korean text does not.
This is my code:
StringBuffer result = new StringBuffer();
try {
String urlstr = "https://openapi.gg.go.kr/OrganicAnimalProtectionFacilit?" +
"KEY=secret" +
"&Type=xml" +
"&pIndex=1"+
"&pSize=100";
URL url = new URL(urlstr);
HttpURLConnection urlconnection = (HttpURLConnection) url.openConnection();
urlconnection.setRequestMethod("GET");
BufferedReader br = new BufferedReader(new InputStreamReader(urlconnection.getInputStream(), StandardCharsets.UTF_8 ));
String returnLine;
result.append("<xmp>");
while((returnLine = br.readLine())!=null) {
result.append(returnLine+"\n");
}
urlconnection.disconnect();
}catch(Exception e) {
e.printStackTrace();
}
return result+"</xmp>";

Decode google translate API response in JAVA

I need to write a small tool in JAVA which will translate text from English to French using the Google translate API. Everything works but I have an apostrophe decoding problem.
Original text:
Inherit Tax Rate
Text translated with Google translate API:
Taux d' imposition hérité
How it should be:
Taux d'imposition hérité
This is my translate method (sorry for its length):
private String translate(String text, String from, String to) {
StringBuilder result = new StringBuilder();
try {
String encodedText = URLEncoder.encode(text, "UTF-8");
String urlStr = "https://www.googleapis.com/language/translate/v2?key=" + sKey + "&q=" + encodedText + "&target=" + to + "&source=" + from;
URL url = new URL(urlStr);
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
InputStream googleStream;
if (conn.getResponseCode() == 200) {
googleStream = conn.getInputStream(); //success
} else
googleStream = conn.getErrorStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(googleStream));
String line;
while ((line = reader.readLine()) != null) {
result.append(line);
}
JsonParser parser = new JsonParser();
JsonElement element = parser.parse(result.toString());
if (element.isJsonObject()) {
JsonObject obj = element.getAsJsonObject();
if (obj.get("error") == null) {
String translatedText = obj.get("data").getAsJsonObject().
get("translations").getAsJsonArray().
get(0).getAsJsonObject().
get("translatedText").getAsString();
return translatedText;
}
}
if (conn.getResponseCode() != 200) {
System.err.println(result);
}
} catch (IOException | JsonSyntaxException ex) {
System.err.println(ex.getMessage());
}
return null;
}
I'm using an XML writer to write the text, and at first I thought the writer was the problem. But I observed that the text already arrives like this in the stream, so I added the encoding parameter when I initialise the InputStreamReader:
BufferedReader reader = new BufferedReader(new InputStreamReader(googleStream, "UTF-8"));
But I receive the string with the same problem. Any ideas about what I can do?
I think this problem is solved by using the format parameter (docs). It defaults to html, but you can change it to text to receive unencoded data. Your request should look like this:
String urlStr = "https://www.googleapis.com/language/translate/v2?key=" + sKey + "&q=" + encodedText + "&target=" + to + "&source=" + from + "&format=text";
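One way to keep the request readable is to build the URL in a small helper, as in the sketch below. It uses only the endpoint and parameter names that appear in the question and the answer; the class name TranslateUrlSketch and the method buildUrl are hypothetical, and the code only constructs the URL string without calling the service.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, shown only to illustrate where format=text goes.
class TranslateUrlSketch {

    // Builds the request URL with every value percent-encoded and format=text appended.
    static String buildUrl(String key, String text, String from, String to) {
        return "https://www.googleapis.com/language/translate/v2"
                + "?key=" + enc(key)
                + "&q=" + enc(text)
                + "&target=" + enc(to)
                + "&source=" + enc(from)
                + "&format=text"; // ask for plain text instead of the default html
    }

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "KEY" is a placeholder, not a real API key.
        System.out.println(buildUrl("KEY", "Inherit Tax Rate", "en", "fr"));
    }
}
```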

java servlet http url request

When I run this code from Java app I get correct response (UTF-8 encoded).
The problem is, when I run it from my servlet, I'm getting:
"פשטות הי� התחכו� המושל�"
ל×�×•× ×¨×“×• די סר פיירו דה ×•×™× ×¦'×™
Any idea how to fix it?
URL url;
HttpURLConnection conn;
BufferedReader rd;
String line;
String result = "";
try {
url=new URL("http://www.walla.co.il");
conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
StringBuffer sb = new StringBuffer("");
String s1="";
String NL = System.getProperty("line.separator");
while ((s1 = rd.readLine()) != null)
sb.append(s1+NL);
System.out.println(sb);
rd.close();
return sb.toString();
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
return "";
set "JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF8"
I run this from a *.bat file in my tomcat\bin directory and it fixes the problem.
It seems I had to set the default encoding for Tomcat's JVM.
I'm not 100% sure why, but it works now :)
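A likely explanation is that new InputStreamReader(stream) without a charset argument decodes with the JVM's default charset, which is what -Dfile.encoding changes. The sketch below, with the hypothetical class DefaultCharsetSketch, decodes the same UTF-8 bytes twice: once as UTF-8 and once as ISO-8859-1, which stands in for a non-UTF-8 platform default. Passing the charset explicitly avoids depending on the JVM setting at all.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; ISO-8859-1 stands in for a non-UTF-8 default charset.
class DefaultCharsetSketch {

    // Reads the whole stream as text using the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u05E9\u05DC\u05D5\u05DD"; // four Hebrew letters
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String right = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println(right.equals(original)); // prints true
        System.out.println(wrong.equals(original)); // prints false
    }
}
```

In the servlet code above, the equivalent change would be new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8), assuming the site really serves UTF-8.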

UTF-8 response with servlet

I am reading HTTP response from a Perl page in a Servlet like this:
public String getHTML(String urlToRead) {
URL url;
HttpURLConnection conn;
BufferedReader rd;
String line;
String result = "";
try {
url = new URL(urlToRead);
conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Accept-Charset", "UTF-8");
conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
rd = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes();
result += new String(b, "UTF-8");
}
rd.close();
} catch (Exception e) {
e.printStackTrace();
}
return result;
}
I am displaying this result with this code:
response.setContentType("text/plain; charset=UTF-8");
PrintWriter out = new PrintWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-8"), true);
try {
String query = request.getParameter("query");
String type = request.getParameter("type");
String res = getHTML(url);
out.write(res);
} finally {
out.close();
}
But the response still is not encoded as UTF-8. What am I doing wrong?
Thanks in advance.
That call to line.getBytes() looks suspicious. You should probably make it line.getBytes("UTF-8") if you are certain that what is returned is UTF-8 encoded. Additionally, I'm not sure why it is even necessary. A typical approach to getting data out of a BufferedReader is to use a StringBuilder to continue appending each String retrieved from readLine into a result. The conversion back and forth between String and byte[] is unnecessary.
Change result into a StringBuilder and do this:
while ((line = rd.readLine()) != null) {
result.append(line);
}
Here is where you break the chain of character encoding conversions:
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes(); // NOT UTF-8
result += new String(b, "UTF-8");
}
From String#getBytes() javadoc:
Encodes this String into a sequence of bytes using the platform's
default charset, storing the result into a new byte array
And the default charset is probably not UTF-8.
But why do all the conversions in the first place? Just read the raw bytes from the source and write the raw bytes to the consumer. It's supposed to be UTF-8 all the way.
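To make both points concrete, here is a small sketch under stated assumptions. The class RoundTripSketch is hypothetical: roundTrip mimics the question's loop body with the platform default charset passed in as a parameter, and copy is one plain way to forward raw bytes without decoding them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not code from the question.
class RoundTripSketch {

    // Mimics: byte[] b = line.getBytes(); new String(b, "UTF-8");
    // platformDefault plays the role of the JVM's default charset.
    static String roundTrip(String line, Charset platformDefault) {
        byte[] b = line.getBytes(platformDefault);
        return new String(b, StandardCharsets.UTF_8);
    }

    // Copies bytes unchanged from source to consumer; no charset is involved.
    static void copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        int n;
        try {
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = "h\u00E9llo"; // "héllo"
        // Harmless when the default is UTF-8, lossy when it is not.
        System.out.println(roundTrip(line, StandardCharsets.UTF_8).equals(line));      // prints true
        System.out.println(roundTrip(line, StandardCharsets.ISO_8859_1).equals(line)); // prints false

        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(payload), sink);
        System.out.println(sink.size() == payload.length); // prints true
    }
}
```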
I faced the same problem in a different scenario. I believe this will work:
byte[] b = line.getBytes(UTF8_CHARSET);
Use it in place of the getBytes() call in the while loop:
while ((line = rd.readLine()) != null) {
byte [] b = line.getBytes(); // NOT UTF-8
result += new String(b, "UTF-8");
}
In my case, I had to add another configuration.
Previously, I was writing the page this way:
try (PrintStream printStream = new PrintStream(response.getOutputStream())) {
printStream.print(pageInjecting);
}
I changed to:
try (PrintStream printStream = new PrintStream(response.getOutputStream(), false, "UTF-8")) {
printStream.print(pageInjecting);
}
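The effect of the third constructor argument can be checked without a servlet container. In the sketch below a ByteArrayOutputStream stands in for response.getOutputStream(); the class name PrintStreamEncodingSketch and its print method are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class; the byte array sink replaces the servlet output stream.
class PrintStreamEncodingSketch {

    // Prints text through a PrintStream that encodes with the named charset
    // and returns the bytes that reached the underlying stream.
    static byte[] print(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream printStream = new PrintStream(sink, false, charsetName)) {
            printStream.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] utf8 = print("\u00E9", "UTF-8");        // "é" becomes bytes C3 A9
        byte[] latin1 = print("\u00E9", "ISO-8859-1"); // "é" becomes the single byte E9
        System.out.println(utf8.length);   // prints 2
        System.out.println(latin1.length); // prints 1
    }
}
```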

Method to download website source returns nothing

I created a method to download the source of any URL and show it in a TextView called checkView, but when I call it from a button it gives me an empty TextView instead of a string with the website's code:
void getWebsite(String search) {
String res = null;
try {
StringBuffer sb = new StringBuffer("");
String line = "";
URL url = new URL("http://drinkify.org" + search);
URLConnection conn = url.openConnection();
BufferedReader rd = new BufferedReader(new InputStreamReader(
conn.getInputStream()));
String NL = System.getProperty("line.separator");
while ((line = rd.readLine()) != null) {
sb.append(line + NL);
res = sb.toString();
}
} catch (Exception e) {
}
checkView.setText(res);
}
Any thoughts?
First of all, add a log statement or a breakpoint to see whether the text is actually downloaded.
My guess is that an exception is thrown (missing INTERNET permission in the manifest?) and swallowed by your catch (Exception e). Add a breakpoint inside the catch clause to check.
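A minimal sketch of the same idea in plain Java, assuming nothing about the Android side: instead of an empty catch block, the failure is turned into visible text. The class FetchSketch and the "ERROR: " prefix are hypothetical choices for illustration; in the app the message could equally go to a log.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing a catch block that reports instead of swallowing.
class FetchSketch {

    // Returns the page source, or a message describing why the download failed.
    static String fetch(String address) {
        StringBuilder sb = new StringBuilder();
        String newline = System.getProperty("line.separator");
        try {
            URL url = new URL(address);
            try (BufferedReader rd = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = rd.readLine()) != null) {
                    sb.append(line).append(newline);
                }
            }
            return sb.toString();
        } catch (Exception e) {
            // Surface the failure so an empty result is never silent.
            return "ERROR: " + e;
        }
    }

    public static void main(String[] args) {
        // A string with no protocol fails in new URL(...) before any network access.
        System.out.println(fetch("not a url"));
    }
}
```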
