Issue in MetaKeyword, MetaDescription information in JSP using Java

<meta name="description" content="${metaDescription}" />
When the user is in the French culture and I view the page source, I get:
<meta name="description" content="Trouvez des pneus fiables et s�curitaires pour votre auto, VUS ou camionnette. Canadian Tire offre un grand choix de pneus d'hiver, toute saison et performants"/>
In place of ?, it should be é.
I tried to put the equivalent UTF-8 code for é, but I got that same code back in the page source.
Does anyone know what I've done wrong?

This normally indicates that you are looking at a UTF-8 encoded document using ASCII decoding. You might be missing the correct content type definition in your HTML file. Try adding
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
to the <head> in the HTML document.
Hope that helps.
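To see how a charset mismatch produces this kind of symptom, here is a minimal Java sketch. The class and method names are made up for illustration and are not part of any library. It encodes a string with one charset and decodes the bytes with another, in both directions: UTF-8 bytes read as ISO-8859-1 turn é into two characters, while ISO-8859-1 bytes read as UTF-8 turn é into the replacement character U+FFFD.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class MojibakeDemo {

    // Encodes text with one charset, then decodes the same bytes with another.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        String word = "s\u00e9curitaires"; // "sécuritaires"

        // UTF-8 bytes read as ISO-8859-1: é (0xC3 0xA9) becomes two characters.
        System.out.println(
            reinterpret(word, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));

        // ISO-8859-1 bytes read as UTF-8: the lone byte 0xE9 is malformed UTF-8,
        // so the decoder substitutes U+FFFD for it.
        System.out.println(
            reinterpret(word, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

Whichever direction applies, the cure is the same: the charset used to write the bytes and the charset declared to the reader have to agree.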

You need to set the JSP page encoding to the desired charset. Add the following to the top:
<%@ page pageEncoding="UTF-8" %>
This will do two things:
It tells the server that it should treat the characters in JSP as UTF-8 by response.setCharacterEncoding("UTF-8").
It tells the browser that it should interpret the characters from the server as UTF-8 by response.setContentType("text/html;charset=UTF-8").
See also:
Unicode - How to get the characters right?

Related

Saving Chinese characters using Java HtmlEditorKit

I'm trying to save an HtmlDocument (saved with UTF-8 encoding) that contains the Chinese character 𠜎 using HtmlEditorKit in the following way:
try (OutputStreamWriter f = new OutputStreamWriter(fileOutputStream, "UTF-8")) {
    htmlEditorKit.write(f, htmlDocument, 0, htmlDocument.getLength());
} catch (BadLocationException e) {
    logger.error("Could not save", e);
}
In the output HTML document I'm getting two 2-byte characters (&#55361;&#57102;) instead of one 4-byte character. Java can work out which symbol is meant by combining the two, but HTML can't.
Any suggestions on how to save it so that the HTML page displays correctly?
Here is the output HTML:
<html>
<head>
<meta content="text/html" charset="utf-8">
</head>
<body>
<p>𠜎</p>
</body>
</html>
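One possible workaround, sketched here under the assumption that the writer emits each half of a surrogate pair as its own decimal entity (as in &#55361;&#57102; above): write the document to a StringWriter first, merge each high/low surrogate entity pair into a single code-point entity, and only then write the result to the file. The class and method names below are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing helper, for illustration only.
final class SurrogateEntityFixer {

    // Two adjacent 5-digit decimal entities; all surrogates (55296..57343) have 5 digits.
    private static final Pattern PAIR = Pattern.compile("&#(\\d{5});&#(\\d{5});");

    static String mergeSurrogateEntities(String html) {
        StringBuilder out = new StringBuilder();
        Matcher m = PAIR.matcher(html);
        int pos = 0;
        while (m.find(pos)) {
            int hi = Integer.parseInt(m.group(1));
            int lo = Integer.parseInt(m.group(2));
            boolean isPair = hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
            if (isPair) {
                // Copy the text before the pair, then emit one entity for the code point.
                out.append(html, pos, m.start());
                out.append("&#").append(Character.toCodePoint((char) hi, (char) lo)).append(';');
                pos = m.end();
            } else {
                // Not a surrogate pair: keep the '&' and retry from the next character,
                // so a real pair starting at the second entity is not skipped.
                out.append(html, pos, m.start() + 1);
                pos = m.start() + 1;
            }
        }
        out.append(html, pos, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mergeSurrogateEntities("<p>&#55361;&#57102;</p>"));
    }
}
```

For the character in the question, 55361 and 57102 combine to code point 132878 (U+2070E), which HTML accepts as the single reference &#132878;.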

Chinese character gets scrambled when going from JSP to server in Java

I have already set the following in my JSP:
<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
But after sending the text with
xmlHttp.setRequestHeader("SEARCH_TEXT", srctxt);
or
passing it as a parameter in the AJAX URL,
I still get the Chinese words as scrambled letters or '????' marks.
I need some insight into this. Please help.
@Mena, after your comment I looked at encodeURIComponent. Once I encoded the Chinese string on the client and decoded it in my server-side code, the problem was resolved. Thanks. Pasting the code for reference.
Client-side code:
xmlHttp.setRequestHeader("SEARCH_TEXT", encodeURIComponent(srctxt));
Server-side code:
CommonUtils.decodedStringValue(request.getHeader("SEARCH_TEXT"));
Hope this helps.
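The implementation of CommonUtils.decodedStringValue is not shown above. As a rough sketch of what such a helper might do, assuming the client used encodeURIComponent (which percent-encodes the UTF-8 bytes of the string), the standard java.net.URLDecoder can reverse it. The class and method names here are hypothetical stand-ins.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical stand-in for the poster's CommonUtils helper.
final class HeaderDecoding {

    // Reverses percent-encoding such as that produced by encodeURIComponent.
    // Throws IllegalArgumentException if the value contains a malformed % sequence.
    static String decodeHeaderValue(String raw) {
        if (raw == null) {
            return null; // header was not sent
        }
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // %E4%B8%AD%E6%96%87 is the percent-encoded UTF-8 form of two Chinese characters.
        String decoded = decodeHeaderValue("%E4%B8%AD%E6%96%87");
        System.out.println(decoded.length()); // number of decoded characters
    }
}
```

URLDecoder also turns a literal + into a space, which is harmless here because encodeURIComponent never emits a bare + (it encodes it as %2B).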

JSP data to be downloaded to Excel sheet using ActiveQuery results in character problems

Downloading data with an Active query from a JSP page with some parameters leads to character problems. Special characters in the German language, for example ö, ä and ß, are not printed correctly.
Debugging the JSP page in Java shows that the result returned by the JSP page is correct. So the problem seems to be due to a conversion within Excel after the download, most probably because of an unsupported charset.
I tried converting the result string in the JSP to different charsets, but the problem persists.
Does anyone know a solution?
Thank you very much in advance!
Did you try setting the encoding of the page?
<%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF8" %>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
...
If you can't find a solution on the Microsoft side, I'd recommend this alternative here:
http://poi.apache.org/

Java check - charset, encoding of html page - like browsers do

How can I check the real charset (encoding) of an HTML page?
For example, the declared charset of a page is iso-8859-1, but the content is actually written in UTF-8:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
...
here is content with utf8
...
</html>
Is it possible to check the charset and encoding of an HTML page with Java,
the way browsers do?
Thank you!
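One heuristic that can be sketched with the standard library alone: try to decode the raw bytes strictly as UTF-8 and see whether that fails. Bytes written in ISO-8859-1 that contain accented characters are almost never valid UTF-8, so a clean decode is a strong hint that the content really is UTF-8 regardless of the declared charset. This is only one signal; browsers combine several (HTTP header, byte order mark, meta tag, further heuristics), and the names below are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class CharsetCheck {

    // True if the bytes decode in the given charset without any malformed
    // or unmappable sequences (strict mode: report errors, do not replace).
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Prefers UTF-8 when the bytes are valid UTF-8, else falls back to the declared charset.
    static Charset guess(byte[] bytes, Charset declared) {
        return decodesCleanly(bytes, StandardCharsets.UTF_8) ? StandardCharsets.UTF_8 : declared;
    }

    public static void main(String[] args) {
        byte[] utf8Page = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(guess(utf8Page, StandardCharsets.ISO_8859_1));
    }
}
```

Note that the check is only useful in this direction: ISO-8859-1 maps every byte to a character, so testing whether bytes decode cleanly as ISO-8859-1 tells you nothing. Pure ASCII content also passes the UTF-8 test, which is harmless because ASCII is encoded identically in both charsets.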

Dividing an one-line HTML file to well-formed HTML file

I have an HTML file in which all tags are in one line. I would like to separate each tag and put it on its own line. The end goal is to have a well-formed HTML file.
e.g.
<html><head><title>StackOverflow</title></head><body></body></html>
would be converted into:
<html>
<head>
<title>
StackOverflow
</title>
</head>
<body>
</body>
</html>
Is there an existing Java library that handles this already?
Your problem has nothing to do with well-formed HTML. HTML tags sitting on the same line does not mean that the HTML is not well formed.
What you actually need is just a formatter, which will make your HTML more human-readable.
You could take a look at JTidy, which can optionally also do syntax checking.
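For simple input like the example in the question, a naive splitter using only the standard library may be enough. The sketch below is not JTidy and not a real HTML parser: it assumes no > characters inside attribute values, and it will mangle scripts, comments and whitespace-sensitive elements such as pre. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not a real HTML parser.
final class NaiveHtmlSplitter {

    // A token is either a tag (<...>) or a run of text between tags.
    private static final Pattern TOKEN = Pattern.compile("<[^>]+>|[^<]+");

    // Puts every tag and every text run on its own line, dropping blank tokens.
    static String onePerLine(String html) {
        List<String> lines = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String token = m.group().trim();
            if (!token.isEmpty()) {
                lines.add(token);
            }
        }
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        System.out.println(onePerLine(
            "<html><head><title>StackOverflow</title></head><body></body></html>"));
    }
}
```

For anything beyond toy input, a proper formatter or parser library is the safer choice.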
