I am developing a Facebook application using Flex's XMLSocket and Java.
When I type the 'ş' character on the client side, it prints correctly; however, when I send the 'ş' character,
it is printed as ??? or other unpredictable characters.
I tried changing my HTML file's meta tag to
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
but it did not work.
How can I get rid of this problem?
Thanks.
Use encodeURIComponent(yourstring); this might do the trick.
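On the Java side of the socket, the percent-encoding produced by encodeURIComponent can be undone with URLDecoder — a minimal sketch, assuming the client sends the encoded string as-is:

```java
import java.net.URLDecoder;

public class DecodeDemo {
    public static void main(String[] args) throws Exception {
        // encodeURIComponent("ş") on the client yields "%C5%9F",
        // because it always percent-encodes the UTF-8 bytes.
        String received = "%C5%9F";
        String decoded = URLDecoder.decode(received, "UTF-8");
        System.out.println(decoded); // prints: ş
    }
}
```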
I have mathematical symbols, e.g. alpha, beta, mu. When I copy these symbols from a Word document into a text area, they are copied correctly. When I insert them into the database using a prepared statement, the symbols are stored as entity codes; for example, beta is stored as &#946. This is fine, I guess. But when I retrieve them from the database using java.sql.Statement and display them in the HTML page, they are displayed as code instead of the symbol; that is, "&#946" appears in the HTML instead of the beta symbol. How do I deal with this situation? How can I store symbols and display them properly in HTML?
I am using a MySQL database, Java 1.7, Struts 2.0 and Tomcat 7.
The correct form of the HTML character reference is &#946; (which looks like: β). You need to add a semicolon.
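A quick way to see which numeric reference a symbol should produce (toEntity is a hypothetical helper, not from any answer here):

```java
public class EntityDemo {
    // Builds a decimal HTML numeric character reference,
    // including the trailing semicolon, for a code point.
    static String toEntity(int codePoint) {
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        System.out.println(toEntity('\u03B2')); // prints: &#946;  (beta)
        System.out.println(toEntity('\u03B1')); // prints: &#945;  (alpha)
    }
}
```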
1) How are you displaying the codes in HTML?
2) What is the character encoding of the machine you are running your server on / viewing your HTML on?
I had the following code and it worked:
<html>
<body>
This is alpha α<br/>
This is beta β <br/>
This is gamma Γ <br/>
</body>
</html>
It rendered as:
This is alpha α
This is beta β
This is gamma Γ
You may need to declare your charset:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
or check the encoding of your server (if it's a JSP).
The following Struts tag helped me solve this to an extent:
<s:property value="name" escape="false" />
I hope you're using JSPs. Add this directive at the top of the JSP that renders the symbols:
<%@ page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" %>
I have read all of the Java URL-encoding threads on here but still haven't found a solution to my problem: Google Chrome encodes "BŒUF" to "B%8CUF" in POST data, awesome. How can I convince Java to do the same? (The website has <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr"> and <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">, in case this is important.)
System.out.println(URLEncoder.encode("BŒUF", "utf-8"));
System.out.println(URLEncoder.encode("BŒUF", "iso-8859-1"));
System.out.println(URLEncoder.encode("BŒUF", "iso-8859-15"));
System.out.println(new URI("http","www.google.com","/ig/api","BŒUF", null).toASCIIString());
prints
B%C5%92UF
B%3FUF
B%BCUF
http://www.google.com/ig/api?B%C5%92UF
but not "B%8CUF"?
You are specifically looking for the windows-1252 encoding, not UTF-8:
System.out.println(URLEncoder.encode("BŒUF", "windows-1252"));
This gives:
B%8CUF
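The round trip can be checked with URLDecoder using the same charset:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;

public class Windows1252Demo {
    public static void main(String[] args) throws Exception {
        // Œ (U+0152) maps to the single byte 0x8C in windows-1252.
        String encoded = URLEncoder.encode("BŒUF", "windows-1252");
        System.out.println(encoded); // prints: B%8CUF
        System.out.println(URLDecoder.decode(encoded, "windows-1252")); // prints: BŒUF
    }
}
```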
I sometimes get the following error when I try to parse an XML file with Java (within a GAE server):
Parse: org.xml.sax.SAXParseException; lineNumber: 10; columnNumber: 3; The element type "META" must be terminated by the matching end-tag "</META>".
Yet it does not happen all the time; sometimes it works fine. The program parses XML files, and I normally have no problem with them.
This is the XML file I'm trying to parse:
http://www.fulhamchronicle.co.uk/london-chelsea-fc/rss.xml
Any help will be appreciated. Thanks.
Update:
Thanks for the answer. I changed my code to a different parser, and the good news is that the file now parses correctly.
The bad news is that the same problem has now moved to another feed: same error, same line, despite it being a completely different feed that worked perfectly before. Can anyone think of why this is happening?
That looks like it is a live document; i.e. one that changes fairly frequently. There is also no sign of a <meta> tag in it.
I can think of two explanations for what is happening:
Sometimes the document is being generated or created incorrectly.
Sometimes you are getting an HTML error page instead of the document you are expecting, and the XML parser can't cope with a <meta> tag in the HTML's <head>. That is because the <meta> tag in (valid) HTML does not need to have a matching / closing </meta> tag. (And for at least some versions of HTML, it is not allowed to have a closing tag.)
To track this down, you are going to have to capture the precise input that is causing the parse to fail.
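One way to do that — a sketch — is to buffer the raw bytes before handing them to the parser, so a failing input can be saved and inspected (the readAll helper is illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class CaptureFeed {
    // Reads the whole stream into memory so the exact bytes can be
    // written to a file if the XML parser later rejects them.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // In the real program the stream would come from the feed, e.g.:
        //   new URL("http://www.fulhamchronicle.co.uk/london-chelsea-fc/rss.xml").openStream()
        // A local stand-in shows the idea:
        byte[] bytes = readAll(new ByteArrayInputStream(
                "<rss version=\"2.0\"></rss>".getBytes("UTF-8")));
        System.out.println(bytes.length + " bytes captured");
        // On a failure, dump `bytes` to a file and check whether the server
        // returned RSS or an HTML error page; then hand the buffered copy
        // to your SAX/DOM parser.
        InputStream forParser = new ByteArrayInputStream(bytes);
    }
}
```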
There are two solutions:
You can try <meta/> instead of <meta>.
Add spring.thymeleaf.mode=LEGACYHTML5 to your application.properties file,
and add this dependency to your pom.xml or build.gradle file.
pom.xml:
<dependency>
    <groupId>net.sourceforge.nekohtml</groupId>
    <artifactId>nekohtml</artifactId>
    <version>1.9.21</version>
</dependency>
gradle:
compile 'net.sourceforge.nekohtml:nekohtml:1.9.21'
Just add a closing slash (/) at the end of every meta tag:
<meta name=" " content=" " />
For example, when using
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
it really works.
It is not XML but HTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/strict.dtd">
The XML parser will not parse it.
I see the file doesn't have any content, and it doesn't look like a valid RSS file. Maybe a server-side error is occurring.
Can you try this tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
I'm creating a webapp using Spring MVC, and some of the information I'm pulling is from a database, so it was edited elsewhere. Some imported values contain what I consider special characters, such as
“_blank”
as opposed to using the standard keyboard
"_blank".
When I display this in my website's textarea it displays fine, but when I save it back into the string on submitting the form in the Spring textarea, the string has ? where the 'special' characters were. They were obviously imported into a String fine, but somewhere in the save process the special characters are lost. Any idea what is causing this, or why?
Sounds like a character encoding problem. Try setting the character set of the page containing the form to UTF-8.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
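A small demonstration of why this happens — the curly quotes U+201C/U+201D don't exist in ISO-8859-1, so encoding through that charset substitutes '?' (a sketch of the symptom, not your actual save path):

```java
import java.nio.charset.StandardCharsets;

public class SmartQuoteDemo {
    public static void main(String[] args) {
        String curly = "\u201C_blank\u201D"; // “_blank”
        // ISO-8859-1 cannot represent curly quotes, so the encoder
        // replaces each unmappable character with '?'.
        byte[] latin1 = curly.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1)); // prints: ?_blank?
        // UTF-8 round-trips the string intact.
        byte[] utf8 = curly.getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8)); // prints: “_blank”
    }
}
```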
I have a Tomcat-based application that needs to submit a form capable of handling UTF-8 characters. When submitted via Ajax, the data is returned correctly from getParameter() in UTF-8. When submitted via a form POST, the data is returned from getParameter() in ISO-8859-1.
I used Fiddler and determined that the only difference between the requests is that charset=utf-8 is appended to the end of the Content-Type header in the Ajax call (as expected, since I send the content type explicitly).
ContentType from ajax:
"application/x-www-form-urlencoded; charset=utf-8"
ContentType from form:
"application/x-www-form-urlencoded"
I have the following settings:
ajax post (outputs chars correctly):
$.ajax({
    type: "POST",
    url: "blah",
    async: false,
    contentType: "application/x-www-form-urlencoded; charset=utf-8",
    data: data,
    success: function(data) {
    }
});
form post (outputs chars in iso)
<form id="leadform" enctype="application/x-www-form-urlencoded; charset=utf-8" method="post" accept-charset="utf-8" action="{//app/path}">
xml declaration:
<?xml version="1.0" encoding="utf-8"?>
Doctype:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
jvm parameters:
-Dfile.encoding=UTF-8
I have also tried using request.setCharacterEncoding("UTF-8"); but it seems as if Tomcat simply ignores it. I am not using the RequestDumper valve.
From what I've read, POST data encoding is mostly dependent on the page encoding where the form is. As far as I can tell, my page is correctly encoded in utf-8.
The sample JSP from this page works correctly. It simply uses setCharacterEncoding("UTF-8"); and echoes the data you post: http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
So to summarize, the POST request does not send the charset as utf-8, despite the page being in utf-8, the form parameters specifying utf-8, the XML declaration, or anything else. I have spent the better part of three days on this and am running out of ideas. Can anyone help me?
form post (outputs chars in iso)
<form id="leadform" enctype="application/x-www-form-urlencoded; charset=utf-8" method="post" accept-charset="utf-8" action="{//app/path}">
You don't need to specify the charset there. The browser will use the charset specified in the HTTP response header.
Just
<form id="leadform" method="post" action="{//app/path}">
is enough.
xml declaration:
<?xml version="1.0" encoding="utf-8"?>
Irrelevant. It's only relevant for XML parsers. Web browsers don't parse text/html as XML. This only matters on the server side (if you're using an XML-based view technology like Facelets or JSPX; on plain JSP it is superfluous).
Doctype:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
Irrelevant. It's only relevant for HTML parsers. Besides, it doesn't specify any charset; the one in the HTTP response header will be used instead. If you aren't using an XML-based view technology like Facelets or JSPX, this can just as well be <!DOCTYPE html>.
meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
Irrelevant. It's only relevant when the HTML page is viewed from local disk or parsed locally. Otherwise, the charset in the HTTP response header is used.
jvm parameters:
-Dfile.encoding=UTF-8
Irrelevant. This is specific to the Sun/Oracle(!) JVM and only affects the default encoding the JVM uses for reading files.
I have also tried using request.setCharacterEncoding("UTF-8"); but it seems as if Tomcat simply ignores it. I am not using the RequestDumper valve.
This only works when the request body has not been parsed yet (i.e. you haven't called getParameter() and so on beforehand). You need to call it as early as possible; a Filter is the perfect place for this. Otherwise it is ignored.
From what I've read, POST data encoding is mostly dependent on the page encoding where the form is. As far as I can tell, my page is correctly encoded in utf-8.
It's dependent on the HTTP response header.
All you need to do are the following three things:
Add the following to top of your JSP:
<%@ page pageEncoding="UTF-8" %>
This sets both the response encoding and the charset in the response header to UTF-8.
Create a Filter which does the following in doFilter() method:
if (request.getCharacterEncoding() == null) {
    request.setCharacterEncoding("UTF-8");
}
chain.doFilter(request, response);
This ensures that the POST request body is processed as UTF-8.
Change the <Connector> entry in Tomcat/conf/server.xml as follows:
<Connector (...) URIEncoding="UTF-8" />
This ensures that GET query strings are processed as UTF-8.
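The filter from the second step still needs to be registered so it runs before anything reads the request; a web.xml sketch (the class name com.example.CharacterEncodingFilter is illustrative):

```xml
<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>com.example.CharacterEncodingFilter</filter-class>
</filter>
<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```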
See also:
Unicode - How to get characters right? - contains practical background information and detailed solutions for Java EE web developers.
Try this :
How do I change how POST parameters are interpreted?
POST requests should specify the encoding of the parameters and values they send. Since many clients fail to set an explicit encoding, the default is used (ISO-8859-1). In many cases this is not the preferred interpretation so one can employ a javax.servlet.Filter to set request encodings. Writing such a filter is trivial. Furthermore Tomcat already comes with such an example filter.
Please take a look at:
5.x
webapps/servlets-examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
webapps/jsp-examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
6.x
webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
For more info, refer to the URL below:
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding
Have you tried accept-charset="UTF-8"? As you said, the data should be encoded according to the encoding of the page itself; it seems strange that Tomcat is ignoring that. Which browser are you trying this in?
Have you tried specifying useBodyEncodingForURL="true" for the HTTP connector in your conf/server.xml?
I implemented a filter based on the information in this post, and it is now working. However, this still doesn't explain why, even though the page was UTF-8, the charset used by Tomcat to interpret it was ISO-8859-1.