After a lot of trial and error I still can't figure out the problem. The JSP, servlet, and database are all set to accept UTF-8, but whenever I call request.getParameter on anything containing multi-byte characters such as the em dash, they come back scrambled as broken characters.
I've made manual submissions to the database and it's able to accept these characters, no problem. And if I pull the text from the database in a servlet and print it in my jsp page's form it displays no problem.
The only time I've found that it comes back as broken characters is when I try and display it elsewhere after retrieving it using request.getParameter.
Has anyone else had this problem? How can I fix it?
That can happen if request and/or response encoding isn't properly set at all.
For GET requests, you need to configure it at the servlet container level. It's unclear which one you're using, but in Tomcat, for example, that's done with the URIEncoding attribute of the <Connector> element in /conf/server.xml.
<Connector ... URIEncoding="UTF-8">
For POST requests, you need to create a filter mapped on a URL pattern that covers all those POST requests, e.g. *.jsp or even /*. Do the following in doFilter():
request.setCharacterEncoding("UTF-8");
chain.doFilter(request, response);
For HTML responses and client-side encoding of submitted HTML form input values, you need to set the JSP page encoding. Add this to the top of the JSP (you've probably already done it properly, given that displaying UTF-8 straight from the DB works fine).
<%@page pageEncoding="UTF-8" %>
Or, to avoid copy-pasting this into every single JSP, configure it once in web.xml:
<jsp-config>
<jsp-property-group>
<url-pattern>*.jsp</url-pattern>
<page-encoding>UTF-8</page-encoding>
</jsp-property-group>
</jsp-config>
For source code files and stdout (IDE console), you need to set the IDE workspace encoding. It's unclear which one you're using, but in Eclipse, for example, that's done by setting Window > Preferences > General > Workspace > Text File Encoding to UTF-8.
Do note that the HTML <meta http-equiv> tag is ignored when the page is served over HTTP with a charset in the Content-Type response header; it's only considered when the page is opened from the local disk file system via file://. Specifying <form accept-charset> is also unnecessary, as it already defaults to the response encoding used when serving the HTML page with the form. See also the W3 HTML specification.
See also:
Unicode - How to get the characters right?
Why does POST not honor charset, but an AJAX request does? tomcat 6
HTML : Form does not send UTF-8 format inputs
Unicode characters in servlet application are shown as question marks
Bad UTF-8 encoding when writing to database (reading is OK)
BalusC's answer is correct but I just want to add it is important (for POST method of course) that
request.setCharacterEncoding("UTF-8");
is called before you read any parameter. This is how reading parameter is implemented:
@Override
public String getParameter(String name) {
    // Parameters are parsed lazily, on the first read only
    if (!parametersParsed) {
        parseParameters();
    }
    return coyoteRequest.getParameters().getParameter(name);
}
As you can see, there is a parametersParsed flag that is set the first time you read any parameter; at that point the parseParameters() method will parse all of the request's parameters and apply the encoding.
Calling:
request.setCharacterEncoding("UTF-8");
after the parameters were parsed will have no effect! That is why some people are complaining that setting the request's encoding is not working.
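To make the ordering problem concrete, here is a minimal sketch that mimics the lazy parsing without the servlet API. LazyRequest is a hypothetical stand-in, not a real container class, and the ISO-8859-1 default is an assumption about the container.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a container request; it only mimics the lazy parsing. */
final class LazyRequest {
    private final String rawBody; // form-encoded body, e.g. "name=%E2%80%94"
    private Charset encoding = StandardCharsets.ISO_8859_1; // assumed container default
    private Map<String, String> parameters; // stays null until the first read

    LazyRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    void setCharacterEncoding(Charset charset) {
        // Only affects a parse that has not happened yet
        this.encoding = charset;
    }

    String getParameter(String name) {
        if (parameters == null) {
            // First read: decode every parameter with the encoding known right now
            parameters = new HashMap<>();
            for (String pair : rawBody.split("&")) {
                String[] kv = pair.split("=", 2);
                parameters.put(decode(kv[0]), decode(kv.length > 1 ? kv[1] : ""));
            }
        }
        return parameters.get(name);
    }

    private String decode(String s) {
        try {
            return URLDecoder.decode(s, encoding.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LazyRequest early = new LazyRequest("name=%E2%80%94");
        early.setCharacterEncoding(StandardCharsets.UTF_8);
        System.out.println("\u2014".equals(early.getParameter("name")));

        LazyRequest late = new LazyRequest("name=%E2%80%94&x=1");
        late.getParameter("x"); // something reads a parameter first
        late.setCharacterEncoding(StandardCharsets.UTF_8); // too late
        System.out.println("\u2014".equals(late.getParameter("name")));
    }
}
```

In the second case the em dash has already been decoded with the default charset by the time UTF-8 is set, which is the same situation as a filter or library touching the parameters before your encoding filter runs.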
Most answers here suggest using a servlet filter and setting the character encoding there. This is correct, but be aware that some security libraries read request parameters before your filter runs (this was my case). If your filter executes after that, the request parameters have already been decoded, and setting UTF-8 or any other encoding will have no effect.
The Tomcat FAQ covers this topic pretty well. Particularly:
http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q8
and http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q4
The test JSP given in the FAQ is essentially the one I used when going through Tomcat years ago fixing various encoding issues.
Just want to add a point in case anyone else makes the same mistake as me and overlooks the POST method.
I read all these solutions and applied them to my code, but it still didn't work because I forgot to add method="POST" to my <form> tag.
Use a Filter as stated here: https://www.baeldung.com/tomcat-utf-8
P.S. If you are using a Servlet API older than 4.0 (where Filter.init and Filter.destroy are not yet default methods), you can easily work around it by defining empty "init" and "destroy" methods:
package sample;

import javax.servlet.*;
import java.io.IOException;

public class CharacterSetFilter implements Filter {

    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        request.setCharacterEncoding("UTF-8");
        response.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }

    public void init(FilterConfig filterConfig) throws ServletException {
    }

    public void destroy() {
    }
}
then, in web.xml:
<filter>
<filter-name>CharacterSetFilter</filter-name>
<filter-class>sample.CharacterSetFilter</filter-class>
</filter>
<filter-mapping>
<filter-name>CharacterSetFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
I have a Spring MVC project where I am using controller advice to handle errors thrown in controllers. However, I also want to display a nice error page if an error occurs within JSP files (even though this really shouldn't happen!). Therefore I have added the following to my project's web.xml file:
<error-page>
<error-code>500</error-code>
<location>/WEB-INF/views/application/error/view-error.jsp</location>
</error-page>
<error-page>
<exception-type>java.lang.Exception</exception-type>
<location>/WEB-INF/views/application/error/view-error.jsp</location>
</error-page>
If I trigger an error in JSTL on purpose, the contents of view-error.jsp are rendered fine. However, the content is appended to the output of the JSP file in which the error occurred. For instance, if an error occurs within display-users.jsp at line 50, the output that was generated before the error occurred (lines 1-50) is prepended to the contents of view-error.jsp.
This is very undesirable as it generates a funky looking error page. And since I cannot tell where an exception will be thrown (if I could, I would fix the error), then what the user sees is very likely to look bad.
I guess it's because the output is already in the buffer, and may already have been sent to the client? Is there any way I can fix this, or perhaps an alternative approach? Thanks!
This is a problem with large JSPs that generate big HTML pages with scriptlet Java code intermixed everywhere. As soon as enough data has been written, the server commits the headers (sends them to the client) and sends the beginning of the page. From that moment on, you can no longer roll anything back, because the data has already been received (and possibly displayed) by the browser.
That's one of the reasons why scriptlets are not recommended. If you really need to put some logic in the JSP, it should be at the beginning of the page, before anything is actually sent to the browser. Ideally, though, everything should be computed in advance in a servlet and the prepared data put in request attributes. That way the JSP only contains simple conditional or loop tags in addition to HTML output and the rendering of request attributes, all with little risk of throwing an exception.
It looks like the OutputStream of the HttpServletResponse is being written to before the entire JSP finishes rendering.
Ideally this should be controllable through the "autoFlush" property. https://tomcat.apache.org/tomcat-5.5-doc/jspapi/javax/servlet/jsp/JspWriter.html
But just in case it isn't solvable by that:
You could intercept anything that is written to the HttpServletResponse by using the HttpServletResponseWrapper approach.
The general idea is that you create a Filter, and that Filter passes a "Response Wrapper" to the layers below. This Response Wrapper holds a reference to the real Response instance. Anything that gets written to the Response can then be manipulated by the Response Wrapper before being sent to the real Response instance.
So, for your case, you could append all the data to a StringBuilder, and when control returns to the Filter, the Filter can print the entire StringBuilder to the real Response's OutputStream.
Here is an example that intercepts anything the Servlets, etc. write and then sends the GZipped version of that to the Browser:
http://tutorials.jenkov.com/java-servlets/gzip-servlet-filter.html
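As a rough illustration of the buffering idea only, the sketch below uses plain java.io instead of the servlet API. BufferedRender and its render method are hypothetical names; in a real application the buffer would sit inside an HttpServletResponseWrapper that a Filter passes down the chain.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.function.Consumer;

/** Hypothetical helper: renders into memory so a failure leaves no partial output. */
final class BufferedRender {

    /**
     * Runs the page against an in-memory writer. Returns the full output on
     * success, or the error page alone if rendering throws part-way through.
     */
    static String render(Consumer<PrintWriter> page, String errorPage) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        try {
            page.accept(out);
            out.flush();
            return buffer.toString();
        } catch (RuntimeException e) {
            // Whatever was written before the failure is discarded with the buffer
            return errorPage;
        }
    }

    public static void main(String[] args) {
        String ok = render(out -> out.print("<h1>Users</h1>"), "<h1>Error</h1>");
        String failed = render(out -> {
            out.print("<h1>Users</h1>");
            throw new IllegalStateException("simulated JSTL error");
        }, "<h1>Error</h1>");
        System.out.println(ok);
        System.out.println(failed);
    }
}
```

The trade-off is memory: the whole page is held in the buffer until rendering finishes, which may matter for very large responses.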
Been there, done that. Here's a quick and dirty workaround until you can redesign.
1) Place all the JSTL code that generates output in a new JSP -- let's call it display-users-view.jsp (call it whatever you want).
2) Import display-users-view.jsp from your display-users.jsp page via a <c:import>, but make sure you dump the contents to a var(!). e.g.:
<c:import url="display-users-view.jsp" var="output"/>
3) As a final step in display-users.jsp, dump the output to the screen with a simple:
${output}
Now, if the error is thrown before the ${output} .. no harm, no foul because you haven't output anything to the browser yet. If there is no error, the ${output} will dump the HTML that was generated in the display-users-view.jsp.
Note, by using c:import you don't have to pass any querystring or form params that were submitted to display-users.jsp because you will still have them available in your display-users-view.jsp.
I'm having some problem with java servlet's getParameter() which does not decode param even though I've set Tomcat's encoding properly in server.xml.
<Connector port.. URIEncoding="UTF-8"/>
If I decode the raw query myself I get the decoded query, but getParameter does not decode it by itself!
protected void service(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    System.out.println("CharacterEncoding: " + request.getCharacterEncoding());
    System.out.println("Query String: " + URLDecoder.decode(request.getQueryString(), "UTF-8"));
    System.out.println("Query param name: " + request.getParameter("name"));
    ...
The result I get is as follows:
CharacterEncoding: UTF-8
Query String: name=日本語一番ぜソFOX_&'">•«Ç€Ö™»_αß_iİıI_Администратор_cœur d´Ouchy_𠀀𠄂𪛖_عربي
Query param name: æ¥æ¬èªä¸çªãã½ï¼¦ï¼¯ï¼¸_&'">â¢Â«Ãâ¬Ãâ¢Â»_αÃ_iİıI_ÐдминиÑÑÑаÑоÑ_cÅur d´Ouchy_ð ð ðª_عربÙ
You can clearly see that the query and the name's value are not the same!
In my JSP page I'm using <%@page contentType="text/html; charset=UTF-8" %>
I understand that this concerns a GET request. Setting <Connector URIEncoding="UTF-8"> should do it. If it doesn't work, that most likely means you're running Tomcat from inside an IDE like Eclipse, and the IDE hasn't been configured to take over Tomcat's own configuration while you've edited Tomcat's own /conf/server.xml.
It's unclear which IDE you're using, but if it's Eclipse, you need to either edit the server.xml file in the workspace's Servers project instead of Tomcat's own /conf/server.xml file,
or configure Eclipse to take control of Tomcat's installation by double-clicking the Tomcat server entry in the Servers view and changing the Server Locations section accordingly.
Back to your investigation/fixing attempts: request.getCharacterEncoding() isn't used to decode GET query strings (that's beyond the control of the Servlet API); it's only used to decode POST request bodies. The <%@page pageEncoding="UTF-8"%> directive only sets the character encoding of the response and of subsequent form submits.
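The effect of the connector's charset can be reproduced outside the container. This is only a sketch: QueryDecoding is a hypothetical class, and the percent-encoded value is the UTF-8 form of the first three characters (日本語) of the query above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical demo: one percent-encoded value decoded with two charsets. */
final class QueryDecoding {

    static String decode(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of "日本語", percent-encoded as a browser would send them
        String encoded = "%E6%97%A5%E6%9C%AC%E8%AA%9E";

        String asUtf8 = decode(encoded, "UTF-8");        // 3 characters
        String asLatin1 = decode(encoded, "ISO-8859-1"); // 9 characters, one per byte

        System.out.println(asUtf8.length());
        System.out.println(asLatin1.length());
    }
}
```

Decoding with ISO-8859-1 turns each of the nine bytes into its own character, which is the same kind of garbage as in the "Query param name" line of the output above.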
See also
Unicode - How to get the characters right?
My Servlet just won't use UTF-8 for JSON responses.
MyServlet.java:
public class MyServlet extends HttpServlet {
    protected void doPost(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        // Problem: the writer is obtained before the encoding is set
        PrintWriter writer = res.getWriter();
        res.setCharacterEncoding("UTF-8");
        res.setContentType("application/json; charset=UTF-8");
        writer.print(getSomeJson());
    }
}
But special characters aren't showing up, and when I check the headers that I'm getting back in Firebug, I see Content-Type: application/json;charset=ISO-8859-1.
I did a grep -ri iso . in my Servlet directory, and came up with nothing, so nowhere am I explicitly setting the type to ISO-8859-1.
I should also specify that I'm running this on Tomcat 7 in Eclipse with a J2EE target as a development environment, with Solaris 10 and whatever they call their web server environment (somebody else admins this) as the production environment, and the behavior is the same.
I've also confirmed that the request submitted is UTF-8, and only the response is ISO-8859-1.
Update
I have amended the code to reflect that I am calling PrintWriter before I set the character encoding. I omitted this from my original example, and now I realize that this was the source of my problem. I read here that you have to set character encoding before you call HttpServletResponse.getWriter(), or getWriter will set it to ISO-8859-1 for you.
This was my problem. So the above example should be adjusted to
public class MyServlet extends HttpServlet {
    protected void doPost(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {
        // Set the encoding first, then ask for the writer
        res.setCharacterEncoding("UTF-8");
        res.setContentType("application/json");
        PrintWriter writer = res.getWriter();
        writer.print(getSomeJson());
    }
}
Once the encoding is set for a response, it cannot be changed.
The easiest way to force UTF-8 is to create your own filter which is the first to peek at the response and set the encoding.
Take a look at how Spring 3.0 does this. Even if you can't use Spring in your project, maybe you can get some inspiration (make sure your company policy allows you to get inspiration from open source licenses).
The code looks fine. Either you're not running the code you think you're running, or there's some Filter or proxy somewhere in the request-response chain which modifies the content type like that.
Aside from specific problem, you really should consider getting output stream, using JSON library to write contents directly as UTF-8 encoded JSON; there is no benefit to using writers.
Some JSON packages only work with strings, which is unfortunate, but most allow using more efficient streams (safer and more efficient as parser/generator can handle escaping and encoding aspects together).
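A minimal sketch of the stream-based approach, assuming the JSON text has already been produced by whatever library you use. JsonBytes and its methods are hypothetical names, and in a servlet the OutputStream would come from response.getOutputStream().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: sends JSON as explicit UTF-8 bytes, bypassing any Writer. */
final class JsonBytes {

    /** Encodes already-serialized JSON text as UTF-8. */
    static byte[] encodeJson(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    /** Writes the UTF-8 bytes straight to the given stream. */
    static void writeJson(String json, OutputStream out) throws IOException {
        out.write(encodeJson(json));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the response stream
        writeJson("{\"name\":\"\u00e9\"}", sink);
        System.out.println(sink.size()); // 12 characters, 13 bytes because é takes two
    }
}
```

Because the bytes are encoded explicitly, the result does not depend on the response's character encoding setting; the Content-Type header still has to declare the charset for the client.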
I have developed a Restlet application. I would like to return a JSP file on a URL request through Restlet. How can I achieve this without using a redirect?
i.e.
Let's say I have the file "contact.jsp" on mydomain.com and I want people to be able to access contact.jsp at http://mydomain.com/contact
Thus, in Restlet, I would have:
router.attach("/contact", MyResource.class);
But how can I return the "contact.jsp" page? I know that a redirect would work, but I don't want users to see the ".jsp" in "http://mydomain.com/contact.jsp"... or is there another strategy that would work without even using restlet? Maybe some modification of my web.xml file?
Edit (2009-08-14):
My answer posted below doesn't work on App-Engine and Restlet. It does work however, if I don't include Restlet, or allow Restlet to have a url-pattern of "/*"
What would be ideal is to have a subclass of the Router that allows me to do this:
router.attach("/contact", "/contact.jsp");
Thanks!
Edit (2009-08-17):
I'm surprised I haven't had any responses since I posted a bounty. Will someone comment and let me know if my question/problem isn't clear?
Edit (2009-08-17):
Interesting observation. When using the method described by "Rich Seller" below, it works when deployed on Google App-Engine but not locally. Additionally, if I call http://mydomain.com/contact.jsp on Google App-Engine, it bypasses Restlet and goes straight to the JSP. But locally, Restlet takes over: http://localhost:8080/contact.jsp does not go to the JSP and goes to Restlet instead. Do deployed App-Engine applications respond to URLs differently than their local counterparts?
Restlet doesn't currently support JSPs directly. They're difficult to handle outside of the servlet container.
There's a discussion on Nabble about this issue that you may find useful, at the moment it looks like you need to either return a redirect to the JSP mapped as normal in the web.xml, or hack it to process the JSP and return the stream as the representation.
The response dated "Apr 23, 2009; 03:02pm" in the thread describes how you could do the hack:
if (request instanceof HttpRequest &&
((HttpRequest) request).getHttpCall() instanceof ServletCall) {
ServletCall httpCall = (ServletCall) ((HttpRequest) request).getHttpCall();
// fetch the HTTP dispatcher
RequestDispatcher dispatcher = httpCall.getRequest().getRequestDispatcher("representation.jsp");
HttpServletRequest proxyReq = new HttpServletRequestWrapper(httpCall.getRequest());
// Overload the http response stream to grab the JSP output into a dedicated proxy buffer
// The BufferedServletResponseWrapper is a custom response wrapper that 'hijacks' the
// output of the JSP engine and stores it on the side instead of forwarding it to the original
// HTTP response.
// This is needed to avoid having the JSP engine mess with the actual HTTP stream of the
// current request, which must stay under the control of the restlet engine.
BufferedServletResponseWrapper proxyResp = new BufferedServletResponseWrapper(httpCall.getResponse());
// Add any objects to be encoded in the http request scope
proxyReq.setAttribute("myobjects", someObjects);
// Actual JSP encoding
dispatcher.include(proxyReq, proxyResp);
// Return the content of the proxy buffer
Representation rep = new InputRepresentation(proxyResp.toInputStream(), someMediaType);
}
The source for the BufferedServletResponseWrapper is posted a couple of entries later.
"I would like to return a JSP file on a URL request through Restlet" - My understanding is JSP's are converted to servlets. Since Servlets are orthogonol to Restlets not sure how you can return JSP file through Restlet.
Assuming you are asking for a way to use JSP in addition to Restlet, this is best achieved by mapping your restlets to a root path such as /rest instead of /* and using the .jsp as usual.
Looks like a simple web.xml configuration.
<servlet>
<servlet-name>contactServlet</servlet-name>
<jsp-file>/contact.jsp</jsp-file>
</servlet>
<servlet-mapping>
<servlet-name>contactServlet</servlet-name>
<url-pattern>/contact</url-pattern>
</servlet-mapping>
This works without Restlet in App-Engine. But once I include Restlet, it doesn't work if I set my Restlet url-pattern to "/*"
I have such a link in JSP page with encoding big5
http://hello/world?name=婀ㄉ
And when I input it in browser's URL bar, it will be changed to something like
http://hello/world?name=%23%24%23
And when we want to get this parameter in jsp page, all the characters are corrupted.
And we have set this:
request.setCharacterEncoding("UTF-8"), so all the requests will be converted to UTF8.
But why in this case, it doesn't work ?
Thanks in advance!.
When you enter the URL in the browser's address bar, the browser may convert the character encoding before URL-encoding. However, this behavior is not well defined; see my question:
Handling Character Encoding in URI on Tomcat
We mostly get UTF-8 and Latin-1 from newer browsers, but we get all kinds of encodings (including Big5) from old ones. So it's best to avoid non-ASCII characters in URLs entered directly by the user.
If the URL is embedded in a JSP, you can force it into UTF-8 by generating it like this:
String link = "http://hello/world?name=" + URLEncoder.encode(name, "UTF-8");
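As a self-contained sketch of that line, the helper below builds the link for the two characters from the question. LinkBuilder and buildLink are hypothetical names; URLEncoder.encode(String, String) is the standard JDK call.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: builds a link whose parameter is percent-encoded as UTF-8. */
final class LinkBuilder {

    static String buildLink(String base, String name) {
        try {
            // URLEncoder emits application/x-www-form-urlencoded output: spaces become '+'
            return base + "?name=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u5a40\u3109 are the two characters (婀ㄉ) from the question
        System.out.println(buildLink("http://hello/world", "\u5a40\u3109"));
        // prints http://hello/world?name=%E5%A9%80%E3%84%89
    }
}
```

Since the generated link is pure ASCII, it no longer matters which encoding the page itself (Big5 here) is served in.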
On Tomcat, the encoding needs to be specified on Connector like this,
<Connector port="8080" URIEncoding="UTF-8"/>
You also need to use request.setCharacterEncoding("UTF-8") for the body encoding, but it's not safe to set this in the servlet: it only works while the parameters have not yet been processed, and another filter or valve may already have triggered that processing. So you should do it in a filter. Tomcat comes with such a filter in the source distribution.
To avoid fiddling with the server.xml use :
protected static final String CHARSET_FOR_URL_ENCODING = "UTF-8";
protected String encodeString(String baseLink, String parameter)
throws UnsupportedEncodingException {
return String.format(baseLink + "%s",
URLEncoder.encode(parameter, CHARSET_FOR_URL_ENCODING));
}
// Used in the servlet code to generate GET requests
response.sendRedirect(encodeString("userlist?name=", name));
To actually get those parameters on Tomcat you need to do something like :
final String name =
new String(request.getParameter("name").getBytes("iso-8859-1"), "UTF-8");
This is needed because request.getParameter apparently URL-decodes the string and interprets the bytes as ISO-8859-1, or whatever URIEncoding is set to in server.xml. For an example of how to get the URIEncoding charset from the server.xml for Tomcat 7 see here
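The re-decoding trick can be checked in isolation. This is a sketch that assumes the container really did decode the bytes as ISO-8859-1; ParameterRepair is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: undoes an ISO-8859-1 decode of bytes that were really UTF-8. */
final class ParameterRepair {

    static String reinterpretAsUtf8(String misdecoded) {
        // ISO-8859-1 maps each char 0-255 back to the identical byte, so no data is lost
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate the container: UTF-8 bytes of an em dash read as ISO-8859-1
        String broken = new String("\u2014".getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println("\u2014".equals(reinterpretAsUtf8(broken))); // prints true
    }
}
```

Note that this is only safe when the value really was decoded as ISO-8859-1. If the container already decodes with UTF-8, applying it corrupts correctly decoded text, because characters outside Latin-1 are replaced with '?'.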
You cannot have non-ASCII characters in an URL - you always need to percent-encode them. When doing so, browsers have difficulties rendering them. Rendering works best if you encode the URL in UTF-8, and then percent-encode it. For your specific URL, this would give http://hello/world?name=%E5%A9%80%E3%84%89 (check your browser what it gives for this specific link). When you get the parameter in JSP, you need to explicitly unquote it, and then decode it from UTF-8, as the browser will send it as-is.
I had a problem with JBoss 7.0, and I think this filter solution also works with Tomcat:
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
HttpServletRequest httpRequest = (HttpServletRequest) request;
HttpServletResponse httpResponse = (HttpServletResponse) response;
try {
httpRequest.setCharacterEncoding(MyAppConfig.getAppSetting("System.Character.Encoding"));
String appServer = MyAppConfig.getAppSetting("System.AppServer");
if(appServer.equalsIgnoreCase("JBOSS7")) {
Field requestField = httpRequest.getClass().getDeclaredField("request");
requestField.setAccessible(true);
Object requestValue = requestField.get(httpRequest);
Field coyoteRequestField = requestValue.getClass().getDeclaredField("coyoteRequest");
coyoteRequestField.setAccessible(true);
Object coyoteRequestValue = coyoteRequestField.get(requestValue);
Method getParameters = coyoteRequestValue.getClass().getMethod("getParameters");
Object parameters = getParameters.invoke(coyoteRequestValue);
Method setQueryStringEncoding = parameters.getClass().getMethod("setQueryStringEncoding", String.class);
setQueryStringEncoding.invoke(parameters, MyAppConfig.getAppSetting("System.Character.Encoding"));
Method setEncoding = parameters.getClass().getMethod("setEncoding", String.class);
setEncoding.invoke(parameters, MyAppConfig.getAppSetting("System.Character.Encoding"));
}
} catch (NoSuchMethodException nsme) {
System.err.println(nsme.getLocalizedMessage());
nsme.printStackTrace();
MyLogger.logException(nsme);
} catch (InvocationTargetException ite) {
System.err.println(ite.getLocalizedMessage());
ite.printStackTrace();
MyLogger.logException(ite);
} catch (IllegalAccessException iae) {
System.err.println(iae.getLocalizedMessage());
iae.printStackTrace();
MyLogger.logException(iae);
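} catch(Exception e) {
MyLogger.logException(e);
}
try {
httpResponse.setCharacterEncoding(MyAppConfig.getAppSetting("System.Character.Encoding"));
} catch(Exception e) {
MyLogger.logException(e);
}
// Hand the request on to the rest of the chain; without this call the filter would block every request
chain.doFilter(request, response);
}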
I did quite a bit of searching on this issue so this might help others who are experiencing the same problem on tomcat. This is taken from http://wiki.apache.org/tomcat/FAQ/CharacterEncoding.
(How to use UTF-8 everywhere.)
1) Set URIEncoding="UTF-8" on your <Connector> in server.xml. References: HTTP Connector, AJP Connector.
2) Use a character encoding filter with the default encoding set to UTF-8.
3) Change all your JSPs to include the charset name in their contentType. For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for the pages in XML syntax (aka JSP Documents).
4) Change all your servlets to set the content type for responses and to include the charset name in the content type, so that it is UTF-8. Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8").
5) Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate.
6) Disable any valves or filters that may read request parameters before your character encoding filter or JSP page has a chance to set the encoding to UTF-8.