I'm having trouble uploading a file and parsing it as UTF-8 strings. I use the following code:
protected void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
Part filePart = request.getPart("file");
InputStream filecontent = filePart.getInputStream();
// ...
}
And my webpage looks like this:
<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<form action="UploadServlet" method="post" enctype="multipart/form-data">
<input type="file" name="file" />
<input type="submit" />
</form>
</body>
</html>
I found a great post about UTF-8 encoding in Java webapps, but unfortunately it didn't work for me. I still see random symbols in strings in the NetBeans debugger, and when I display them on a webpage, most of them are displayed correctly, but some Cyrillic letters (я, с, Н, А) get replaced by '�?'
A file upload with an HTML form doesn't use any character encoding. The file is transferred byte by byte, as is. See here under "multipart/form-data".
So if the original file at client side is a text file with UTF-8 character encoding, then on the server side it is also UTF-8.
Then you can use an InputStreamReader to decode the bytes as UTF-8 text:
InputStreamReader reader = new InputStreamReader(filecontent, "UTF-8");
That's it.
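To illustrate, here is a minimal self-contained sketch. A ByteArrayInputStream stands in for filePart.getInputStream(), and the names Utf8UploadReader and readLines are made up for this example; only the InputStreamReader with an explicit UTF-8 charset comes from the answer above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class Utf8UploadReader {

    // Decodes the raw bytes of the stream as UTF-8 and returns the text line by line.
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the uploaded file: two lines of Cyrillic text encoded as UTF-8.
        // The letters are written as Unicode escapes so the source file stays ASCII.
        byte[] uploaded = "\u044f \u0441\n\u041d \u0410".getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(uploaded))) {
            System.out.println(line);
        }
    }
}
```

Whether the letters then look right in a debugger or console also depends on that tool's own encoding and font, which this sketch does not control.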
javax.servlet.http.Part, which you use in the very first line of your method, has a getContentType() method that tells you the content type of the uploaded part. Nothing you have written so far constrains the uploaded data to any particular character set; you therefore need to determine the character set and deal with it accordingly.
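A rough sketch of that idea, assuming the value returned by getContentType() looks like "text/plain; charset=ISO-8859-1". The helper name charsetOf and the fallback parameter are hypothetical, not part of the servlet API. Browsers often send no charset parameter for file parts, so a fallback is usually needed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ContentTypeCharset {

    // Returns the charset named in a Content-Type value, or the fallback when the
    // value is null, has no charset parameter, or names an unknown charset.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // In a servlet, the first argument would be filePart.getContentType().
        System.out.println(charsetOf("text/plain; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/plain", StandardCharsets.UTF_8));
    }
}
```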
I'm trying to save an HtmlDocument (saved with UTF-8 encoding) which contains the Chinese character 𠜎 using HtmlEditorKit in the following way:
try (OutputStreamWriter f = new OutputStreamWriter(fileOutputStream, "UTF-8")) {
htmlEditorKit.write(f, htmlDocument, 0, htmlDocument.getLength());
} catch (BadLocationException e) {
logger.error("Could not save", e);
}
In the output HTML document I get two 2-byte characters (&#55361;&#57102;) instead of one 4-byte character. Java can work out which symbol it is by combining the two, but HTML can't.
Any suggestions on how to save it so that the HTML page is displayed correctly?
Here is the output HTML:
<html>
<head>
<meta content="text/html" charset="utf-8">
</head>
<body>
<p>𠜎</p>
</body>
</html>
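For what it's worth, the two numbers in the output, 55361 and 57102, are the UTF-16 high and low surrogates of a single code point. The sketch below uses only java.lang.Character to show how they combine; the class and method names are hypothetical. It does not change what HtmlEditorKit writes, it only shows the value a single numeric reference would need.

```java
final class SurrogatePairDemo {

    // Combines a UTF-16 surrogate pair into one decimal numeric character reference.
    static String toNumericReference(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return "&#" + Character.toCodePoint(high, low) + ";";
    }

    public static void main(String[] args) {
        // 55361 and 57102 are the two values from the question's output.
        System.out.println(toNumericReference((char) 55361, (char) 57102)); // prints &#132878;
    }
}
```

132878 is U+2070E, the character from the question, so one possible approach is to post-process the written HTML and merge each such pair into a single reference.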
I am building a Java web app. Everything works on my local Tomcat 7 server. I have a JSP file:
<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
In this file I send a form to my servlet (POST), and in my servlet I call:
request.setCharacterEncoding("UTF-8");
and it works. But on the Jelastic Tomcat server it doesn't work, and the Turkish characters 'ş', 'ğ', 'ı' are inserted into the MySQL database as '?'.
If I update the cells directly, they are shown correctly.
What can I do? I have tried everything I could find on the internet, but nothing changes.
Double-check the following settings, making sure everyone knows it's a UTF-8 party.
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Page Title</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta name="format-detection" content="telephone=no" />
</head>
<body>
your html content goes here....
</body>
</html>
The database tables use the utf-8 charset. I don't trust the DB defaults, which is why the CREATE statements specify it explicitly.
CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;
CREATE TABLE tMyTable (
id int(11) NOT NULL auto_increment,
code VARCHAR(20) NOT NULL,
name VARCHAR(20) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;
Let the JDBC connection know about the utf-8 charset.
<Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
maxActive="10" maxIdle="2" maxWait="10000"
username="myuid" password="mypwd"
driverClassName="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/mydb?useUnicode=true&amp;characterEncoding=utf8"
validationQuery="SELECT 1"
/>
Some Tomcat versions don't use the same charset source for GET and POST form requests, so add the useBodyEncodingForURI attribute to force the GET form parser to obey the setCharacterEncoding value.
<Connector port="8080"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
debug="0" connectionTimeout="20000"
disableUploadTimeout="true" useBodyEncodingForURI="true"
/>
The setCharacterEncoding call must happen before any filter or other code tries to read parameters from the request. So try to call it early as possible.
if (req.getCharacterEncoding() == null)
req.setCharacterEncoding("UTF-8");
Be careful with whitespace characters in a .jsp page. I use this technique to set multiple directive tags; note how each closing tag is immediately followed by the next opening tag.
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %><%@
page contentType="text/html; charset=UTF-8" pageEncoding="ISO-8859-1"
import="java.util.*,
java.io.*"
%><%
request.setCharacterEncoding("UTF-8");
String myvalue = "hello all and ÅÄÖ";
String param = request.getParameter("fieldName");
myvalue += " " + param;
%><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Page Title</title>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="format-detection" content="telephone=no" />
</head>
<body>
your html content goes here.... <%= myvalue %>
</body>
</html>
The JSP page's contentType attribute is what gets set on the HTTP response object, and pageEncoding is the encoding used for the file on disk. They don't need to match, and I usually use ISO-8859-1 if the page only contains safe US-ASCII characters. Don't save the file as UTF-8 with BOM, because the hidden leading BOM bytes may create problems on some J2EE servers.
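If you are unsure whether an editor saved a file with a BOM, the three marker bytes (EF BB BF) are easy to check for. A small sketch with hypothetical helper names; the byte array in main stands in for the first bytes of a .jsp file read from disk.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BomCheck {

    // True when the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // Returns a copy without the leading BOM, or the same array when there is none.
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data) ? Arrays.copyOfRange(data, 3, data.length) : data;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '%'};
        System.out.println(hasUtf8Bom(withBom)); // prints true
        System.out.println(new String(stripUtf8Bom(withBom), StandardCharsets.UTF_8)); // prints <%
    }
}
```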
The last thing is how you write strings to the response stream: if you convert strings to bytes, make sure the conversion uses utf-8 and let the client know it.
response.setContentType("text/html; charset=UTF-8");
response.getOutputStream().write( myData.getBytes("UTF-8") );
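The reason for naming the charset explicitly is that getBytes() without an argument uses the JVM's default charset, which can differ between a local machine and a hosted server. A small sketch; a ByteArrayOutputStream stands in for response.getOutputStream(), and the class and method names are invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ExplicitEncodingDemo {

    // Encodes text the same way regardless of the JVM's default charset.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three letters A-ring, A-diaeresis, O-diaeresis, written as escapes.
        String myData = "\u00c5\u00c4\u00d6";
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // stand-in for the response stream
        byte[] bytes = toUtf8(myData);
        out.write(bytes, 0, bytes.length);
        System.out.println(out.size()); // prints 6: each of these letters takes two bytes in UTF-8
    }
}
```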
This was a long post but it pretty much covers most corner issues.
The phrase "call it early as possible" in Whome's answer above hit the spot.
protected void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
if (request.getCharacterEncoding() == null) {
request.setCharacterEncoding("UTF-8");
}
String command = request.getParameter("command");
...
works. However,
protected void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
String command = request.getParameter("command");
if (request.getCharacterEncoding() == null) {
request.setCharacterEncoding("UTF-8");
}
...
doesn't work.
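A plausible explanation, stated as an assumption rather than something verified on Jelastic: the container decodes the request body the first time a parameter is read, and when no encoding has been set by then it typically falls back to ISO-8859-1, so a later setCharacterEncoding call has no effect. The sketch below simulates only that decoding step with plain JDK classes; the class and method names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RequestDecodingDemo {

    // Decodes raw request bytes with the given charset, as a container would
    // when the parameters are first parsed.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // s-cedilla, g-breve, dotless i: the Turkish letters from the question, as escapes.
        String typed = "\u015f\u011f\u0131";
        byte[] sentByBrowser = typed.getBytes(StandardCharsets.UTF_8);

        String early = decode(sentByBrowser, StandardCharsets.UTF_8);      // encoding set before parsing
        String late = decode(sentByBrowser, StandardCharsets.ISO_8859_1);  // parsed with the fallback

        System.out.println(early.equals(typed)); // prints true
        System.out.println(late.equals(typed));  // prints false
        System.out.println(late.length());       // prints 6: every byte became its own character
    }
}
```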
<%@page import="java.net.URLDecoder"%>
<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
pageEncoding="ISO-8859-1"%>
<%@page import="java.net.URLDecoder"%>
<%@page import="java.net.URLEncoder"%>
<html>
<form action="index.jsp">
<body>
First INPUT:
<input name="firstinput" type="text" name="fname">
<br>
<input type="submit" value="Submit">
<%
String first = request.getParameter("firstinput");
String Searchtext=URLDecoder.decode(first,"UTF-8");
out.println(Searchtext);
out.println(URLEncoder.encode(Searchtext,"UTF-8"));
%>
</body>
</form>
</html>
This is my code. I want to encode and decode text in JSP. What I actually want is that when the input text contains " ", ' ', / / or any other special character, only the plain text is printed: if the input is "hello" or hello, it should print hello, and if the input is 'hello', it should also print hello. Special characters should not be displayed. Please help me, I am unable to do this.
I think you need this:
String lWithoutSpecials = first.replaceAll("[^\\p{Alpha}]+","");
For me it works great:
String s = "\\Hello\\ \"Hello\" 'Hello'";
String lWithoutSpecials = s.replaceAll("[^\\p{Alpha}]+", "");
System.out.println(lWithoutSpecials);
Output:
HelloHelloHello
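One caveat: in java.util.regex, \p{Alpha} matches only ASCII letters unless the UNICODE_CHARACTER_CLASS flag is set, so the pattern above also strips letters such as Å or я. If those should be kept, \p{L} (any Unicode letter) is a possible alternative. A sketch comparing the two; the class and method names are made up.

```java
final class LetterFilterDemo {

    // Keeps ASCII letters only (the pattern from the answer above).
    static String asciiLettersOnly(String s) {
        return s.replaceAll("[^\\p{Alpha}]+", "");
    }

    // Keeps letters of any script, e.g. Latin with diacritics or Cyrillic.
    static String unicodeLettersOnly(String s) {
        return s.replaceAll("[^\\p{L}]+", "");
    }

    public static void main(String[] args) {
        // The input is 'hello' in single quotes followed by A-ring (written as an escape).
        String input = "'hello' \u00c5";
        System.out.println(asciiLettersOnly(input));            // prints hello
        System.out.println(unicodeLettersOnly(input).length()); // prints 6: hello plus the A-ring
    }
}
```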
You are not using full Unicode but Latin-1 (ISO-8859-1). Browsers will
interpret this Latin-1 as MS Windows Latin-1, i.e. "Cp1252"/"Windows-1252". This charset has some extra special characters, such as comma-like quotes, € (euro), etcetera.
URL encoding/decoding is done automatically. With a restricted charset like Latin-1, data entered in the input may
arrive at the server as numeric HTML entities, like &#1234;. To use UTF-8, which covers all Unicode characters, add accept-charset="UTF-8" to the <form> tag to prevent substitution by numeric entities.
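Because the container has already decoded the value by the time request.getParameter() returns, decoding it a second time with URLDecoder can change the text. A sketch of that effect using only java.net classes (the Charset overloads need Java 10 or later); containerDecode is a made-up stand-in for what the container does.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DoubleDecodeDemo {

    // Stand-in for the single decode the servlet container performs itself.
    static String containerDecode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    // The extra decode from the question's JSP, applied to an already decoded value.
    static String decodeAgain(String alreadyDecoded) {
        return URLDecoder.decode(alreadyDecoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String onTheWire = URLEncoder.encode("1+1=2", StandardCharsets.UTF_8);
        System.out.println(onTheWire);                  // prints 1%2B1%3D2
        String parameter = containerDecode(onTheWire);
        System.out.println(parameter);                  // prints 1+1=2
        System.out.println(decodeAgain(parameter));     // prints 1 1=2 (the plus sign is lost)
    }
}
```

A value containing a lone % (for example "100%") would make the second decode throw an IllegalArgumentException.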
A HTML 5 form:
<%@page language="java" contentType="text/html; charset=Windows-1252"
pageEncoding="Windows-1252"
import="java.net.URLDecoder"
import="java.net.URLEncoder"
%><!DOCTYPE html>
<html>
<head>
<title>First Input</title>
<meta charset="ISO-8859-1">
</head>
<body>
<form action="index.jsp">
First INPUT:
<input name="firstinput" type="text"
value="${param.firstinput}">
<br>
<input type="submit" value="Submit">
<%
String first = request.getParameter("firstinput");
String searchtext = first;
out.println(searchtext);
%>
</form>
</body>
</html>
The page lies, declaring the limited ISO-8859-1 as its charset, while Java delivers the larger Windows-1252 charset.
The tag <form> must be inside the <body>. If you did that for form margins and such, use CSS styles.
The goal is to upload large files (videos) and get a public shared URL for them.
It looks pretty straightforward, but I have spent a bit more than a day going through the documentation and didn't find any sample of that.
I have the following code to upload to Google Cloud Storage, which works fine, but I would like to add an option in the URL to set the file's ACL to "public-read", either before the upload in the JSP or afterwards in the servlet.
<%@ page import="com.google.appengine.api.blobstore.BlobstoreServiceFactory" %>
<%@ page import="com.google.appengine.api.blobstore.BlobstoreService" %>
<%@ page import="com.google.appengine.api.blobstore.UploadOptions" %>
<%
BlobstoreService blobstoreService = BlobstoreServiceFactory.getBlobstoreService();
String uploadUrl = blobstoreService.createUploadUrl("/ajax?act=user&act2=video_upload", UploadOptions.Builder.withGoogleStorageBucketName("vidaao"));
%>
<%@ page language="java" contentType="text/html; charset=ISO-8859-1" pageEncoding="ISO-8859-1"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Upload Page</title>
</head>
<body>
<h1>Upload v3</h1>
<form name="form1" id="form1" action="<% out.print(uploadUrl); %>" method="post" enctype="multipart/form-data" target="upload_iframe">
<input type="hidden" name="hiddenfield1" value="ok">
Files to upload:
<br/>
<input type="file" name="myFile">
<br/>
<button type="submit">Send</button>
</form>
<iframe id="upload_iframe" name="upload_iframe"></iframe>
</body>
</html>
Then, somewhere in my servlet, the redirect URL ends up here, where the blob key is generated:
public String upload(HttpServletRequest req, HttpServletResponse res) throws Exception{
Map<String, BlobKey> blobs = blobstoreService.getUploadedBlobs(req);
BlobKey blobKey = blobs.get("myFile");
if (blobKey == null) {
throw new Exception("Error file not uploaded");
}
//TODO: HERE get the public shared url of the file
return " blob key = " + blobKey.getKeyString();
}
At this step I would like to get the public shared URL for Google Cloud Storage, if that's possible.
(I cannot serve the file through a servlet because it might time out.)
By default files uploaded to bigstore using createUploadUrl are private. You would need to modify the ACL yourself to make it public.
Also, you can use serve() to return blobs of unlimited size from Google Storage from your servlet without concerns for timeout if you would prefer to do it that way rather than making the blobs public.
I want to invoke my JSP page from my index.html. This is the HTML code:
<html>
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<form action="DownloadFile.jsp">
<body>
<div>Click here Download File from Server...</div>
<input type="submit" name="downloadButton" value="Download..." />
</body>
</form>
</html>
JSP PAGE:
<%
String filename = "Sample1.zip";
String filepath = "e:\\temp\\";
response.setContentType("APPLICATION/OCTET-STREAM");
response.setHeader("Content-Disposition","attachment; filename=\"" + filename + "\"");
java.io.FileInputStream fileInputStream = new java.io.FileInputStream(filepath + filename);
int i;
while ((i=fileInputStream.read()) != -1) {
out.write(i);
}
fileInputStream.close();
%>
But when I press the Download button, it just shows the JSP file's content as HTML; it does not start downloading the file. What is the problem here?
Also, I can't download .docx and .jpg files correctly; I get a message that the file may be corrupted.
Please guide me in solving both issues.
Is there a common way to download all types of files in JSP?
Your server either does not support JSP or is not configured for it.
You need a JSP capable server.
Have you configured the servlet engine with your web server and set it up to forward requests for JSP files to the servlet engine?
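On the corrupted .docx and .jpg files: a likely cause, though I have not run the code from the question, is that the JSP's implicit out object is a character writer (JspWriter), so each byte is written as a character and bytes outside ASCII can be altered on the way. Binary data is usually copied through a byte stream instead, for example response.getOutputStream() in a servlet. The sketch below shows only the copy loop, with in-memory streams standing in for the file and the response; the class and method names are invented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class BinaryCopy {

    // Copies all bytes unchanged from in to out and returns how many were copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a binary file: every possible byte value once.
        byte[] file = new byte[256];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) i;
        }
        ByteArrayOutputStream response = new ByteArrayOutputStream(); // stand-in for the response stream
        System.out.println(copy(new ByteArrayInputStream(file), response)); // prints 256
    }
}
```

Reading into a buffer is also considerably faster than the one-byte-at-a-time loop in the question.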
Actually, you do not need JSP to download your content. Instead, if you want to trigger the download from the client side, you can use HTML5:
<!DOCTYPE html>
<html>
<body>
<p>Click on the hyperlink below to download any such file:</p>
<a href="5.csv" download>
test
</a>