How to upload file with non-ASCII filename using REST-service? - java

I am creating a Java 7 REST service using Spring and Apache CXF.
public SuccessfulResponse uploadFile(@Multipart("report") Attachment attachment)
I use the "Content-Disposition" header to retrieve the file name. I've read about some solutions that are used for downloading files (for example, URL-encoding). But how do I cope with non-ASCII filenames on upload? Is it a client-side or a server-side solution? The signature of the method above can be changed. The client side uses the HTML5 File API + iframe.

My experience is that Content-Disposition doesn't handle UTF-8 in general. You can simply add another multipart field for the filename: multipart fields support a charset indication and handle UTF-8 characters if done correctly.

You can use UTF-8 for a filename (according to https://www.rfc-editor.org/rfc/rfc6266 and https://www.rfc-editor.org/rfc/rfc5987). In Spring, the easiest way is to use the org.springframework.http.ContentDisposition class. For example:
ContentDisposition disposition = ContentDisposition
.builder("attachment")
.filename("репорт.export", StandardCharsets.UTF_8)
.build();
return ResponseEntity
.ok()
.contentType(MediaType.APPLICATION_OCTET_STREAM)
.header(HttpHeaders.CONTENT_DISPOSITION, disposition.toString())
.body((out) -> messageService.exportMessages(out));
This example sends a file from the server (download). To upload a file you can follow the same RFC, except that the Content-Disposition header must be prepared in the browser, by JavaScript for example, and looks like:
Content-Disposition: attachment;
filename="EURO rates";
filename*=utf-8''%e2%82%ac%20rates
The filename parameter is optional in this case and serves as a fallback for systems that don't support RFC 6266 (it contains the ASCII file name). The value of filename* must be URL-encoded, that is, percent-encoded (https://www.url-encode-decode.com).
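On the server side of an upload, the filename* value has to be decoded again. The following is a minimal sketch of that step using only the JDK, assuming the value has already been extracted from the header as a string of the form charset'language'percent-encoded-text. The class name ExtValueDecoder is hypothetical and is not part of CXF or Spring.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper (not a CXF or Spring class): decodes an RFC 5987
// ext-value such as  utf-8''%e2%82%ac%20rates
final class ExtValueDecoder {

    static String decode(String extValue) {
        // Format: charset ' [language] ' percent-encoded value
        int firstQuote = extValue.indexOf('\'');
        int secondQuote = extValue.indexOf('\'', firstQuote + 1);
        if (firstQuote < 0 || secondQuote < 0) {
            throw new IllegalArgumentException("Not an ext-value: " + extValue);
        }
        Charset charset = Charset.forName(extValue.substring(0, firstQuote));
        String encoded = extValue.substring(secondQuote + 1);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char ch = encoded.charAt(i);
            if (ch == '%') {
                if (i + 3 > encoded.length()) {
                    throw new IllegalArgumentException("Truncated escape in: " + extValue);
                }
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Bad escape in: " + extValue);
                }
                bytes.write((hi << 4) | lo);
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(ch); // unescaped characters are plain ASCII
            }
        }
        // Interpret the collected bytes in the charset named by the sender.
        return new String(bytes.toByteArray(), charset);
    }
}
```

The percent-decoding is done by hand rather than with java.net.URLDecoder because URLDecoder also turns "+" into a space, which is form-encoding behaviour and not part of RFC 5987.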

Related

Multipart Request and Custom Header

For my current solution, I'm using apache commons FileUpload library to process incoming multi-part requests. I'm able to send the files appropriately then read the stream on the server end using the streaming api code here.
If you look at the format of multipart requests listed here, there is a Content-Disposition listed for each file added. I need to add a startByte tag (similar to how you add a "filename" tag in content-disposition). I'm not too sure how to do that properly and then retrieve it in the request? This is of course not a global header, because multiple files are in this stream.
Anyone have any ideas?
For anyone who might be interested, this turned out to be easy. On the client, you append the header like below:
outputStream.writeBytes("Content-Disposition: form-data; name=\"" + filename + "\"; filename=\"" + filename + "\";\r\n");
outputStream.writeBytes("My-Custom-Header: My-Data\r\n");
outputStream.writeBytes("\r\n");
Then, on the server, using commons FileUpload, you would just do:
FileItemHeaders headers = item.getHeaders();
headers.getHeader("My-Custom-Header");
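For the start-byte case from the question, the client side could be sketched as below. This is only an illustration under assumptions: the header name X-Start-Byte and the class MultipartPartWriter are made up for the example, and the caller is still responsible for the Content-Type request header and the closing "--boundary--" line after the last part.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: serialises one multipart/form-data part that carries
// an extra per-part header next to Content-Disposition.
final class MultipartPartWriter {

    static byte[] writePart(String boundary, String fieldName, String fileName,
                            long startByte, byte[] content) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n"
                // "X-Start-Byte" is an invented header name for this sketch.
                + "X-Start-Byte: " + startByte + "\r\n"
                + "\r\n"; // blank line separates part headers from part body

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] headBytes = head.getBytes(StandardCharsets.UTF_8);
        out.write(headBytes, 0, headBytes.length);
        out.write(content, 0, content.length);
        byte[] crlf = "\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(crlf, 0, crlf.length);
        return out.toByteArray();
    }
}
```

On the server, the value would then be read per part with item.getHeaders().getHeader("X-Start-Byte"), as in the answer above.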

Content-disposition header field not understood by IE, but works with Chrome

I can see the file name properly in Chrome, but in IE I get the encoded form of the filename
(e.g. =?UTF-8?B?5qWt55WM5pSv5Ye6UERGXzIwMTUwMjEwMTEwNjIy?=).
fileName = "=?UTF-8?B?" + new String(Base64.encodeBase64(fileName.getBytes("UTF-8")), "UTF-8") + "?=";
Please help me with this.
There is no nice cross browser solution for this. I found the nicely named article Test Cases for HTTP Content-Disposition header field (RFC 6266) and the Encodings defined in RFCs 2047, 2231 and 5987 which summarises what different browsers support. Although I couldn't see anything on the base64 mechanism you're using.
There are many different variants, try using the URL encoding form, see http://greenbytes.de/tech/tc2231/#attwithfn2231utf8. Example
Content-Disposition: attachment; filename*=UTF-8''foo-%c3%a4-%e2%82%ac.html
The test page reports better support for this form, assuming you only need IE versions later than IE8:
FF22 pass
MSIE8 unsupported
MSIE9 pass
Opera pass
Saf6 pass
Konq pass
Chr25 pass
Depending on what version of IE you're hoping to support you may have to do some dirty USER_AGENT sniffing and send a header.
if (isUserAgentIe(requestHeaders)) {
fileName = ...
} else {
fileName = ...
}
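One way to limit the sniffing is to send both parameters in a single header: a plain ASCII filename as fallback and the percent-encoded filename* for browsers that understand it. The sketch below assumes that approach; the class and method names are hypothetical, and how each browser picks between the two parameters should be checked against the test page linked above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a Content-Disposition value that carries an
// ASCII fallback name plus an RFC 5987 encoded filename* parameter.
final class ContentDispositionValue {

    static String attachment(String fileName) {
        // Fallback: replace anything outside printable ASCII, and the two
        // characters that would break the quoted-string, with '_'.
        StringBuilder fallback = new StringBuilder();
        for (int i = 0; i < fileName.length(); i++) {
            char ch = fileName.charAt(i);
            boolean safe = ch >= 0x20 && ch < 0x7F && ch != '"' && ch != '\\';
            fallback.append(safe ? ch : '_');
        }

        String encoded;
        try {
            // URLEncoder does form encoding, so fix the two differences:
            // space must be %20 and '*' is not allowed unescaped.
            encoded = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }

        return "attachment; filename=\"" + fallback + "\"; filename*=UTF-8''" + encoded;
    }
}
```

For example, attachment("€ rates.pdf") yields a header value with filename="_ rates.pdf" and filename*=UTF-8''%E2%82%AC%20rates.pdf.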

JavaMail - Attachment filename not displaying UTF-8 characters correctly

I am trying to send mails that may contain UTF-8 characters in subject, message body and in the attachment file name.
I am able to send UTF-8 characters as part of the subject as well as the message body. However, when I send an attachment with UTF-8 characters in its file name, the name is not displayed correctly.
So my question is: how can I set the attachment filename as UTF-8?
Here is part of my code:
MimeBodyPart pdfPart = new MimeBodyPart();
pdfPart.setDataHandler(new DataHandler(ds));
pdfPart.setFileName(filename);
mimeMultipart.addBodyPart(pdfPart);
Later edit:
I replaced
pdfPart.setFileName(filename);
with
pdfPart.setFileName(MimeUtility.encodeText(filename, "UTF-8", null));
and it is working perfectly.
Thanks all.
MIME headers (like Subject or Content-Disposition) must be MIME-encoded if they contain non-ASCII characters.
The encoding is either "quoted-printable" or "base64". I recommend quoted-printable.
See here: Java: Encode String in quoted-printable
I don't know how you send attachments. If you upload through a Tomcat server, the problem could be caused by the value of URIEncoding in conf/server.xml.
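To make the "base64" variant concrete, here is a minimal sketch of what a MIME encoded-word looks like, written against the JDK only. It assumes Java 8 or later for java.util.Base64, and the class name EncodedWord is hypothetical. It does not split long names into several encoded-words, so for real mail a library call such as MimeUtility.encodeText, as used in the question's edit, remains the better choice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: wraps text in a single RFC 2047 encoded-word using
// the "B" (base64) encoding and the UTF-8 charset.
final class EncodedWord {

    static String encodeB(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // Layout: =?charset?encoding?encoded-text?=
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }
}
```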

Uploading JSON and binary file in one request

I am looking to create a RESTful API for use with an Android and iOS app. So far I have been experimenting with using Jersey on the server and then the appropriate http libraries on the client side. At the moment I have been using multipart/related as the mimetype for the request with JSON forming the first part of the body then a jpeg image as the second.
So far I have had problems with making the request to the server, getting a 406 Not Acceptable from Jersey. I note that multipart/related is primarily used in sending emails. Is there actually a way that I can support mixed type content as an upload or have I entirely mis-understood the usage of multipart/related in this context?
You may want to look at this blog, for more information, but here is the important part to help you along:
http://www.mkyong.com/webservices/jax-rs/file-upload-example-in-jersey/
@POST
@Path("/upload")
@Consumes(MediaType.MULTIPART_FORM_DATA)
public Response uploadFile(
@FormDataParam("file") InputStream uploadedInputStream,
@FormDataParam("file") FormDataContentDisposition fileDetail) {
String uploadedFileLocation = "d://uploaded/" + fileDetail.getFileName();
// save it
writeToFile(uploadedInputStream, uploadedFileLocation);
String output = "File uploaded to : " + uploadedFileLocation;
return Response.status(200).entity(output).build();
}
I expect you want multipart/form-data instead, as this is part of the description of multipart/related:
The Multipart/Related media type is intended for compound objects
consisting of several inter-related body parts. For a
Multipart/Related object, proper display cannot be achieved by
individually displaying the constituent body parts. The content-type
of the Multipart/Related object is specified by the type parameter.
The "start" parameter, if given, points, via a content-ID, to the
body part that contains the object root. The default root is the
first body part within the Multipart/Related body.
For more on this mime type you can look at
https://www.rfc-editor.org/rfc/rfc2387
If you want to submit an image along with the JSON body, you can base64-encode the image and include the base64 string in the JSON. Then, on the server side, you base64-decode the string and upload the image file to the blobstore. See the file upload example (at the bottom of the page) here https://developers.google.com/appengine/docs/java/blobstore/overview
Alternatively, you could do a separate upload to the blobstore and get the blobkey for the uploaded image. You can then include the blobkey in the JSON body that you post to the server. With this approach you would need to get the upload URL every time you do a new image upload.
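The base64 approach can be sketched with the JDK alone (Java 8 or later for java.util.Base64). The field names "name" and "image" and the class ImageJson are assumptions made for the example, and the JSON is assembled by hand only to keep the sketch dependency-free; a real client would use a JSON library that handles escaping.

```java
import java.util.Base64;

// Hypothetical helper: embeds binary image data in a JSON document as a
// base64 string, and recovers the bytes again on the receiving side.
final class ImageJson {

    static String toJson(String name, byte[] image) {
        String b64 = Base64.getEncoder().encodeToString(image);
        // Assumes 'name' contains no characters that need JSON escaping.
        return "{\"name\":\"" + name + "\",\"image\":\"" + b64 + "\"}";
    }

    static byte[] imageFromBase64(String b64) {
        return Base64.getDecoder().decode(b64);
    }
}
```

Keep in mind that base64 inflates the payload by roughly a third, which matters on mobile connections.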

International characters in filename in multipart form data

I am using Apache HTTP components (4.1-alpha2) to upload files to Dropbox. This is done using multipart form data. What is the correct way to encode filenames that contain international (non-ASCII) characters in a multipart form?
If I use the standard API, the server returns an HTTP status Forbidden. If I modify the upload code so the file name is URL-encoded:
MultipartEntity entity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE);
FileBody bin = new FileBody(file_obj, URLEncoder.encode(file_obj.getName(), HTTP.UTF_8), HTTP.UTF_8, HTTP.OCTET_STREAM_TYPE );
entity.addPart("file", bin);
req.setEntity(entity);
The file is uploaded, but I end up with a filename that is still encoded. E.g. %D1%82%D0%B5%D1%81%D1%82.txt
To solve this issue specifically for the dropbox server I had to encode the filename in utf8. To do this I had to declare my multipart entity as follows:
MultipartEntity entity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE, null, Charset.forName(HTTP.UTF_8));
I was getting the forbidden because of the OAuth signed entity not matching the actual entity sent (it was being URL encoded).
For those interested on what the standards have to say on this I did some reading of RFCs.
If the standard is strictly adhered to, all headers should be encoded in 7bit, which would make UTF-8 encoding of the filename illegal. However, RFC 2388 states:
The original local file name may be
supplied as well, either as a
"filename" parameter either of the
"content-disposition: form-data"
header or, in the case of multiple
files, in a "content-disposition:
file" header of the subpart. The
sending application MAY supply a file
name; if the file name of the sender's
operating system is not in US-ASCII,
the file name might be approximated,
or encoded using the method of RFC
2231.
Many posts mention using either RFC 2231 or RFC 2047 to encode non-US-ASCII header values in 7bit. However, RFC 2047 explicitly states in section 5 (rule 3) that encoded words MUST NOT be used in a parameter of a Content-Disposition field. This leaves only RFC 2231, which is an extension and cannot be relied upon to be implemented in all servers. In reality, most of the major browsers send non-US-ASCII characters in UTF-8 (hence the HttpMultipartMode.BROWSER_COMPATIBLE mode in Apache HTTP client), and because of this most web servers support it. Another thing to note is that if you use HttpMultipartMode.STRICT on the multipart entity, the library will actually substitute a question mark (?) for each non-ASCII character in the filename.
I would have thought that the implementation of the FileBody would take responsibility for applying the appropriate rules from RFC 2047 itself. The filename would then be encoded as =?UTF-8?Q?=D1=82=D0=B5=D1=81=D1=82.txt?= or something very similar.
Quick fix:
new String(multipartFile.getOriginalFilename().getBytes ("iso-8859-1"), "UTF-8");
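Why this quick fix works can be shown in isolation: ISO-8859-1 maps every byte to exactly one character, so a UTF-8 name that was wrongly decoded as ISO-8859-1 still holds the original bytes and can be decoded a second time. The sketch below uses only the JDK; the class name FilenameRepair is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: repairs a string whose UTF-8 bytes were decoded as
// ISO-8859-1 by the container or framework.
final class FilenameRepair {

    static String repair(String misdecoded) {
        // Step 1: recover the raw bytes (lossless for ISO-8859-1).
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Step 2: decode them with the charset the client actually used.
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

This only helps when the name really was mis-decoded that way. Applied to a name that was already decoded correctly, it corrupts the non-ASCII characters instead.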
