How to convert a mail body from iso-8859-2 into utf-8 - java

I'm using JavaMail to handle mail from my mailbox, and I have the following problem.
The content type of the mail is: Content-Type: text/plain; charset=iso-8859-2 (that's only in this case; I have to handle various mails with cp-1250 and many other charsets).
To get the body of this mail I'm using:
mimeMessage.getContent().toString()
The content of the mail is returned, but it is not converted into utf-8; it is still in iso-8859-2, which produces garbled text:
w ďż˝omďż˝y. instead of w Łomży.
How can I handle bodies in encodings other than utf-8 and convert them to utf-8? Is there any way to do this in JavaMail, that is, can JavaMail convert a mail body from its declared charset into utf-8?
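A note on the mechanics, with a minimal sketch. A Java String carries no charset of its own; the charset only matters at the moment bytes are decoded. If the raw bytes of the body are decoded with the charset named in the Content-Type header, the resulting String can then be written out as utf-8 or anything else. The class and method names below (`MailBodyDecoder`, `decode`) are hypothetical and the code uses only the JDK; with JavaMail the raw bytes would presumably be read from the part's `getInputStream()`, which is an assumption about the wiring and is not shown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MailBodyDecoder {

    /** Decodes raw body bytes with the charset declared in the Content-Type header. */
    static String decode(byte[] rawBody, String declaredCharset) {
        // Charset.forName throws if the name is illegal or not supported on this JVM.
        return new String(rawBody, Charset.forName(declaredCharset));
    }

    public static void main(String[] args) {
        // "w \u0141om\u017cy." is the Polish sample from the question, written with escapes
        // so that the source file encoding does not matter.
        String sample = "w \u0141om\u017cy.";

        // Simulate a body that arrived as iso-8859-2.
        byte[] latin2 = sample.getBytes(Charset.forName("ISO-8859-2"));

        String text = decode(latin2, "iso-8859-2");
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("decoded correctly: " + text.equals(sample));
        System.out.println("iso-8859-2 bytes: " + latin2.length + ", utf-8 bytes: " + utf8.length);
    }
}
```

If the text is already broken by the time it is a String, the bytes were most likely decoded with the wrong charset earlier in the chain, so that is the place to look.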

Related

Set "mail.strictly_mime.parm_folding" in javamail

I use javamail to send mail with attachments that have long filenames. Javamail follows the more recent RFC and splits the filename across two lines of the mail's header, as in this example:
------=_Part_0_978693914.1433356404377
Content-Disposition: ATTACHMENT;
filename*0="=?UTF-8?Q?arquivo_com_nome_grande_e_acentua=C3=A7=C3=A3o.png\"; f";
filename*1="ilename*1=\"?="
Content-Type: APPLICATION/OCTET-STREAM;
name*0="=?UTF-8?Q?arquivo_com_nome_grande_e_acentua=C3=A7=C3=A3o.png\"; n";
name*1="ame*1=\"?="
Content-Transfer-Encoding: BASE64
Mail clients like Outlook don't understand it, so I need to make javamail not split the filename across two lines.
Reading the RFC, I found an attribute that says not to split:
"mail.strictly_mime.parm_folding"
How do I set it in javamail?
The mail.strictly_mime.parm_folding property is a Thunderbird setting; it's not in the RFC.
According to this Thunderbird article, Outlook doesn't support RFC 2231, which JavaMail is using to encode the filename parameter. You can disable RFC 2231 encoding by setting the JavaMail System property "mail.mime.encodeparameters" to "false". You'll probably want to set the System property "mail.mime.encodefilename" to "true" to use the non-standard filename encoding that Outlook supports.
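A minimal sketch of setting the two properties named above. They are JVM System properties rather than Session properties, so they can also be passed on the command line as -Dmail.mime.encodeparameters=false and -Dmail.mime.encodefilename=true. The class name `OutlookFriendlyMailProps` is hypothetical, and calling it before any JavaMail class is used is a cautious assumption (my understanding is that JavaMail reads these values once).

```java
final class OutlookFriendlyMailProps {

    /** Applies the two System properties named in the answer above. */
    static void apply() {
        // Turn off RFC 2231 parameter encoding (the filename*0= / filename*1= form).
        System.setProperty("mail.mime.encodeparameters", "false");
        // Encode non-ASCII filenames as RFC 2047 encoded-words instead (non-standard).
        System.setProperty("mail.mime.encodefilename", "true");
    }

    public static void main(String[] args) {
        // Assumption: this runs before any JavaMail message is built.
        apply();
        System.out.println("mail.mime.encodeparameters=" + System.getProperty("mail.mime.encodeparameters"));
        System.out.println("mail.mime.encodefilename=" + System.getProperty("mail.mime.encodefilename"));
    }
}
```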
I ran into this problem on WildFly Server 10.x.
I solved it by adding format="flowed" to the Content-Type:
MimeBodyPart part = new MimeBodyPart();
part.addHeader("Content-Type", "application/pdf; charset=\"UTF-8\"; format=\"flowed\" ");
part.setFileName(MimeUtility.encodeText(file.getName(), "UTF-8", null));
//setDataHandler

JavaMail - Attachment filename not displaying UTF-8 characters correctly

I am trying to send mails that may contain UTF-8 characters in subject, message body and in the attachment file name.
I am able to send UTF-8 characters as part of the subject as well as the message body. However, when I send an attachment with UTF-8 characters in its file name, the name is not displayed correctly.
So my question is: how can I set the attachment filename as UTF-8?
Here is part of my code:
MimeBodyPart pdfPart = new MimeBodyPart();
pdfPart.setDataHandler(new DataHandler(ds));
pdfPart.setFileName(filename);
mimeMultipart.addBodyPart(pdfPart);
Later edit:
I replaced
pdfPart.setFileName(filename);
with
pdfPart.setFileName(MimeUtility.encodeText(filename, "UTF-8", null));
and it is working perfectly.
Thanks all.
MIME headers (like Subject or Content-Disposition) must be MIME-encoded if they contain non-ASCII characters.
The encoding is either "quoted-printable" or "base64". I recommend quoted-printable.
See here: Java: Encode String in quoted-printable
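To make the header encoding concrete, here is a rough sketch of the base64 ("B") form of an RFC 2047 encoded-word, built with the JDK only. It is an illustration of the wire format, not a replacement for MimeUtility.encodeText: the names `EncodedWordSketch`, `encodeB` and `decodeB` are hypothetical, and the sketch does not split long values, although RFC 2047 limits a single encoded-word to 75 characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class EncodedWordSketch {

    private static final String PREFIX = "=?UTF-8?B?";
    private static final String SUFFIX = "?=";

    /** Wraps text as one RFC 2047 "B" encoded-word in UTF-8 (no length splitting). */
    static String encodeB(String text) {
        String b64 = Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
        return PREFIX + b64 + SUFFIX;
    }

    /** Reverses encodeB; only understands the exact form produced above. */
    static String decodeB(String word) {
        String b64 = word.substring(PREFIX.length(), word.length() - SUFFIX.length());
        return new String(Base64.getDecoder().decode(b64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical file name with non-ASCII letters ("acentua\u00e7\u00e3o.pdf").
        String filename = "acentua\u00e7\u00e3o.pdf";
        String encoded = encodeB(filename);
        System.out.println(encoded);
        System.out.println("round trip ok: " + decodeB(encoded).equals(filename));
    }
}
```

The encoded value contains only ASCII, which is why it survives in a header where the raw UTF-8 name would not.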
I don't know how you send attachments. If they are uploaded through a Tomcat server, the problem could be caused by the value of URIEncoding in conf/server.xml.

Chinese letters not read properly using Java Mail API

I have an email listener which reads mail from Gmail. When I send a mail containing Chinese characters from an Outlook client, the encoding is set to gb2312, which causes part.getContent() in the Java Mail API to return an incorrect result.
If the encoding on the client is set to Chinese Big5, the program works properly, but we can't change the encoding in the Outlook client. Is there a way to read the mail with the Java Mail API while setting the content type, or any alternative approach to get the proper content?
https://community.oracle.com/message/5440489#5440489
I used the GBK charset to read all GB2312 files, since GB2312 is a subset of GBK.
The following then should work with a bit of luck:
String content = mail. ...
// The bytes as sent, and then interpreted as gb2312:
byte[] bytes = content.getBytes("gb2312");
// Now correctly interpret the bytes as Big5:
content = new String(bytes, "Big5");
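Going back to the GBK suggestion earlier in this thread, a minimal sketch of it could look like the following. The idea is to decode the raw body bytes yourself and substitute GBK whenever the declared charset is gb2312; since GB2312 is a subset of GBK, valid gb2312 text decodes the same way. The names `Gb2312AsGbk` and `decodeBody` are hypothetical, and the sketch assumes the JVM ships the GBK and GB2312 charsets (a full JDK normally does).

```java
import java.nio.charset.Charset;

final class Gb2312AsGbk {

    /** Decodes body bytes, reading anything labelled gb2312 with the GBK superset. */
    static String decodeBody(byte[] raw, String declaredCharset) {
        String effective = "gb2312".equalsIgnoreCase(declaredCharset) ? "GBK" : declaredCharset;
        return new String(raw, Charset.forName(effective));
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese (language)", written with escapes.
        String sample = "\u4e2d\u6587";
        byte[] asSent = sample.getBytes(Charset.forName("GB2312"));

        String decoded = decodeBody(asSent, "gb2312");
        System.out.println("decoded correctly: " + decoded.equals(sample));
    }
}
```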

Send winmail format to Outlook and trigger Confirmation (email and ics should not be separated)

I would like to know what makes Outlook respond to invitations sent from another Outlook in the following way:
In case it is not clear from the screenshot, Outlook in this case asks me to confirm my attendance.
I have a program that sends an ICAL file. The ICAL file is properly sent as an attachment.
The file looks like this:
BEGIN:VCALENDAR
PRODID:TODO
VERSION:2.0
METHOD:PUBLISH
BEGIN:VEVENT
CLASS:PUBLIC
DESCRIPTION:Parameter: Value\nAuftrags-Nr.: \nVorschrift: 12\nZyklus: 12\nKommentar_1: \nKommentar_2: \nKommentar_3: 12\nPr?fstand: TODO\nV-Nr.: \nSMKL: 2\nDatum-Startzeit: TODO\nModel-Typschluessel: TODO\nCoastDowm: TODO\nBerechnen:
+TODO\nKommentar_4: TODO\nKommentar_5: TODO\nSchaltpunkttabelle: TODO\nAdd Test: TODO\nAdd Messtechnik: TODO\nKonfiguration MT: TODO\nAnwesenheit SB: TODO\n
ATTENDEE;CN=Pr?fstand; RSVP=TRUE:oz@domain.com
DTSTART:20130123T131951Z
DTEND:20130123T151951Z
DTSTAMP:20130123T131956Z
LOCATION:12
ORGANIZER;CN=wurst:MAILTO:wurst@wurstkeuche.de
PRIORITY:5
SEQUENCE:0
SUMMARY;LANGUAGE=de:Abgastest
TRANSP:OPAQUE
UID:ac4fc017-0944-4f9f-bfd1-3ffc07b486a9
X-MICROSOFT-CDO-BUSYSTATUS:FREE
X-MICROSOFT-CDO-IMPORTANCE:1
X-MICROSOFT-DISALLOW-COUNTER:TRUE
X-MS-OLK-ALLOWEXTERNCHECK:TRUE
X-MS-OLK-AUTOFILLLOCATION:FALSE
X-MS-OLK-CONFTYPE:0
BEGIN:VALARM
TRIGGER:-PT15
ACTION:DISPLAY
DESCRIPTION:Reminder
END:VALARM
END:VEVENT
END:VCALENDAR
When received in Outlook, it is seen as an Attachment:
I looked in the E-Mail properties and managed to find the following differences:
Outlook sends the appointment in a binary file:
Content-Type: application/ms-tnef; name="winmail.dat"
Content-Transfer-Encoding: binary
My program sends:
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
When I forward the "correct" appointment email (with the "winmail.dat" type) from Outlook to myself and open it with mutt, I see that the email is composed of a few pieces:
I 1 <no description> [multipa/alternativ, 7bit, 8.1K]
I 2 ├─><no description> [text/plain, base64, utf-8, 1.4K]
I 3 ├─><no description> [text/html, base64, utf-8, 2.8K]
I 4 └─><no description> [text/calendar, base64, utf-8, 3.3K]
If I forward the same mail to mutt, again back to Outlook, it looks like this:
The content of the mail is:
Content-Type: multipart/mixed; boundary="bKyqfOwhbdpXa4YI"
Content-Disposition: inline
So, I suspect that the behavior I want to achieve is controlled inside the winmail.dat and not by a parameter inside the ICS file.
I must also add that my code is in Java, and while reading about winmail.dat I found a Java library that creates winmail.dat. But I don't know which property inside the binary format will trigger this behavior.
My first question is then:
Can I emulate this behavior using text mails only?
The second question is:
If you can't emulate this behavior (probably not) in plain text, does someone know the right property to set in the binary format?
Outlook will be perfectly happy if you send the invitation as a MIME message with the content type "text/calendar; method=REQUEST".
There is no reason to use winmail.dat.
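As a rough sketch of that suggestion: the MIME part carries the content type "text/calendar; method=REQUEST", and the calendar body itself declares METHOD:REQUEST (the file in the question uses METHOD:PUBLISH). The code below only builds the two strings with the JDK; how they are attached to a message is left out. The class name, the PRODID value, the example addresses and the added charset parameter are all assumptions made for illustration.

```java
final class CalendarInviteSketch {

    // "text/calendar; method=REQUEST" comes from the answer; the charset parameter is an assumption.
    static final String CONTENT_TYPE = "text/calendar; method=REQUEST; charset=UTF-8";

    /** Builds a minimal iCalendar body whose METHOD matches the MIME "method" parameter. */
    static String buildInvite(String uid, String organizer, String attendee,
                              String startUtc, String endUtc, String summary) {
        String crlf = "\r\n"; // iCalendar content lines are terminated by CRLF
        return String.join(crlf,
                "BEGIN:VCALENDAR",
                "PRODID:-//example//invite sketch//EN", // hypothetical product id
                "VERSION:2.0",
                "METHOD:REQUEST",
                "BEGIN:VEVENT",
                "UID:" + uid,
                "DTSTAMP:" + startUtc,
                "DTSTART:" + startUtc,
                "DTEND:" + endUtc,
                "ORGANIZER:MAILTO:" + organizer,
                "ATTENDEE;RSVP=TRUE:MAILTO:" + attendee,
                "SUMMARY:" + summary,
                "END:VEVENT",
                "END:VCALENDAR") + crlf;
    }

    public static void main(String[] args) {
        // All values below are made-up examples.
        String ics = buildInvite("example-uid-1", "organizer@example.com", "attendee@example.com",
                "20130123T131951Z", "20130123T151951Z", "Abgastest");
        System.out.println("Content-Type: " + CONTENT_TYPE);
        System.out.print(ics);
    }
}
```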

Jersey REST WS - request body UTF-8

I have a simple Jersey REST web service:
@POST
@Path("/label")
@Consumes(MediaType.TEXT_HTML)
public Response setLabels(String requestBody) {
    System.out.println(requestBody);
    ......
}
The request passes some text with "special" non-English characters:
[{"За обекта"}]
I can see in Firebug that the request was sent with the correct UTF-8 content and charset:
Content-Type text/plain; charset=UTF-8
However, the server output does not show the expected characters:
[{"?? ??????"}]
Any idea what went wrong, and where? How can I capture the text in the correct charset on the server side?
System.out is a PrintStream. It uses the platform default encoding, which is typically not UTF-8. So you are getting the correct data in, it's just getting mangled when you print it to the console.
I had the exact same problem a few weeks ago - drove me nuts until I figured it out. What made it worse is that I actually had an encoding-related bug in another part of the code.
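To check the point about System.out, here is a small JDK-only sketch. It prints the request body from the question through a PrintStream with an explicit encoding and captures the bytes, which shows that the String itself is intact and only the stream's encoding decides what comes out. The names `ConsoleEncodingDemo` and `printWith` are hypothetical; whether the UTF-8 line displays correctly in the end also depends on the console, which the sketch cannot control.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class ConsoleEncodingDemo {

    /** Prints text through a PrintStream with the given encoding and returns the bytes written. */
    static byte[] printWith(String text, String encoding) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(sink, true, encoding)) {
            out.print(text);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown encoding: " + encoding, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The request body from the question, written with escapes for the Cyrillic letters.
        String body = "[{\"\u0417\u0430 \u043e\u0431\u0435\u043a\u0442\u0430\"}]";

        // An encoding that cannot represent Cyrillic replaces each such character with '?'.
        String mangled = new String(printWith(body, "US-ASCII"), StandardCharsets.US_ASCII);
        System.out.println(mangled); // prints [{"?? ??????"}]

        // Wrapping System.out with an explicit UTF-8 encoding emits the correct bytes.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(body);
    }
}
```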
