HTML entities incorrectly encoding in API call

I am using the Paxful API method for sending a message to a trade:
String message = "Do NOT PRESS 'Paid' button, until your transfer get 'Success' status.";
paxfulService.sendMessage(tradeId, message);
But here is what I see in the browser:
Is this my fault, or does the Paxful API apply unnecessary HTML encoding?

The API you are using is re-encoding the single quotes as HTML character references.
In your original message string, try using the HTML entity for a single quote (such as &#39;) in place of the literal single quotes you have.
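One plausible way such output arises is HTML escaping being applied twice, once by the API and once by the page that renders the message. The sketch below only illustrates that effect; `escapeHtml` is a hypothetical helper written for this example, and nothing here is documented Paxful behavior.

```java
final class HtmlEscapeDemo {

    // Hypothetical helper: escapes the five characters that are special in HTML.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String message = "Do NOT PRESS 'Paid' button";
        String once = escapeHtml(message);   // a browser renders this as the original text
        String twice = escapeHtml(once);     // a browser renders this with visible &#39; sequences
        System.out.println(once);            // Do NOT PRESS &#39;Paid&#39; button
        System.out.println(twice);           // Do NOT PRESS &amp;#39;Paid&amp;#39; button
    }
}
```

If the API escapes the text itself, sending the plain string and letting exactly one layer do the escaping is usually the simplest arrangement.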

Related

Play Framework - receiving email through SendGrid - character encoding of email body

I am developing a small mail client in the Java Play Framework and I'm using SendGrid for the e-mails. When an e-mail is received, it gets posted to a URL, and I then parse the posted form using JsonNode. The "to", "from", and "subject" fields of that form are automatically converted by SendGrid to UTF-8, but the email message body is apparently encoded in "ISO-8859-1", and I need to convert that String to "UTF-8". I already tried several ways of doing so, but most probably I'm doing something very wrong, since I always get strange characters for French or German words containing accents or umlauts (for example, "Zürich" comes out as "Z?rich"). The code I'm using for the conversion is the following:
byte[] msg = message.getBytes("ISO-8859-1");
byte[] msg_utf8 = new String(msg, "ISO-8859-1").getBytes("UTF-8");
message = new String(msg_utf8, "UTF-8");
Could you, please, suggest a solution? Thank you very much in advance!

Ok so I managed to get the raw byte request from SendGrid using the annotation and created the java String with the correct encodings:
@BodyParser.Of(BodyParser.Raw.class)
public static Result getmail() {
...
}
Now the problem is that for retrieving the file attachments from the request I would need the request to be parsed as MultipartFormData. With the annotation above set, I get a NullPointerException when calling, which was predictable:
request().body().asMultipartFormData().getFiles()
Does anyone have an idea how I could get the same request again, but parsed with @BodyParser.Of(BodyParser.MultipartFormData.class)? I need to either combine the two annotations or find a way to convert the byte[] I get from the Raw parser to MultipartFormData. Thanks!
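For the encoding step mentioned above (building the String from the raw request bytes), a minimal sketch could look like the following. It assumes the body bytes really are ISO-8859-1; `rawBody`, `decodeLatin1`, and `toUtf8` are names made up for this example. The key point is that the charset has to be applied when the bytes are first decoded: once a String already contains "?", the original character is gone and no later conversion can bring it back.

```java
import java.nio.charset.StandardCharsets;

final class BodyDecoder {

    // Decodes raw ISO-8859-1 bytes into a Java String (Java Strings are UTF-16 internally).
    static String decodeLatin1(byte[] rawBody) {
        return new String(rawBody, StandardCharsets.ISO_8859_1);
    }

    // Encodes a String as UTF-8 bytes, for example before storing or forwarding it.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Zürich" as it would arrive in ISO-8859-1: the umlaut is the single byte 0xFC.
        byte[] rawBody = {0x5A, (byte) 0xFC, 0x72, 0x69, 0x63, 0x68};
        String body = decodeLatin1(rawBody);
        System.out.println(body.equals("Z\u00fcrich")); // true
        System.out.println(toUtf8(body).length);        // 7, because the umlaut takes two bytes in UTF-8
    }
}
```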

Set [Automatically wrap text] in Java mail

I have an account registration function. After the user enters their personal data, a confirmation email with a generated link is sent to that customer. The problem is that the link is too long, so it is broken into two lines (the second line starts at character 76), and the second line is not treated as part of the link (the user cannot click on the whole link). I think this problem may come from word wrapping or something like that.
In Outlook Express, under menu->Tools->Options->Send->HTML setting, we can set number of characters that the email content should be wrapped in each line by changing the value. Is there any way to set this function using core Java Mail?
Thank you in advance.

Word wrapping is done by the viewer (i.e. Outlook Express), not when sending email. I would guess that you are sending plain text emails and relying on the viewers to try and identify that the text contains links. Try sending HTML mail and wrapping the link in an <a> element.
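A minimal sketch of the HTML-mail suggestion, under a few assumptions: `buildHtmlBody` and `escapeHtml` are hypothetical helpers, and the wording of the mail is invented. The example only builds the body string so that it runs without the JavaMail library; with JavaMail the result would be handed to `setContent` with a `text/html` content type instead of `text/plain`.

```java
final class ConfirmationMail {

    // Escapes characters that are special inside HTML text and attribute values.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Builds an HTML body in which the whole link is one clickable anchor,
    // regardless of how the mail client wraps the visible text.
    static String buildHtmlBody(String link) {
        String safe = escapeHtml(link);
        return "<html><body><p>Please confirm your account:</p>"
                + "<p><a href=\"" + safe + "\">" + safe + "</a></p></body></html>";
    }

    public static void main(String[] args) {
        String html = buildHtmlBody("http://example.com/confirm?user=42&token=abc");
        System.out.println(html);
        // With JavaMail (not imported here): msg.setContent(html, "text/html; charset=UTF-8");
    }
}
```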

No, JavaMail is a library allowing you to send/receive email through Java. It is not an application like Outlook/Outlook Express or Thunderbird for that matter.
That said, you can write code that does the formatting before it invokes JavaMail to send the email out.

First, you can't set a setting in JavaMail to change a client's formatting.
Second, while my solution might not be the best answer to the question, it should help with the problem you are having.
Before adding your link to the body of the mail, make sure you:
Put the link on a new line. "\n" ;)
Make a little method using a URL shortening API, like bitlyj for bit.ly, to shorten the URL. Add the shortened link and voilà!
msg.setContent("This is an example of adding a shortened URL\n"
+ shortLink("http://www.longlink.com")
+ "\n", "text/plain");
public String shortLink(String link) {
Url url = as("Username", "APIKey").call(shorten(link));
return url.getShortUrl();
}
Using this approach you shouldn't have any issues with word wrap stuff.

How to send parameters with same encoding from javascript?

I have a JavaScript file that lots of people have embedded in their pages. Since I am hosting the file, I have control over its contents, but I cannot control the way it is embedded because lots of people are using it already.
This javascript file sends GET requests to my servlets, and the parameters passed with the request are recorded to DB. For example, javascript sends a request to http://myserver.com/servlet?p1=123&p2=aString and then servlet records 123 and aString to DB somehow.
Before sending strings I use encodeURIComponent() to encode it. But what I figured out is every client sends the same string with different encodings depending on either their browser or the site they are visiting. As a result, same strings are represented with different characters when it reaches servlet (so they are different strings).
What I am trying to do is convert the strings to a single encoding in JavaScript, so that when they reach the server the same words are represented with the same characters.
How is this possible?
PS. If there is a way to convert the encoding from Java it is also applicable.
Edit: To be more precise, I select some words from the page and send it to the server. That is where encoding causes problems.
Edit 2: I am NOT sending (and can't send) GET requests via XMLHttpRequest, because the domains are different. I am using the method of adding a script tag to the head that @streetpc mentioned.
Edit 3: At the moment I am sanitizing the strings by replacing non-ASCII characters at javascript side, but I have a feeling that this is not the way to go:
function sanitize(word) {
/*
ğ : \u011f
ü : \u00fc
ş : \u015f
ö : \u00f6
ç : \u00e7
ı : \u0131
û : \u00fb
*/
return encodeURIComponent(
word.replace(/\u011f/g, '_g')
.replace(/\u00fc/g, '_u')
.replace(/\u00fb/g, '_u')
.replace(/\u015f/g, '_s')
.replace(/\u00f6/g, '_o')
.replace(/\u00e7/g, '_c')
.replace(/\u0131/g, '_i'));
}

what I figured out is every client sends the same string with different encodings
Whilst that would be normal for <form> submissions, it should not happen for XMLHttpRequest work. The encodeURIComponent function explicitly always writes URL-encoded UTF-8 bytes, regardless of the encoding of the page from which it was used. Of course persuading your servlet container to allow you to read those UTF-8 bytes without messing them up is another story, but that shouldn't depend on the client.
What might be a problem is if you are using raw non-ASCII characters inside your script file itself. In that case the interpretation of those characters will vary according to the charset the browser is using to load the script. This may be affected by:
1. any charset declared in the Content-Type: text/javascript;charset= header.
2. any charset attribute declared on the <script src="..." charset="..."> element.
3. the charset of the page that included the script.
(1) and (2) are not supported in all browsers. Normally you can rely on (3), but as a third-party script author that is out of your control. Therefore you should use only ASCII characters in your script. (Use \u1234 escapes to include non-ASCII characters in string literals in your script to get around this limitation.)
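On the server side, the percent-encoded UTF-8 that encodeURIComponent produces can be decoded explicitly. The sketch below assumes the raw, still-encoded parameter value is available (for instance taken from the query string); `ParamDecoder` and `decodeUtf8` are names invented for this example, and the sample value is what encodeURIComponent yields for the letters ğ, ü, ş from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class ParamDecoder {

    // Decodes a percent-encoded value, interpreting the bytes as UTF-8.
    static String decodeUtf8(String rawValue) {
        try {
            return URLDecoder.decode(rawValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        String decoded = decodeUtf8("%C4%9F%C3%BC%C5%9F");
        System.out.println(decoded.equals("\u011f\u00fc\u015f")); // true
    }
}
```

If the same word still arrives as different characters after decoding like this, the difference was most likely introduced before encodeURIComponent ran, for example by the script file itself being read in the wrong charset.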

Do you specify the encoding of the JavaScript file in the HTTP headers? Like Content-type: text/javascript; charset=utf-8, with the .js file being saved in UTF-8 of course. With Apache, you can configure
AddCharset utf-8 .js
Or you can make the hosted JavaScript file create another script tag with a charset='utf-8' attribute and add it to the head element (like most bookmarklets do).
I think the javascript being interpreted as UTF-8 code should then get/manipulate UTF-8 strings.
Then, in your Java Servlet, you can specify the input encoding to use:
request.setCharacterEncoding("UTF-8");
Edit: check this page about Character Encoding in JavaScript, especially the part named "Setting the Character Encoding".

Java string encoding conversion within a webpage

I have a webpage that is encoded (through its header) as WIN-1255.
A Java program creates text strings that are automatically embedded in the page. The problem is that the original strings are encoded in UTF-8, thus creating a gibberish text field in the page.
Unfortunately, I cannot change the page encoding - it's required by a customer's proprietary system.
Any ideas?
UPDATE:
The page I'm creating is an RSS feed that needs to be set to WIN-1255, showing information taken from another feed that is encoded in UTF-8.
SECOND UPDATE:
Thanks for all the responses. I managed to convert the string and still got gibberish. The problem was that the XML encoding has to be set in addition to the header encoding.
Adam

To the point, you need to set the encoding of the response writer. With only a response header you're basically only instructing the client application which encoding to use to interpret/display the page. This ain't going to work if the response itself is written with a different encoding.
The context where you have this problem is entirely unclear (please elaborate about it as well in future problems like this), so here are several solutions:
If it is JSP, you need to set the following in top of JSP to set the response encoding:
<%@ page pageEncoding="WIN-1255" %>
If it is Servlet, you need to set the following before any first flush to set the response encoding:
response.setCharacterEncoding("WIN-1255");
Both by the way automagically implicitly set the Content-Type response header with a charset parameter to instruct the client to use the same encoding to interpret/display the page. Also see this article for more information.
If it is a homegrown application which relies on the basic java.net and/or java.io API's, then you need to write the characters through an OutputStreamWriter which is constructed using the constructor taking 2 arguments wherein you can specify the encoding:
Writer writer = new OutputStreamWriter(someOutputStream, "WIN-1255");
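To see why the header alone is not enough, it can help to compare what a client reads when the bytes were written in one charset and interpreted in another. This is a small sketch, assuming the JDK in use ships the windows-1255 charset (standard JDK builds do, but it is not one of the guaranteed charsets); `roundTrip` is a helper invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {

    static final Charset WIN_1255 = Charset.forName("windows-1255");

    // Encodes the text with one charset and decodes the bytes with another,
    // which is what happens when the writer and the declared header disagree.
    static String roundTrip(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        String shalom = "\u05e9\u05dc\u05d5\u05dd"; // a four-letter Hebrew word

        // Written as UTF-8 but declared as windows-1255: the client sees gibberish.
        System.out.println(roundTrip(shalom, StandardCharsets.UTF_8, WIN_1255).equals(shalom)); // false

        // Written and declared as windows-1255: the text survives.
        System.out.println(roundTrip(shalom, WIN_1255, WIN_1255).equals(shalom)); // true
    }
}
```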

Assuming you have control of the original (properly represented) strings, and simply need to output them in win-1255:
import java.nio.charset.*;
import java.nio.*;
Charset win1255 = Charset.forName("windows-1255");
ByteBuffer bb = win1255.encode(someString);
byte[] ba = new byte[bb.limit()];
bb.get(ba); // copy the encoded bytes out of the buffer into ba
Then, simply write the contents of ba at the appropriate place.
EDIT: What you do with ba depends on your environment. For instance, if you're using servlets, you might do:
ServletOutputStream os = ...
os.write(ba);
We also should not overlook the possible approach of calling setContentType("text/html; charset=windows-1255"), then using getWriter normally. You did not make completely clear whether windows-1255 was being set in a meta tag or in the HTTP response header.
You clarified that you have a UTF-8 file that you need to decode. If you're not already decoding the UTF-8 strings properly, this should be no big deal. Just look at InputStreamReader(someInputStream, Charset.forName("utf-8"))

What's embedding the data in the page? Either it should read it as text (in UTF-8) and then write it out again in the web page's encoding (Win-1255) or you should change the Java program to create the files (or whatever) in Win-1255 to start with.
If you can give more details about how the system works (what's generating the web page? How does it interact with the Java program?) then it will make things a lot clearer.

The page I'm creating is an RSS feed that needs to be set to WIN-1255, showing information taken from another feed that is encoded in UTF-8.
In this case, use a parser to load the UTF-8 XML. This should correctly decode the data to UTF-16 character data (Java Strings are always UTF-16). Your output mechanism should encode from UTF-16 to Windows-1255.
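The decode-then-encode step described here can be sketched with plain readers and writers. This is a simplified illustration rather than a full feed converter: it does not parse the XML, so the encoding named in the XML declaration still has to be changed separately, as the question's second update points out. `FeedTranscoder` and `transcode` are names invented for this example, and the JDK is assumed to provide the windows-1255 charset.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FeedTranscoder {

    // Reads UTF-8 bytes, decodes them to Java chars, and writes them out as windows-1255.
    // Characters that windows-1255 cannot represent are replaced by the encoder (typically with '?').
    static void transcode(InputStream utf8In, OutputStream win1255Out) throws IOException {
        Reader reader = new InputStreamReader(utf8In, StandardCharsets.UTF_8);
        Writer writer = new OutputStreamWriter(win1255Out, Charset.forName("windows-1255"));
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, n);
        }
        writer.flush(); // push any buffered characters through to the underlying stream
    }
}
```

The streams are left open so that the caller, for example a servlet that owns the response stream, decides when to close them.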

byte[] originalUtf8; // input goes here
// UTF-8 bytes to Java String:
String internal = new String(originalUtf8, Charset.forName("utf-8"));
// Java String to windows-1255 bytes:
byte[] win1255 = internal.getBytes(Charset.forName("cp1255"));
// output goes here

why is '<' showing as &lt;

I am outputting a string from my Java class like this:
String numQsAdded = "<div id='message1'>"+getQuestion()+"</div>";
This string is being sent back to the client side in response to an XMLHttpRequest. So, in my JSP page I have a JavaScript alert method that prints out the string returned from the server. It translates '<' to &lt; and '>' to &gt;
How can I avoid this?
I have tried changing my string to:
String numQsAdded = "&lt;div id='message1'&gt;"+getQuestion()+"&gt;/div&lt;";
but this has even worse effects: then '&' is translated as 'amp'.

XMLHttpRequest encodes the string before sending it. You will have to unescape the string.
on the client side javascript, try using:
alert(unescape(returned_string))

&lt; is the way to show "<" in HTML, which is produced from XMLHttpRequest. try using XMLRequest

&lt; is the entity reference for "<", while &gt; is the entity reference for ">". You will need to unescape the string using the unescape() method.

Paul Fisher's answer is the right one. I'll take a moment to explain why. HTML-Encoding of content from the server is a security measure to protect your users from script injection attacks. If you simply unescape() what comes from the server you could be putting your users at risk, as well as your site's reputation.
Try doing what Paul said. It's not difficult and it's much more secure. Just to make it easier, here's a sample:
var divStuff = document.createElement('div');
divStuff.id = 'message1';
divStuff.innerHTML = getQuestion();
// Attach the new div to the existing container, not the other way around.
containerElement.appendChild(divStuff);
This is much more secure and draws a better separation for your presentation layer in your application.

It might be better to send back a raw string with your message, and leave the client Javascript to create a div with class message1 to put it in. This will also help if you ever decide to change the layout or the style of your notices.

I don't think you can avoid that. It's how "<" is represented in HTML, and the result would be OK on your HTML page.
