Java code unexpectedly appending "&#13" to line endings - java

Our code is unexpectedly appending &#13 to the end of the lines created by the following routine:
public String getNotation(ClientMessage TransactionMessage) {
StringBuffer sb = new StringBuffer();
String lineSeparator = System.getProperty("line.separator");
String osName = System.getProperty("os.name").toLowerCase();
sb.append(getNotationTitle(TransactionMessage));
sb.append(lineSeparator);
sb.append(lineSeparator);
The "line.separator" seems to be getting translated to the string &#13 only when the code is run on a Windows Server 2008 box. It runs fine when we run the same on Windows 7 or UNIX.
Has anyone encountered this issue, and if so, is there a logical explanation and a way to correct it?

HTTP (and other textual internet protocols) mandate the use of ASCII CR+LF for line break sequences - CR being the "carriage return" character (\r) and LF being the "line feed" character (\n).
This escape sequence - \r\n - is also the line separator used on Windows systems, and thus is what gets returned by your call to System.getProperty("line.separator") and then gets appended by your call to sb.append(lineSeparator) to the output string. This is happening both in your tests and when "the code is executed on the actual server" - in both instances (I presume), the code is being executed on your Windows server, and thus the same string is generated.
This sequence is not being translated to &#13, as you suggest. If it was, then your entire output would appear on a single line, with &#13 inserted where newlines are expected. However, it doesn't sound like that's the case - it sounds like you're getting the line breaks where you expect them but with an unexpected &#13 at the end of each line.
This makes sense when we recognize that a lone \n is sufficient to represent a line break in most programming languages and environments, and that 13 is the decimal representation of the carriage return character.
I presume that your tests are displaying the strings generated in raw string form (perhaps simply by a call to println(sb.toString())), in which case the \r\n is being interpreted and displayed as you expect it to be.
I also presume that your TransactionMessage class is transmitting messages not as raw strings but rather as HTML, because &#13 would be the HTML entity code for the decimal representation of the carriage return character.
I can't tell you exactly why (at least without knowing more about your particular situation), but for some reason the CR character is being converted to its decimal character reference, and whatever displays the resulting string on the client doesn't decode that reference back into a control character. It therefore shows the literal &#13 immediately before the \n, which is still interpreted as a line break.
(unrelated side note: Since Java 7, you can use System.lineSeparator() in place of System.getProperty("line.separator"))
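To make the reasoning above concrete, here is a minimal sketch. The class name and the helpers codes, escapeCarriageReturns and normalizeToLf are made up for illustration; in particular, escapeCarriageReturns only stands in for whatever layer in the asker's stack turns CR into a character reference, which the question does not show. It writes the reference with a trailing semicolon (&#13;), the full form of what the question shows as &#13.

```java
public class LineSeparatorDemo {
    // Renders each char of s as its decimal code, e.g. "\r\n" -> "13 10".
    static String codes(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) s.charAt(i));
        }
        return sb.toString();
    }

    // Hypothetical stand-in for an HTML/XML layer that escapes CR as a
    // numeric character reference. NOT the asker's actual code.
    static String escapeCarriageReturns(String s) {
        return s.replace("\r", "&#13;");
    }

    // Normalizes CR LF and lone CR to a bare LF before the text is handed
    // to such a layer, so no CR is left to be escaped.
    static String normalizeToLf(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // Shows which characters the platform line separator consists of.
        System.out.println("line.separator codes: " + codes(System.lineSeparator()));

        String windowsText = "Title\r\n\r\nBody";
        // The LF still breaks the line; the CR shows up as literal text.
        System.out.println(escapeCarriageReturns(windowsText));
        // After normalization only LF (10) remains between the words.
        System.out.println(codes(normalizeToLf(windowsText)));
    }
}
```

If the output is destined for HTML or XML anyway, appending "\n" explicitly instead of the platform separator avoids the stray CR in the first place.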

Related

Escape character '\' doesn't show in System.out.println() but in return value

In Java, when I replace characters in a String with escaped-characters, the characters show up in the return value, although they were not there according to System.out.println.
String[][][] proCategorization(String[] pros, String[][] preferences) {
String str = "wehnquflkwe,wefwefw,wefwefw,wefwef";
String strReplaced = str.replace(",","\",\""); //replace , with ","
System.out.println(strReplaced);
The console output is: wehnquflkwe","wefwefw","wefwefw","wefwef
String[][][] array3d = new String[1][1][1]; // initialize 3d array
array3d[0][0][0] = strReplaced;
System.out.println(array3d[0][0][0]);
return array3d;
}
The console output is:
wehnquflkwe","wefwefw","wefwefw","wefwef
Now the return value is:
[[["wehnquflkwe\",\"wefwefw\",\"wefwefw\",\"wefwef"]]]
I don't understand why the \ show up in the return value but not in the System.out.println.
Characters in memory can be represented in different ways.
Your integrated development environment (IDE) has a debugger that chooses to represent a String[][][] with a single element that contains the characters
wehnquflkwe","wefwefw","wefwefw","wefwef
as a java-quoted string
"wehnquflkwe\",\"wefwefw\",\"wefwefw\",\"wefwef"
this makes a lot of sense, because you can then copy and paste this string into java code without any loss.
On the other hand, your system's console, and the IDE's built-in terminal emulator, will output the characters in their normal representation, that is, without any java string-escape-characters:
wehnquflkwe","wefwefw","wefwefw","wefwef
As an experiment, you may want to check what happens with other "special" characters, such as \t (a tab) or \b (backspace). This is just the tip of the iceberg - characters in Java generally translate into Unicode code points, which may or may not be supported by the fonts available in your system or terminal. The IDE's way of representing characters as java-quoted strings allows it to losslessly represent pretty much anything; System.out.println's output is a lot more variable.
System.out.println prints the String exactly as it is stored in memory.
On the other hand, when you stop the application flow using a breakpoint you are able to look up the values.
Most IDEs display special characters in escaped form, with a \, to make clear that the value is a single String (not a String[] in this case), and to avoid splitting the String across two lines if it contains \n in the middle.
In case you still have doubts, I suggest printing strReplaced.length(). This lets you check the character count one by one.
Possible experiments:
String s = "my cute \n two line String";
System.out.println(s + " length is: " + s.length());
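The two views described in these answers can be compared side by side. The sketch below is only an approximation of what a debugger does: toJavaLiteral is a hypothetical helper written for this illustration, not an IDE or JDK API, and it handles just a handful of escapes.

```java
public class QuotedViewDemo {
    // Roughly mimics how a debugger shows a String as a Java literal:
    // wraps it in quotes and escapes a few special characters.
    static String toJavaLiteral(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        String replaced = "a,b".replace(",", "\",\"");
        System.out.println(replaced);                // raw characters, no backslashes
        System.out.println(replaced.length());       // counts the characters actually stored
        System.out.println(toJavaLiteral(replaced)); // quoted view with backslashes
    }
}
```

The backslashes exist only in the quoted view; the length printed in the middle counts the stored characters and does not include them.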

Safe sending String argument to JavaScript function from Java

My Java project is based on the WebView component.
Now I want to call a JS function with a single String argument.
To do this, I'm using this simple code:
webEngine.executeScript("myFunc('" + str + "');");
*The str text comes from a textarea.
This solution works, but it is not safe enough.
Sometimes we get netscape.javascript.JSException: SyntaxError: Unexpected EOF
So, how should str be handled to avoid the exception?
Letfar's answer will work in most cases, but not all, and if you're doing this for security reasons, it's not sufficient. First, backslashes need to be escaped as well. Second, the line.separator property is the server side's EOL, which will only coincidentally be the same as the client side's, and you're already escaping the two possibilities, so the second line isn't necessary.
That all being said, there's no guarantee that some other control or non-ASCII character won't give some browser problems (for example, see the current Chrome nul in a URL bug), and browsers that don't recognize JavaScript (think things like screenreaders and other accessibility tools) might try to interpret HTML special characters as well, so I normally escape [^ -~] and [\\'"&<>] (those are regular expression character classes meaning all characters not between space and tilde inclusive; and backslash, single quote, double quote, ampersand, less than, greater than). Paranoid? A bit, but if str is a user-entered string (or is calculated from a user-entered string), you need to be a bit paranoid to avoid a security vulnerability.
Of course the real answer is to use some open source package to do the escaping, written by someone who knows security, or to use a framework that does it for you.
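For illustration only, the escaping rule described above could be sketched as follows. The class and method names are hypothetical, and choosing \uXXXX as the escape form is an assumption of this sketch (the answer does not say which form it uses). It is not a substitute for a vetted library.

```java
public class JsStringEscaper {
    // Escapes s for use inside a quoted JavaScript string literal.
    // Everything outside printable ASCII (space..tilde), plus
    // \ ' " & < >, is written as a backslash-u escape with four hex digits.
    static String escapeForJs(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean printableAscii = c >= ' ' && c <= '~';
            boolean special = c == '\\' || c == '\'' || c == '"'
                    || c == '&' || c == '<' || c == '>';
            if (printableAscii && !special) {
                sb.append(c);
            } else {
                // Java chars are UTF-16 code units, which is also what a
                // JavaScript backslash-u escape denotes, so surrogate pairs
                // survive as two consecutive escapes.
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "it's a \"test\"\r\nsecond line \\ done";
        // The resulting script text contains no raw quotes, backslashes
        // or line breaks coming from str.
        System.out.println("myFunc('" + escapeForJs(str) + "');");
    }
}
```

The escaped value would then be passed as webEngine.executeScript("myFunc('" + escapeForJs(str) + "');"). Because line breaks are escaped too, the unterminated string literal behind the "Unexpected EOF" error should no longer occur.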
I have found this quick fix:
str = str.replace("'", "\\'");
str = str.replace(System.getProperty("line.separator"), "\\n");
str = str.replace("\n", "\\n");
str = str.replace("\r", "\\n");

Are escape characters in Java platform-dependent?

I just read this question about comparing "%n" and "\n"
What's up with Java's "%n" in printf?
The answer confirms that %n can be used across platforms, while \n cannot. So I wonder about the other escape characters, such as \t, \b, \', \", \\ .... Are they all platform-dependent just like \n?
The String escape codes mean the same thing on all platforms. They map to specified Unicode codepoints that in turn correspond to standard 7-bit ASCII control characters.
The only (theoretical) concern might be some native character set which didn't have a way of representing the equivalent of those codepoints / characters. I'm pretty sure you'd be OK on ancient 6-bit and 5-bit character sets from 50+ years ago.
However, if you are trying to output text in the platform preferred form, you do need to consider two things:
Different platforms use different character sequences as the preferred way to designate an "end of line". (Or line separator ...)
The default TAB stop positions are not defined by Java and vary between terminals and editors. Consoles commonly default to every 8 character positions, while many editors use 4.
So when you format data for fixed-width character display (e.g. on a "console"), you need to consider these platform dependencies.
There is also some uncertainty / variability about what will "happen" when you send those characters to a display, or include them in a file. But that's not really Java's fault, or anything that Java could address.
By contrast, "%n" ... in the context of a format string ... means the platform preferred line separator. So, on Linux/UNIX (including modern macOS) it means "\n", on Windows it means "\r\n", and on classic Mac OS it meant "\r". Note that this ONLY applies to format Strings; i.e. the first argument to String.format(...), or something else that does that style of formatting.
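A small sketch of the distinction between the fixed escape codes and %n. The class and method names are made up for this example; only String.format and System.lineSeparator() are real JDK calls.

```java
public class PercentNDemo {
    // Uses %n, which the formatter expands to the platform line separator.
    static String viaFormat() {
        return String.format("a%nb");
    }

    // Uses the \n escape, which is always the single character LF (code 10).
    static String viaEscape() {
        return "a\nb";
    }

    public static void main(String[] args) {
        String sep = System.lineSeparator();
        // Always true: %n and System.lineSeparator() agree by definition.
        System.out.println(viaFormat().equals("a" + sep + "b"));
        // True only on platforms whose line separator is a bare LF.
        System.out.println(viaFormat().equals(viaEscape()));
        // The escape codes themselves are identical everywhere.
        System.out.println((int) '\t' + " " + (int) '\n' + " " + (int) '\r');
    }
}
```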
\t \' \" and \\ will most likely act in the same way across all platforms as they represent real ASCII characters and there are not many platforms left that do not implement the full ASCII character set.
\b - well that's a different matter. That will almost certainly not do the same thing across any platforms as it is supposed to implement the BEL control code which, in itself, is not platform generic.
What were you hoping to get from your ... in the question?
Added: It seems \b is backspace - still unlikely to be cross-platform though.
Added: And as for \f - just don't use it as it will probably only ever do something that stops working when you replace your printer - if it ever actually does something at all.
Some platforms use \r\n as a new line, some other \n. Using %n will ensure the right new line emitted in the output.
That has nothing to do with the backslash character preceding characters to designate special characters like the ones you mentioned. Feel free to use it in your source code.

Is there "oldline" character in Java?

Is there an opposite of the newline '\n' character in Java that moves back to the previous line in the console?
ASCII doesn't standardize a "line starve" or reverse line feed control character. Some character based terminals/terminal emulators recognize control code sequences that move the cursor up a line; these aren't Java-specific, and depend on your OS and configuration. Here's a starting point if you're using Linux: http://www.kernel.org/doc/man-pages/online/pages/man4/console_codes.4.html
Java supports Unicode, which has the character "REVERSE LINE FEED" (U+008D). In Java it would be '\u008D' (as a char) or "\u008D" (as a String). Whether this would do what you want on a console, printout, or whatever, depends on the device. Java does not define any behavior for that character.
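As a rough sketch of the control-sequence approach mentioned above: many terminal emulators understand the ANSI/ECMA-48 sequences "ESC [ n A" (cursor up n lines) and "ESC [ 2 K" (erase the current line). Whether they have any effect depends entirely on the terminal; an IDE console or a redirected file will typically just show or store the raw characters. The class and helper names are made up for this example.

```java
public class CursorUpDemo {
    static final String ESC = "\u001b";

    // ANSI "cursor up" sequence: ESC [ <lines> A
    static String cursorUp(int lines) {
        return ESC + "[" + lines + "A";
    }

    // ANSI "erase entire line" sequence: ESC [ 2 K
    static final String ERASE_LINE = ESC + "[2K";

    public static void main(String[] args) {
        System.out.println("first line");
        System.out.println("second line");
        // Move up onto "second line", clear it, return to column 0,
        // then print the replacement text in its place.
        System.out.print(cursorUp(1) + ERASE_LINE + "\r");
        System.out.println("second line, rewritten");
    }
}
```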

Where's definitive reference on string formatting

Does anyone know of a good online resource that simply and definitively explains how to use the string formatter method...?
I need to write a series of "records" into a set of ASCII text files. I need to "delimit" each "record" with a CR-LF sequence in a Windows 2008 Server environment.
Therefore I'm trying to figure out how to add a \r\n character string at the end of each "record". I tried a "record_string.append(CR) and LF"; but it didn't work.
Thanks much
Guy
The documentation on the Formatter class appears to be comprehensive.
It has this to say about line separators:
Line Separator
The conversion does not correspond to any argument.
'n' - the platform-specific line separator as returned by System.getProperty("line.separator").
Flags, width, and precision are not applicable. If any are provided an IllegalFormatFlagsException, IllegalFormatWidthException, and IllegalFormatPrecisionException, respectively will be thrown.
If you specifically need to add CR LF to the end of each record (carriage return, linefeed), then you can just use exactly \r\n. The \r translates to a carriage return, and \n to linefeed. For example:
StringBuilder sb = new StringBuilder();
sb.append("some data");
// ...
sb.append("\r\n"); // add CR LF record separator
You can find the exact list of escape sequences that exist in Java in section 3.10.6 of the Java Language Specification.
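Putting the explicit-separator approach together for several records might look like the sketch below. The class name, the CRLF constant and toCrLfRecords are illustrative names, not part of any library.

```java
public class CrLfRecordWriter {
    // Explicit CR LF, independent of the platform the JVM runs on.
    static final String CRLF = "\r\n";

    // Joins the records into one String, terminating each with CR LF.
    static String toCrLfRecords(String... records) {
        StringBuilder sb = new StringBuilder();
        for (String record : records) {
            sb.append(record).append(CRLF);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String out = toCrLfRecords("rec1", "rec2");
        // Print the character codes so the separators are visible.
        for (char c : out.toCharArray()) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```

The resulting String can be handed to any Writer. Because the separator is spelled out, the files get CR LF even if the program is later run on a non-Windows machine.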
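Just do:
record_string = record_string + "\r\n";
Note that in Java "\n" is always a bare LF, even on Windows, so it has to be "\r\n" if you need CR-LF.
Or you can wrap your FileWriter in a BufferedWriter and call newLine() after write(record); that writes the platform line separator, which is CR-LF on Windows.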
