"\n" delimiters issue

"\n" delimiters issue - java

I have a StringBuilder object that a line of data gets added to.
After each line is added, I append a "\n" to the end to indicate a new line.
Once finished, the contents of the StringBuilder get written to a flat file.
When I open the flat file in Notepad, I get a small rectangle after every line and the column formatting is ruined.
When I open the flat file in WordPad, the new lines are honoured and the column formatting is perfect.
I have tried every way I know of removing the newline before it gets written, but that removes the formatting from the flat file. I need the new lines for the column formatting.
How can I output the file with new lines but without using \n?

The Windows way of terminating a line is to use "\r\n", not just "\n".
You can find the "line separator for the current operating system" using the line.separator system property:
String lineSeparator = System.getProperty("line.separator");
...
StringBuilder builder = new StringBuilder();
...
builder.append(lineSeparator);
...
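To make the fragment above concrete, here is a minimal sketch of building the rows and writing them out. The class name FlatFileWriter, the helper joinRows and the sample rows are made up for illustration; the asker's real data and file path are not shown in the question, so a temporary file is used instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not taken from the question.
class FlatFileWriter {

    // Appends the given separator after every row, including the last one.
    static String joinRows(List<String> rows, String lineSeparator) {
        StringBuilder builder = new StringBuilder();
        for (String row : rows) {
            builder.append(row).append(lineSeparator);
        }
        return builder.toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample rows standing in for the real column data.
        List<String> rows = Arrays.asList("ID   NAME", "1    alice", "2    bob");

        // Uses the separator of the OS the program runs on.
        // Pass "\r\n" instead if the file must always open cleanly in Notepad.
        String content = joinRows(rows, System.lineSeparator());

        Path out = Files.createTempFile("flatfile", ".txt");
        Files.write(out, content.getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out);
    }
}
```

Passing the separator in as a parameter keeps the choice between "platform default" and "always CRLF" in one place.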

You can get the line separator for the system your Java program is running on from the system properties:
public static String newline = System.getProperty("line.separator");

You should append System.getProperty("line.separator") instead of \n. Notepad expects \r\n, which is the line ending on MS Windows.

On Windows you should use \r\n. On *NIX (Linux/UNIX/macOS) you should use \n.

If you're using Windows, you should be writing \r\n to get the file to load properly in Notepad. The \n terminator is the Unix line ending, and Notepad won't parse it properly. WordPad will convert the line endings for you.
Also, I suggest not using Notepad and looking at something like Vim instead.

Related

Java: PrintWriter and newlines in a string

My question is pretty straightforward. If I have a single long string with a lot of "\n" newlines within it, i.e.:
String strings = "Hey\nThere\nFriend\n";
And use a PrintWriter in Java to do the following:
PrintWriter save = new PrintWriter("test.txt");
save.println(strings);
save.close();
Will the file I end up with be formatted according to the \n? i.e. the file will have:
Hey
There
Friend
Or will it have:
Hey\nThere\nFriend
If it's the latter, can someone guide me on how I might change my code (and understanding of how PrintWriter works) to create the former output?

In fact, \n will work, but only on Unix-based OSes. Windows-based OSes use \r\n as the separator.
You should avoid hard-coding an OS-specific line separator if you want to write portable code.
Prefer System.lineSeparator() so that you are not OS-dependent.
Note also that PrintWriter provides println() to produce a line break that is not OS-dependent (even if it is not necessarily useful for your use case).
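One way to apply this advice to the string from the question is to split it on "\n" and let println() emit the platform separator after each piece. This is only a sketch: the class PrintlnPerLine and the method writeLines are invented names, and a StringWriter stands in for the file so the result can be inspected in memory.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of the original question or answers.
class PrintlnPerLine {

    // Writes each "\n"-separated piece of text followed by the platform line separator.
    static void writeLines(String text, Writer target) {
        PrintWriter out = new PrintWriter(target);
        // String.split drops trailing empty strings, so a final "\n" adds no blank line.
        for (String line : text.split("\n")) {
            out.println(line);
        }
        out.flush();
    }

    public static void main(String[] args) {
        StringWriter buffer = new StringWriter();
        writeLines("Hey\nThere\nFriend\n", buffer);
        System.out.print(buffer);
    }
}
```

To write to a real file, the same method can be given a FileWriter or an OutputStreamWriter instead of the StringWriter.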

You will get a text file containing a single text line Hey\nThere\nFriend\n followed by your operating system new-line sequence (inserted by println()).
The meaning of \n depends on the operating system and possibly the text editor. On Linux, \n is usually interpreted as the new-line sequence, but on Windows the new-line sequence is \r\n, so most text editors (e.g. the native Notepad) will display a single HeyThereFriend line.

On the Windows platform a new line is char(13) + char(10), i.e. \r\n (\n on its own is only char(10)). You can use:
String nl = Character.toString ((char) 13)+Character.toString ((char) 10);
String strings = "Hey"+nl+"There"+nl+"Friend"+nl;
System.out.print(strings);

Unable to read any of file that contains specific character(s)

TL;DR
Why does reading a file that contains – (saved from Notepad) not find any data?
Problem:
Up to this point, I have been using just plain ol' Notepad (Version 6.1) to read/write text for testing/answering questions here.
Simple bit of code to read in the text files contents, and print them to the console:
Scanner sc = new Scanner(new File("myfile.txt"));
while (sc.hasNextLine()) {
    String text = sc.nextLine();
    System.out.println(text);
}
All is well, the lines print as expected.
Then, if I put in this exact character: –, anywhere in the text file, it will not read any of the file, and print nothing to the console.
I can of course use Notepad++ or other (better) text editors, and there is no issue, the text, including the dash character, will print as expected.
I can also specify UTF-8, using Notepad, and it will work fine:
File fileDir = new File("myfile.txt");
BufferedReader in = new BufferedReader(
        new InputStreamReader(
                new FileInputStream(fileDir), "UTF8"));
String str;
while ((str = in.readLine()) != null) {
    System.out.println(str);
}
On my original Notepad file, if I copy and paste the text (including the –) into Notepad++ and compare the two files with WinMerge, it tells me that the dash on Notepad is –, but on Notepad++, it is â€“.
Question:
Why, when this – is used in a text file saved from Notepad, does it read nothing, basically telling me that hasNextLine() is false? Should it not at least read the input up to the line that contains this specific character?
Steps to reproduce:
On Windows 7, right-click and create new Text Document.
Put any text in the file (without any special characters, as such)
Put in this character anywhere in the file: –
Run the first block of code above
Output: BUILD SUCCESSFUL (total time: 1 second), i.e. doesn't print any of the text.
PS:
I know I asked a similar (well, it ended up being the same) question yesterday, but unfortunately, it seems I may not have explained myself well, or some of the viewers didn't fully read the question. Either way, I think I've explained it better here.

The issue seems to be a difference in encoding. You have to read the file in the same encoding it was written in.
Your system Notepad probably uses the Windows-1252 (Cp1252) encoding. This encoding is known to cause problems with the characters in the range 128-159, and the dash lies in this range. In the otherwise equivalent ISO 8859-1, this range holds no printable characters (only control codes); the printable characters there, including the dash, exist only in Cp1252.
Eclipse, when reading the Notepad file, assumes the file is encoded in ISO-8859-1 (as it is nearly equivalent). But this character is not present in ISO-8859-1, hence the problem. If you want to read the file from Java, you will have to specify Cp1252, and you should get your output.
This is also the reason why your code with UTF-8 works correctly when the file is saved from Notepad as UTF-8.
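The effect of choosing one charset or another can be seen without any file at all. The sketch below assumes, as this answer does, that Notepad stored the dash as the single Cp1252 byte 0x96, and that the running JDK ships the windows-1252 charset (common, but not one of the charsets every JVM is required to support). The class name DashDecoding is invented for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; the byte value 0x96 is an assumption about the file's contents.
class DashDecoding {

    // Decodes raw bytes using the named charset.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] fromNotepad = { 'a', (byte) 0x96, 'b' };

        // Cp1252 maps 0x96 to U+2013 (en dash).
        System.out.println(decode(fromNotepad, "windows-1252"));
        // ISO-8859-1 maps 0x96 to the invisible control character U+0096.
        System.out.println(decode(fromNotepad, "ISO-8859-1"));
        // In UTF-8 a lone 0x96 is malformed; new String(...) substitutes U+FFFD.
        System.out.println(decode(fromNotepad, "UTF-8"));
    }
}
```

For a real file, the same charset name can be passed to the InputStreamReader in place of "UTF8" in the question's second code block.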

A buffered reader reads more than the current line, maybe the text up to the problematic bytes. CharsetDecoder.onMalformedInput then comes into play, and something restrictive happens there, which I would not normally have expected.
Do you use a special JDK? Do you sweep exceptions under the carpet, for example with a lambda wrapping the above code? (Add a catch for Throwable.)
Is your platform encoding -Dfile.encoding=ISO-8859-1 instead of Cp1252?
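The decoder behaviour mentioned here can be reproduced in isolation. The following sketch configures a CharsetDecoder to report bad input instead of replacing it; StrictDecode and decodesCleanly are made-up names, and the byte 0x96 is again only an assumption about what Notepad wrote. Scanner is documented to treat an IOException from its source as end of input and to expose it through ioException(), so printing sc.ioException() after the loop in the question is a cheap way to check whether this is what happened.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original answer.
class StrictDecode {

    // Returns true if every byte sequence is valid in the given charset.
    static boolean decodesCleanly(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown for malformed or unmappable input when the action is REPORT.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] suspect = { 'a', (byte) 0x96, 'b' };
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("Valid UTF-8? " + decodesCleanly(suspect, StandardCharsets.UTF_8));
        System.out.println("Valid ISO-8859-1? " + decodesCleanly(suspect, StandardCharsets.ISO_8859_1));
    }
}
```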

Split text file by line, platform-independently

I want to split a text file by line. On Windows that would be text = new String(Files.readAllBytes(path), charset); text.split("\r\n", -1), on UNIX it's text.split("\n", -1), and text.split(System.lineSeparator(), -1) works on both. But what if a file is created on UNIX and copied to Windows, or vice versa? How do I best handle those cases? And what would that mean for the file itself: would it look broken if you tried to view it in a text editor like Notepad?

Try Files.readAllLines. Alternatively, use Files.lines, which returns a Stream of lines.
From the javadoc of readAllLines:
This method recognizes the following as line terminators:
\u000D followed by \u000A, CARRIAGE RETURN followed by LINE FEED
\u000A, LINE FEED
\u000D, CARRIAGE RETURN
Copying from one file system to the other doesn't change the content of the file (unless you are doing some "special" copying ;-) ).
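A quick way to convince yourself of the terminator handling quoted above is to write a file that mixes all three conventions and read it back. This is a sketch only; MixedLineEndings and roundTrip are invented names, and a temporary file is used so nothing real is touched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical demo class, not from the original answer.
class MixedLineEndings {

    // Writes the content to a temp file as-is and reads it back line by line.
    static List<String> roundTrip(String content) {
        try {
            Path tmp = Files.createTempFile("mixed", ".txt");
            try {
                Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // CRLF, LF and a bare CR in the same file.
        System.out.println(roundTrip("one\r\ntwo\nthree\rfour"));
    }
}
```

Because the terminators are recognised by the reader rather than by the platform, the same code gives the same lines on Windows and on UNIX.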

If you create a file, it will use whatever line separator is native to the platform.
If you then open the file on another platform, the file does not change. If you open a unix file on windows, it doesn't gain the extra \r character.
It really depends on the editor as to how it looks, some editors handle things better than others.
As for Java, just use System.lineSeparator() if you need to specify the end of line character sequence.
As @Andreas mentioned, you can use BufferedReader.readLine() to read a file a line at a time, and it will handle the end-of-line character sequence in a platform-independent manner.

What splitter should I use for every other line?

I have a text file that contains data every other line. I want to get the content of every non-empty line. Given the whole text of the file, I first tried using myText.split("\n\n"). To my surprise, it does not work. I'm working on Windows.

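Windows uses CRLF as its line separator, and you are splitting on two consecutive LFs. In a CRLF file an empty line produces \r\n\r\n, which never contains \n\n, so the split finds nothing.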
A safe way is to use:
System.getProperty("line.separator");
to get the appropriate separator on your OS.
String newLine = System.getProperty("line.separator");
myText.split("(?:" + newLine + ")+");
It is possible that you are reading a file created on a different OS, in which case the above method won't work. A better way would be to use a character class with CR and LF, as suggested in the comments by @Marko:
myText.split("[\r\n]+");
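As a small check of the character-class approach, the sketch below runs it against both a Windows-style and a Unix-style text. The class NonEmptyLines and its method are hypothetical names. One caveat worth knowing: if the text starts with a line break, the first array element will be an empty string.

```java
import java.util.Arrays;

// Hypothetical demo class, not from the original answer.
class NonEmptyLines {

    // Splits on any run of CR and/or LF characters, so empty lines disappear.
    static String[] nonEmptyLines(String text) {
        return text.split("[\r\n]+");
    }

    public static void main(String[] args) {
        String windowsText = "first\r\n\r\nsecond\r\n\r\nthird\r\n";
        String unixText = "first\n\nsecond\n\nthird\n";
        System.out.println(Arrays.toString(nonEmptyLines(windowsText)));
        System.out.println(Arrays.toString(nonEmptyLines(unixText)));
    }
}
```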

new line when using DataOutputStream, Android

I'm trying to export some data from my database to a file. I'm using DataOutputStream because I need the method writeChars(String r).
The problem is that I cannot find a way to start a new line. The "\n" leaves a space, but it does not start a new line. Is there any way to do it?

If you just want to write text to a file, you have chosen the wrong class. DataOutputStream.writeChars always writes characters in UTF-16BE. Use BufferedWriter or PrintWriter instead. PrintWriter.println appends a platform-specific line separator to the end of the line. The line separator is defined by the system property line.separator, and is not necessarily a single newline character ('\n'): for example, it is "\r\n" on Windows and "\n" on Unix.
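The difference between the two approaches shows up directly in the bytes produced. The sketch below captures the output in memory instead of a file; TextExport and its two methods are invented names, and UTF-8 is an assumed choice of text encoding, since the question does not say which one is wanted.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not from the original answer.
class TextExport {

    // What DataOutputStream.writeChars produces: two bytes per char, high byte first.
    static byte[] viaWriteChars(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeChars(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Plain text output: each line followed by the platform line separator.
    static byte[] viaBufferedWriter(String... lines) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(bytes, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Every character is preceded by a zero byte, which many viewers show as a gap.
        System.out.println(Arrays.toString(viaWriteChars("A\n")));
        System.out.print(new String(viaBufferedWriter("row 1", "row 2"), StandardCharsets.UTF_8));
    }
}
```

For a real export, the ByteArrayOutputStream would be replaced by a FileOutputStream on the target file.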

You can use a variable like String newLine = System.getProperty("line.separator");

Use this : String nl = System.getProperty("line.separator");
