How to split a string on empty lines - Java

My file contains this string:
a
b
c
Now I want to read it and split it on the empty lines, so I have this:
text.split("\n\n"); where text is the content of the file.
The problem is that this doesn't work. When I convert the line breaks to bytes, I see that "\n\n" is represented as 10 10, but the line breaks in my file are represented as 10 13 10 13. So how can I split my file?

Escape  Description           ASCII value
\n      New Line Feed (LF)    10
\r      Carriage Return (CR)  13
So you need to try string.split("\n\r") in your case.
Edit
If you want to split on empty lines, try \n\r\n\r. Or you can use .readLine() to read your file and skip all empty lines.
Are you sure it's 10 13 10 13? On Windows it should always be 13 10 (\r\n).
Also, you should not depend on line.separator too much. A file that comes from a *nix platform uses \n no matter where you process it, and vice versa. Even on Windows, some editors use \n as the newline character. So I suggest you use some higher-level methods, or use string.replaceAll("\r\n", "\n") to normalize your input.
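The normalization idea above can be sketched as follows. This is only an illustration under the assumption that the whole file is already in a String; the class and method names (BlankLineSplit, splitOnBlankLines) are made up for this example.

```java
import java.util.Arrays;

public class BlankLineSplit {
    // Hypothetical helper: converts \r\n and lone \r to \n first,
    // then splits wherever two or more newlines follow each other.
    static String[] splitOnBlankLines(String text) {
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        return normalized.split("\n{2,}");
    }

    public static void main(String[] args) {
        String windowsText = "a\r\n\r\nb\r\n\r\nc";
        System.out.println(Arrays.toString(splitOnBlankLines(windowsText))); // [a, b, c]
    }
}
```

Because the input is normalized before splitting, the same call works whether the file came from Windows or from a *nix system.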

Keep in mind, sometimes you have to use:
System.getProperty("line.separator");
to get the line separator, if you want to make it platform independent. You can also use BufferedWriter's newLine() method, that takes care of that automatically.
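As a rough sketch of the newLine() approach, the example below writes into a StringWriter so that the result can be inspected. The names NewLineDemo and writeLines are assumptions for illustration only; in real code the BufferedWriter would usually wrap a FileWriter.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

public class NewLineDemo {
    // Hypothetical helper: writes each line followed by the platform separator.
    static String writeLines(String... lines) {
        StringWriter out = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(out)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine(); // emits the platform line separator, not a hard-coded "\n"
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String expected = "a" + System.lineSeparator() + "b" + System.lineSeparator();
        System.out.println(writeLines("a", "b").equals(expected));
    }
}
```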

Try using:
text.split("\n\r");

Why are you splitting on \n\n?
You should be splitting on \r\n because that's what the file lines are separated by.

Try to use regular expressions, something like:
text.split("\\W+");
text.split("\\s+");

LF: Line Feed, U+000A
CR: Carriage Return, U+000D
so you need to try to use
"string".split("\r\n");

Use a Scanner object instead of worrying about chars/bytes.
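A minimal sketch of the Scanner suggestion, assuming the text is available as a String (a Scanner can also be constructed from a File). ScannerLines and nonEmptyLines are illustrative names, not an existing API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ScannerLines {
    // Hypothetical helper: collects every non-empty line of the text.
    static List<String> nonEmptyLines(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                // nextLine() strips the terminator, whether it is \n or \r\n
                String line = scanner.nextLine().trim();
                if (!line.isEmpty()) {
                    result.add(line);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(nonEmptyLines("a\r\n\r\nb\r\n\r\nc")); // [a, b, c]
    }
}
```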

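One solution is to split on "\n" and ignore empty strings:
String[] lines = text.split("\n"); // split returns an array, not a List
for (String line : lines) {
    line = line.trim(); // also removes a trailing \r left over from \r\n
    if (!line.isEmpty()) { // compare content; != would compare references
        System.out.println(line);
    }
}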

Related

Using the split method to split a text file of music notes that also has line breaks

I am wondering how you can use the split method with this example, considering that there is a line break in the file.
g3,g3,g3,c4-,a3-,g4-,r,r,r,g3,g3,g3,c4-,a3-,a4,g4-,r,r,r,c4,c4,c4,e4,r
g4,r,a4,r,r,b4b,r,a4,f4,r,g4,r,r,g4#,r,g4,d4#,r,g4
I read the Pattern api and tutorials and think it should be like so.
line.split("(,\n)");
I also tried
line.split([,\n]);
and
line.split("[,\n]");
Lines may be separated by \r, by \n, by both of them, or even by some other characters. Since Java 8 you can use \\R to represent line separators. So you could try using
String[] arr = yourText.split(",|\\R");
As Pshemo notes, the 3rd option str.split("[,\n]") should work assuming the file ends each line with \n and not \r\n.
Additionally, how you read the file may affect your split argument.
If you are reading the file in with a BufferedReader, then going line by line with the readLine() method will automatically exclude any line-termination characters.
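The readLine() route might look roughly like this. The sketch takes any Reader so it can be tried with a StringReader; with a real file you would pass a FileReader instead. NoteReader and readNotes are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NoteReader {
    // Hypothetical helper: reads line by line and splits each line on commas.
    static List<String> readNotes(Reader source) {
        List<String> notes = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() already drops \n, \r or \r\n, so only "," is left to split on
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                notes.addAll(Arrays.asList(line.split(",")));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return notes;
    }

    public static void main(String[] args) {
        String sample = "g3,g3,c4-,r\r\ng4,r,g4#";
        System.out.println(readNotes(new StringReader(sample))); // [g3, g3, c4-, r, g4, r, g4#]
    }
}
```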

How to obtain text from a String variable?

String s = System.lineSeparator();
System.out.println(s);
I am trying to obtain the text from the variable s, but why is there no text in it?
Try the following code on your system:
for(byte b : System.lineSeparator().getBytes()){
System.out.println(b);
}
It will print either
10
OR
13
10
Here I print the ASCII code for whatever I got from System.lineSeparator().
The ASCII code for \n is 10 and for \r is 13.
This is also stated in the documentation of System.lineSeparator():
On UNIX systems, it returns "\n"; on Microsoft Windows systems it returns "\r\n".
So the point is that you didn't see any output because \r represents a carriage return and \n represents a line feed. You cannot see them on the console, but they still have their effect in strings.
System.lineSeparator() returns the string that the system generally uses to separate lines, for example in input from standard input.
This is generally a newline character, so when you print it in your program you will not "see" it as such.
Try using this:
String s = System.lineSeparator();
System.out.println("~~" + s + "~~");
This will help you distinguish the output. You should see something like this:
~~
~~
This output indicates that the newline character is separating the ~~ characters in your print statement.
Hope this helps!
System.lineSeparator() returns "\r\n" on Windows and "\n" on Unix-like systems, so when you print it, it actually prints a new line.
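If you want to see the separator itself rather than its effect, one option is to print it with the control characters escaped. This is just a sketch; ShowSeparator and escape are illustrative names, not part of the JDK.

```java
public class ShowSeparator {
    // Hypothetical helper: replaces CR and LF with their visible escape sequences.
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Prints \r\n on Windows and \n on Unix-like systems
        System.out.println(escape(System.lineSeparator()));
    }
}
```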

Java 8 programming: Reading a .ini-file and trying to get rid of newline-characters

I'm using Netbeans IDE. For a school project I need to read an .ini-file, and get some specific information.
The reason I'm not using ini4j:
I have a section that has key values which are the same
I have sections that have no key-value inputs that I have to read information from
Example ini-file:
[Section]
Object1 5 m
number = 12
Object2 6 m
;Comment followed by white line
number = 1\
4
\ means the next command or white lines need to be ignored
So the last part of the ini file actually means: number = 14
My task: I need to store the object names with the corresponding length (meters) and number in a single string like this:
Object1 has length 1m and number 12
My problem:
I use a Scanner with the delimiter \\Z to store the whole file in a single String.
This works (if I print out the String it gives the example above).
I've tried this code:
String file = file.replaceAll("(\\.)(\\\\)(\\n*)(\\.)","");
If I try to only remove the newlines:
String file = file.replace("\n","");
System.out.println(file);
I get an empty output.
Thanks in advance !
You are on the right track, but the logic is in the wrong place. You actually need the \n so that your logic can recognize a new value in your ini file.
I would suggest that you do not read the entire file into a string. Why? You still have to work with the lines of the file one by one. Right now you read the whole file into a string and then split it into single strings to analyze. Why not just read the file with a Scanner line by line and analyze the lines as they come?
When you work with individual lines, simply skip the empty ones, and that solves your issue.
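A line-by-line version could look something like the sketch below. It assumes, based on the question, that a trailing \ means "join with the next non-empty line" and that lines starting with ; are comments; IniLines and logicalLines are made-up names, and the sketch reads from a String only to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class IniLines {
    // Hypothetical helper: returns the logical lines of the ini text,
    // with empty lines and comments skipped and continuations joined.
    static List<String> logicalLines(String text) {
        List<String> result = new ArrayList<>();
        StringBuilder pending = new StringBuilder();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.isEmpty() || line.startsWith(";")) {
                    continue; // skip white lines and comments
                }
                if (line.endsWith("\\")) {
                    // keep everything before the backslash and wait for the next line
                    pending.append(line, 0, line.length() - 1);
                } else {
                    pending.append(line);
                    result.add(pending.toString());
                    pending.setLength(0);
                }
            }
        }
        if (pending.length() > 0) {
            result.add(pending.toString()); // dangling continuation at end of file
        }
        return result;
    }

    public static void main(String[] args) {
        String ini = "[Section]\nObject1 5 m\nnumber = 12\n;Comment\n\nnumber = 1\\\n\n4\n";
        System.out.println(logicalLines(ini));
        // [[Section], Object1 5 m, number = 12, number = 14]
    }
}
```

Each returned entry can then be analyzed on its own, for example by checking whether it contains "=".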
Your problem is that you need to escape \ both in Java strings and in regular expressions, so you need to escape it twice. This means that if you want to get rid of empty lines, you have to write it like this:
file = file.replaceAll("\\n+", "\n");
If you know that a \ at the end of a line is always followed by an empty line then this means that it is actually followed by 2 new line characters which would give the following:
file = file.replaceAll("\\\\\\n\\n", "");
or (it's the same):
file = file.replaceAll("\\\\\\n{2}", "");
\\\\ will result in \\ in the regex, so it matches \ and \\n will become \n and match the new line character.
And as mentioned by @Bohemian, it would be better to fix the ini file. Standards make everything easier. If you insist, you could use your own file extension, because it is actually a different format.
It is also possible to write a regular expression that extracts the values directly:
file = file.replaceAll("\\\\\\n\\n", "");
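Pattern pattern = Pattern.compile("^ *([a-zA-Z0-9_]+) *= *(.+?) *$", Pattern.MULTILINE); // MULTILINE makes ^ and $ match at every line, not only at the start and end of the whole string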
Matcher matcher = pattern.matcher(file);
while (matcher.find()) {
System.out.println(matcher.group(1)); // left side of = (already trimmed)
System.out.println(matcher.group(2)); // right side of = (already trimmed)
}
It's easier than reading lines one by one, but performance could be worse. Anyway usually this is not an issue because ini files tend to be small.

What splitter should I use for every other line?

I have a text file that contains data every other line. I want to get the content of every non-empty line. Given the whole text of the file, I first tried using myText.split("\n\n"). To my surprise, it does not work. I'm working on Windows.
Windows uses CRLF as line separators. And you are splitting on LF. That wouldn't work.
A safe way is to use:
System.getProperty("line.separator");
to get the appropriate separator on your OS.
String newLine = System.getProperty("line.separator");
myText.split("(?:" + newLine + ")+");
It is possible that you are reading a file created on a different OS, in which case the above method won't work. A better way would be to use a character class with CR and LF, as suggested in the comments by @Marko:
myText.split("[\r\n]+");
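To illustrate why the character class is the more robust choice, here is a small sketch that feeds the same split both Windows-style and Unix-style text. EveryOtherLine and nonEmptyLines are illustrative names only.

```java
import java.util.Arrays;

public class EveryOtherLine {
    // Hypothetical helper: splits on any run of CR and/or LF characters.
    static String[] nonEmptyLines(String text) {
        return text.split("[\r\n]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nonEmptyLines("x\r\n\r\ny\r\n\r\nz"))); // [x, y, z]
        System.out.println(Arrays.toString(nonEmptyLines("x\n\ny\n\nz")));         // [x, y, z]
    }
}
```

One caveat: if the text starts with a line break, the first array element is an empty string, so you may want to trim the text first.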

new line when using DataOutputStream, Android

I'm trying to export some data from my database to a file. I'm using DataOutputStream because I need the method writeChars(String r).
The problem is that I cannot find a way to start a new line. "\n" leaves a space, but it does not break the line. Is there any way to do it?
If you just want to write text to a file, you have chosen the wrong class. DataOutputStream.writeChars always writes characters in UTF-16BE. Use BufferedWriter or PrintWriter instead. PrintWriter.println appends a platform-specific line separator to the end of the line. The line separator is defined by the system property line.separator and is not necessarily a single newline character ('\n'), e.g. "\r\n" on Windows and "\n" on Unix.
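The difference can be made visible with a short sketch that captures the output in memory instead of a file. WriterChoice, writeCharsBytes and printLines are hypothetical names; in an actual export the PrintWriter would wrap a file-based writer.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WriterChoice {
    // Hypothetical helper: returns the raw bytes that writeChars produces.
    static byte[] writeCharsBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeChars(text); // two bytes per char, high byte first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hypothetical helper: writes each line with println and returns the text.
    static String printLines(String... lines) {
        StringWriter target = new StringWriter();
        try (PrintWriter writer = new PrintWriter(target)) {
            for (String line : lines) {
                writer.println(line); // appends the platform line separator
            }
        }
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(writeCharsBytes("a\n"))); // [0, 97, 0, 10]
        System.out.print(printLines("row 1", "row 2"));
    }
}
```

The extra zero bytes in the writeChars output are why a text editor that expects a single-byte encoding does not show a normal line break.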
You can use a variable like String newLine = System.getProperty("line.separator");
Use this : String nl = System.getProperty("line.separator");
