How to change "\\r\\n" to line separator in java - java

I am working on a school project to build a pseudo terminal and file system. The terminal scans System.in and passes the string to the controller.
Input to console: abc\r\nabc\r\nabc
Here is the code I tried
Scanner systemIn = new Scanner(System.in);
String input = systemIn.nextLine();
input = input.replaceAll("\\\\r\\\\n",System.getProperty("line.separator"));
System.out.print(input);
I want Java to treat the \r\n I typed into the console as a line separator, not as the literal characters \ and r.
What it does now is print the input as is.
Desired Output:
abc
abc
abc
UPDATE: I tried input = StringEscapeUtils.unescapeJava(input); and it solved the problem.

You need to double-escape backslashes in Java regexes (once for the regex, once for the Java string literal). You don't want to match a line break (/\n/, "\\n"), but a backslash (/\\/) followed by an "n": /\\n/, "\\\\n". So this should work:
input = input.replaceAll("(\\\\r)?\\\\n", System.getProperty("line.separator"));
For a more broad handling of escape sequences see How to unescape a Java string literal in Java?
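A minimal sketch of the same idea without regex escaping, assuming the input really contains the literal characters backslash, r, backslash, n. String.replace takes plain text rather than a regex, so only the Java string-literal escaping is needed. The class and method names here are hypothetical.

```java
final class LiteralNewlines {
    // Replaces the typed-out escape sequences (a backslash followed by 'r' or 'n')
    // with the platform line separator. String.replace is not regex-based,
    // so "\\r\\n" below means exactly: backslash, r, backslash, n.
    static String expand(String input) {
        String sep = System.lineSeparator();
        return input.replace("\\r\\n", sep).replace("\\n", sep);
    }

    public static void main(String[] args) {
        // The argument holds literal backslashes, as if typed into the console.
        System.out.println(expand("abc\\r\\nabc\\r\\nabc"));
    }
}
```

The longer sequence is replaced first so that the backslash-r part is not left behind.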

If your input contains the literal string '\r\n', try this:
Scanner systemIn = new Scanner(System.in);
String input = systemIn.nextLine();
input = input.replaceAll("\\\\r\\\\n", System.getProperty("line.separator"));

For consistent behaviour I would replace \\r with \r and \\n with \n rather than replace \\r\\n with the newline as this will have different behaviour on different systems.
You can do
input = systemIn.nextLine().replaceAll("\\\\r", "\r").replaceAll("\\\\n", "\n");
nextLine() strips off the newline at the end. If you want to add a line separator back, you can do
input = systemIn.nextLine() + System.getProperty("line.separator");
If you are using println() you don't need to add it back.
System.out.println(systemIn.nextLine()); // prints a new line.
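To illustrate the point about nextLine(), here is a small sketch that feeds a Scanner from a String instead of System.in (an assumption made only so the example is self-contained). The class and method names are hypothetical.

```java
import java.util.Scanner;

final class NextLineDemo {
    // Returns the first line of the text exactly as Scanner.nextLine() delivers it,
    // that is, without the line terminator that ended it.
    static String firstLine(String text) {
        try (Scanner in = new Scanner(text)) {
            return in.nextLine();
        }
    }

    public static void main(String[] args) {
        String line = firstLine("abc\r\ndef");
        // The length excludes the real CR and LF characters.
        System.out.println(line + " has length " + line.length());
    }
}
```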

As r0dney mentioned, Bergi's solution doesn't work.
Being able to use third-party libraries is good, but for someone who is studying it is better to know the theory, because a third-party library does not exist for every problem.
Overloading a project with many third-party libraries for tasks that can be solved in one line of code makes it bulky and hard to maintain. Anyway, here is what works:
content = content.replaceAll("(\\\\r)?\\\\n", System.getProperty("line.separator"));

Unless you are actually typing \ and r and \ and n into the console, you don't need to do this at all: instead you have a major misunderstanding. The CR character is represented in a String as \r but it consists of only one byte with the hex value 0xD. And if you are typing backslashes into the console, the simple answer is "don't". Just hit the Enter key: that's what it's for. It will transmit the CR byte into your code.

Related

Issues with delimiter ("\t | \n") Java

I am having issues using my delimiter in my scanner. I am currently using a scanner to read a text file and put tokens into a string. My tutor told me to use the delimiter (useDelimiter("\t|\n")). However, each token it grabs ends in \r (due to a carriage return in the text file). This is fine for printing purposes, but I need to get the string length, and instead of returning the number of actual characters, it returns the number of characters including that \r. Is there a better delimiter I can use that will accomplish the same thing (without grabbing the \r)? The code is as follows:
studentData.useDelimiter("\t|\n");
while (studentData.hasNext())
{
token = studentData.next();
int tokenLength = token.length();
statCalc(tokenLength);
}
I am well aware that I could simply remove the last character of the string token. However, for many reasons, I just want it to grab the token without the \r. Any and all help would be greatly appreciated.
Try this:
studentData.useDelimiter("\\t|\\R");
The \R pattern matches any linebreak, see documentation.
I guess the remaining \r char is a partially consumed linebreak in Windows environment. With the aforementioned delimiter, the scanner will properly consume the line.
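A small sketch comparing the two delimiters on Windows-style input, assuming Java 8 or later (where \R is available in regexes). The sample text and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class TokenLengths {
    // Splits the text with the given delimiter regex and returns each token's length.
    static List<Integer> lengths(String text, String delimiter) {
        List<Integer> result = new ArrayList<>();
        try (Scanner sc = new Scanner(text)) {
            sc.useDelimiter(delimiter);
            while (sc.hasNext()) {
                result.add(sc.next().length());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Ann\t90\r\nBob\t85\r\n"; // hypothetical file content with CRLF endings
        System.out.println(lengths(text, "\t|\n"));    // the \r stays attached to "90" and "85"
        System.out.println(lengths(text, "\\t|\\R"));  // \R consumes the whole \r\n
    }
}
```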
Remove all carriage returns and line feeds from your string. Try this:
s = s.replaceAll("\\n", "");
s = s.replaceAll("\\r", "");
Windows-style line ending is usually: \r\n but you are ignoring \r as delimiter. Your regex pattern (\t|\n) can be improved by using:
(\t|\r\n|\r|\n)
However, it looks to me like what you're trying to accomplish is to create a "tokenizer" which breaks a text file into words (since you're also looking for \t), so my guess is that you're better off with:
studentData.useDelimiter("\\s+");
which treats any run of whitespace as a delimiter.
You can learn more about regular expressions.

Java 8 programming: Reading a .ini-file and trying to get rid of newline-characters

I'm using Netbeans IDE. For a school project I need to read an .ini-file, and get some specific information.
The reason I'm not using ini4j:
I have a section that has key values which are the same
I have sections that have no key-value inputs that I have to read information from
Example ini-file:
[Section]
Object1 5 m
number = 12
Object2 6 m
;Comment followed by white line
number = 1\
4
\ means the next command or white lines need to be ignored
So the last part of the ini file actually means: number = 14
My task: I need to store the object names with the corresponding length (meters) and number into a single string like this:
Object1 has length 1m and number 12
My problem:
I use a scanner with delimiter \\Z to store the whole file into a single String.
This works (if I print out the String it gives the example above).
I've tried this code:
file = file.replaceAll("(\\.)(\\\\)(\\n*)(\\.)","");
If I try to only remove the newlines:
file = file.replace("\n","");
System.out.println(file);
I get an empty output.
Thanks in advance !
You are on the right track, but the logic is in the wrong place. You actually need the \n for your logic to recognize a new value in your ini file.
I would suggest that you do not read the entire file into one string. Why? You still process the file line by line: right now you read the whole file into a string and then split it into single lines to analyze. Why not just read the file with the scanner line by line and analyze the lines as they come?
When you work with individual lines, simply skip the empty ones, and that solves your issue.
Your problem is that you need to escape \ both in Java strings and in regular expressions, so you need to escape it twice. This means that if you want to get rid of empty lines you have to write it like this:
file = file.replaceAll("\\n+", "\n");
If you know that a \ at the end of a line is always followed by an empty line then this means that it is actually followed by 2 new line characters which would give the following:
file = file.replaceAll("\\\\\\n\\n", "");
or (it's the same):
file = file.replaceAll("\\\\\\n{2}", "");
\\\\ will result in \\ in the regex, so it matches \ and \\n will become \n and match the new line character.
And as mentioned by @Bohemian, it would be better to fix the ini-file. Standards make everything easier. If you insist, you could use your own file extension, because it is actually another format.
It is also possible to write a regular expression that directly extracts the values for you:
file = file.replaceAll("\\\\\\n\\n", "");
Pattern pattern = Pattern.compile("^ *([a-zA-Z0-9_]+) *= *(.+?) *$", Pattern.MULTILINE); // MULTILINE lets ^ and $ match at every line
Matcher matcher = pattern.matcher(file);
while (matcher.find()) {
System.out.println(matcher.group(1)); // left side of = (already trimmed)
System.out.println(matcher.group(2)); // right side of = (already trimmed)
}
It's easier than reading lines one by one, but performance could be worse. Anyway usually this is not an issue because ini files tend to be small.
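The regex approach above can be sketched end to end as follows. This is only an illustration under stated assumptions: a trailing backslash followed by one or more line breaks is treated as a continuation (written with \R, so Java 8 or later), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IniValues {
    // MULTILINE makes ^ and $ match at the start and end of every line.
    private static final Pattern KEY_VALUE =
            Pattern.compile("^ *([a-zA-Z0-9_]+) *= *(.+?) *$", Pattern.MULTILINE);

    // Returns every "key=value" pair found in the file content, in order.
    static List<String> keyValues(String file) {
        // Join continued lines: drop a backslash together with the line breaks after it.
        String joined = file.replaceAll("\\\\\\R+", "");
        List<String> result = new ArrayList<>();
        Matcher matcher = KEY_VALUE.matcher(joined);
        while (matcher.find()) {
            result.add(matcher.group(1) + "=" + matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String file = "[Section]\nObject1 5 m\nnumber = 12\nObject2 6 m\n"
                + ";Comment followed by white line\nnumber = 1\\\n\n4\n";
        System.out.println(keyValues(file));
    }
}
```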

java Scanner reads only first 2048 bytes

I'm using java.util.Scanner to read file contents from classpath with this code:
String path1 = getClass().getResource("/myfile.html").getFile();
System.out.println(new File(path1).length()); // 22244 (correct)
String file1 = new Scanner(new File(path1)).useDelimiter("\\Z").next();
System.out.println(file1.length()); // 2048 (first 2k only)
Code runs from idea with command (maven test)
/Library/Java/JavaVirtualMachines/jdk1.7.0_25.jdk/Contents/Home/bin/java -Dmaven.home=/usr/share/java/maven-3.0.4 -Dclassworlds.conf=/usr/share/java/maven-3.0.4/bin/m2.conf -Didea.launcher.port=7533 "-Didea.launcher.bin.path=/Applications/IntelliJ IDEA 12 CE.app/bin" -Dfile.encoding=UTF-8 -classpath "/usr/share/java/maven-3.0.4/boot/plexus-classworlds-2.4.jar:/Applications/IntelliJ IDEA 12 CE.app/lib/idea_rt.jar" com.intellij.rt.execution.application.AppMain org.codehaus.classworlds.Launcher --fail-fast --strict-checksums test
It was running perfectly on my win7 machine. But after I moved to mac same tests fail.
I tried to google but didn't find much =(
Why does Scanner with delimiter \Z read my whole file into a string on win7 but not on mac?
I know there're more ways to read a file, but I like this one-liner and want to understand why it's not working.
Thanks.
Here is some info from java about it
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
\Z The end of the input but for the final terminator, if any
\z The end of the input
Line terminators
A line terminator is a one- or two-character sequence that marks the
end of a line of the input character sequence. The following are
recognized as line terminators:
A newline (line feed) character ('\n'), A carriage-return character
followed immediately by a newline character ("\r\n"), A standalone
carriage-return character ('\r'), A next-line character ('\u0085'), A
line-separator character ('\u2028'), or A paragraph-separator
character ('\u2029').
So use \z instead of \Z
There is a good article about this method of reading a whole file with Scanner:
http://closingbraces.net/2011/12/17/scanner-with-z-regex/
In brief:
Because a single read with “/z” as the delimiter should read
everything until “end of input”, it’s tempting to just do a single
read and leave it at that, as the examples listed above all do.
In most cases that’s OK, but I’ve found at least one situation where
reading to “end of input” doesn’t read the entire input – when the
input is a SequenceInputStream, each of the constituent InputStreams
appears to give a separate “end of input” of its own. As a result, if
you do a single read with a delimiter of “/z” it returns the content
of the first of the SequenceInputStream’s constituent streams, but
doesn’t read into the rest of the constituent streams.
Be careful when using it. It is better to read the input line by line, or to keep calling hasNext() until it really returns false.
UPD: In other words, try this code:
StringBuilder file1 = new StringBuilder();
Scanner scanner = new Scanner(new File(path1)).useDelimiter("\\Z");
while (scanner.hasNext()) {
file1.append(scanner.next());
}
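If the goal is only to get the whole file into one String, a different technique avoids Scanner and its buffering altogether. The sketch below swaps in java.nio.file.Files (not the Scanner approach from the question), assumes UTF-8 content and Java 8 or later (for UncheckedIOException), and uses hypothetical class and method names. The temp-file helper exists only to make the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ReadWholeFile {
    // Reads the complete file into a String, assuming it is UTF-8 encoded.
    static String readAll(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: writes `size` ASCII characters to a temp file and
    // returns the length of what readAll() gets back.
    static int roundTripLength(int size) {
        try {
            Path tmp = Files.createTempFile("read-whole-file", ".txt");
            try {
                StringBuilder content = new StringBuilder();
                for (int i = 0; i < size; i++) {
                    content.append('x');
                }
                Files.write(tmp, content.toString().getBytes(StandardCharsets.UTF_8));
                return readAll(tmp).length();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same size as the file in the question, well beyond 2048 characters.
        System.out.println(roundTripLength(22244));
    }
}
```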
I encountered this as well when using nextLine() on Mac, Java 7 update 45. Worse, after a line longer than 2048 bytes, the rest of the file is ignored and the Scanner thinks it has already reached the end of the file.
I changed it to explicitly give the Scanner a larger buffer, and it works.
Scanner sc = new Scanner(new BufferedInputStream(new FileInputStream(nf), 20*1024*1024), "utf-8");

How to create a String with carriage returns?

For a JUnit test I need a String which consists of multiple lines. But all I get is a single lined String. I tried the following:
String str = ";;;;;;\n" +
"Name, number, address;;;;;;\n" +
"01.01.12-16.02.12;;;;;;\n" +
";;;;;;\n" +
";;;;;;";
I also tried \n\r instead of \n. System.getProperty("line.separator") doesn't work either: it produces a \n in the String and no carriage return. So how can I solve that?
It depends on what you mean by "multiple lines". Different operating systems use different line separators.
In Java, \r is always carriage return, and \n is line feed. On Unix, just \n is enough for a newline, whereas many programs on Windows require \r\n. You can get the platform default newline with System.getProperty("line.separator") or String.format("%n"), as mentioned in other answers.
But really, you need to know whether you're trying to produce OS-specific newlines - for example, if this is text which is going to be transmitted as part of a specific protocol, then you should see what that protocol deems to be a newline. For example, RFC 2822 defines a line separator of \r\n and this should be used even if you're running on Unix. So it's all about context.
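A short sketch of the two cases described above: joining lines with the platform separator versus a separator fixed by a protocol. It assumes Java 8 or later for String.join; the class and method names are hypothetical.

```java
final class LineSeparators {
    // Joins the lines with whatever the current platform uses ("\n" or "\r\n").
    static String platformJoin(String... lines) {
        return String.join(System.lineSeparator(), lines);
    }

    // Joins the lines with CR LF regardless of platform, as a protocol
    // such as RFC 2822 requires.
    static String crlfJoin(String... lines) {
        return String.join("\r\n", lines);
    }

    public static void main(String[] args) {
        System.out.println(platformJoin("line1", "line2"));
        // Show the control characters explicitly instead of printing them raw.
        System.out.println(crlfJoin("line1", "line2").replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```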
The fastest way I know to generate a new-line character in Java is: String.format("%n")
Of course you can put whatever you want around the %n like:
String.format("line1%nline2")
Or even if you have a lot of lines:
String.format("%s%n%s%n%s%n%s", "line1", "line2", "line3", "line4")
Try \r\n, where \r is the carriage return. Also check whether your output really lacks new lines, because a debugger can show special characters in escaped form such as \n, \r, \t etc.
Do this:
Step 1: Your String
String str = ";;;;;;\n" +
"Name, number, address;;;;;;\n" +
"01.01.12-16.02.12;;;;;;\n" +
";;;;;;\n" +
";;;;;;";
Step 2: Just replace all "\n" with "%n" the result looks like this
String str = ";;;;;;%n" +
"Name, number, address;;;;;;%n" +
"01.01.12-16.02.12;;;;;;%n" +
";;;;;;%n" +
";;;;;;";
Notice I've just put "%n" in place of "\n"
Step 3: Now simply call format()
str=String.format(str);
That's all you have to do.
Try appending the characters with .append('\r').append('\n'); instead of the String .append("\\r\\n");
Thanks for your answers. I missed that my data is stored in a List<String> which is passed to the tested method. The mistake was that I put the string into the first element of the ArrayList. That's why I thought the String consists of just one single line, because the debugger showed me only one entry.

String.split to split data lines doesn't work correctly

I use VB.NET to create data for my game (for Android, Java code), this is how it look like:
5;0000000100011100010000000;2;2
5;1000001100010000000000000;0,1;0,1
where each line is a level. In VB.NET, I create new line by vbNewLine constant (I think its ASCII code is 13) then use IO.File.WriteAllText to write it to the file.
In my game in Java, I use \n to split the levels:
String[] levelData = rawData.split("\n");
However, when processing the data, levelData always has a "new line" at the end. For example, levelData[0] is 5;00...2;2<new line>, which causes an Integer.parseInt exception. Then I debugged and found this:
rawData.charAt(31) //It's a \r, not \n
So, I changed the split line:
String[] levelData = rawData.split("\r");
But now, the levelData[1] will be <newline>5....
What exactly do I have to do to solve this problem? And please explain how "new line" works in a Java String.
I suppose the vbNewLine constant puts both "\r\n" at the end, and hence one character is left over after splitting. Try splitting on both.
Most probably the problem is in the VB code you show.
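Splitting on both characters can be sketched like this, assuming the file uses either \r\n or \n line endings. The regex \r?\n matches an optional carriage return followed by a line feed; the class and method names are hypothetical.

```java
final class LevelSplitter {
    // Splits the raw data into lines, accepting both "\r\n" and "\n" endings.
    // String.split drops trailing empty strings, so a final line break is harmless.
    static String[] splitLines(String rawData) {
        return rawData.split("\\r?\\n");
    }

    public static void main(String[] args) {
        String rawData = "5;0000000100011100010000000;2;2\r\n"
                + "5;1000001100010000000000000;0,1;0,1\r\n";
        for (String level : splitLines(rawData)) {
            String[] fields = level.split(";");
            // No stray \r is left, so parsing the numeric field works.
            System.out.println(Integer.parseInt(fields[0]) + " -> " + fields.length + " fields");
        }
    }
}
```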
I create new line by vbNewLine constant (I think its ASCII code is 13)
First verify this for certain, then look up what code 13 is! Here is a general ascii table.
code 13 is a carriage return and is represented in Java as \r
code 10 is line feed and is represented in Java as \n
A good tip would be to read up a little about newlines. It's completely fu**ed up: Windows and Linux use different ways of representing a new line.
CR+LF: Microsoft Windows, DEC TOPS-10, RT-11 and most other early non-Unix and non-IBM OSes, CP/M, MP/M, DOS (MS-DOS, PC-DOS, etc.), Atari TOS, OS/2, Symbian OS, Palm OS
LF: Multics, Unix and Unix-like systems (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD, etc.), BeOS, Amiga, RISC OS, and others.
CR: Commodore 8-bit machines, Acorn BBC, TRS-80, Apple II family, Mac OS up to version 9 and OS-9
Why don't you use Scanner to read your file and split for lines instead?
Scanner sc = new Scanner(new File("levels.text"));
while (sc.hasNextLine()) {
String nextLine = sc.nextLine();
if (nextLine.length() > 0) { // you could even use Java regexes to validate the format of every line
String[] levelElements = nextLine.split(";");
// ...
}
}
vbNewLine is platform dependent. On Windows, a newline consists of two characters, \r followed by \n, and not just \n.
