How to read multiple lines using FileReader only? - java

I have the following code:
public class Reader {
public static void main(String[] args) throws IOException {
try (FileReader in = new FileReader("D:/test.txt")) {
// BufferedReader br = new BufferedReader(in);
int line = in.read();
for (int i = 0; i < line; i++) {
//System.out.println(line);
System.out.println((char) line);
line = in.read();
}
}
}
}
and a file Test.txt with the content:
Hello
Java
When I run the above code it only reads Hello. I would like to read multiple lines using FileReader only. I don't want to use BufferedReader or InputStreamReader etc. Is that possible?

I don't think this version of the code prints "Hello".
You are calling:
int line = in.read();
What does this do? Look in the Javadocs for Reader:
public int read()
throws IOException
Reads a single character. This method will block until a character is available, an I/O error occurs, or the end
of the stream is reached.
(emphasis mine)
Your code reads the 'H' from 'Hello', which is 72 in ASCII.
Then it goes into your loop, with line==72, so it goes into the loop:
for(int i=0;i<line;i++)
... making the decision "is 0 less than 72? Yes, so I'll go into the loop block".
Then each time it reads a character the value of line changes to another integer, and each time loop goes around i increments. So the loop says "Keep going for as long as the ASCII value of the character is greater than the number of iterations I've counted".
... and each time it goes around, it prints that character on a line of its own.
As it happens, for your input, it reads end-of-file (-1), and as -1 < i, the loop continue condition is not met.
But for longer inputs it would stop at the first 'a' after the 97th character, or the first 'b' after the 98th character, and so on (because ASCII 'a' is 97, etc.)
H
e
l
l
o
J
a
v
a
This isn't what you want:
You don't want your loop to repeat until i >= "the character I just read". You want it to repeat until in.read() returns -1. You have probably been taught how to loop until a condition is met.
You don't want to println() each character, since that adds newlines you don't want. Use print().
You should also look at the Reader.read(char[] cbuf) method, and see if you can write the code to work in bigger chunks.
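A chunked version of the read loop might look like the sketch below. Note that Reader's bulk method takes a char[], not a byte[]. The class name ChunkedRead and the method readAll are made up for illustration, and a StringReader stands in for the FileReader so the example runs without a file on disk.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration; not part of the original answer.
class ChunkedRead {
    // Reads everything from the reader in small chunks and returns it as one String.
    static String readAll(Reader in) throws IOException {
        char[] buffer = new char[4];       // tiny on purpose, so the loop runs several times
        StringBuilder out = new StringBuilder();
        int n = in.read(buffer);           // number of chars read, or -1 at end of stream
        while (n != -1) {
            out.append(buffer, 0, n);      // only the first n chars of the buffer are valid
            n = in.read(buffer);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // With a real file this would be: new FileReader("D:/test.txt")
        try (Reader in = new StringReader("Hello\r\nJava")) {
            System.out.print(readAll(in));
        }
    }
}
```

The loop has the same shape as the first of the two patterns in this answer: get a value, test it, use it, get the next one.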
Two patterns you'll use over and over again in your programming career are:
Type x = getSomehow();
while(someCondition(x)) {
doSomethingWith(x);
x = getSomehow();
}
... and ...
Type x = value_of_x_which_meets_condition;
while(someCondition(x)) {
x = getSomehow();
doSomethingWith(x);
}
See if you can construct something with FileReader and the value you get from it, filling in the "somehows".

Reading a file character by character without any buffering stream is extremely inefficient. I would probably wrap FileReader in a BufferedReader or simply use Scanner to read the content of the file, but if you absolutely want/need/have to use only FileReader then you can try
int line = in.read();
while (line != -1) {
System.out.print((char) line);
line = in.read();
}
instead of your for (int i = 0; i < line; i++) {...} loop.
Read slim's answer carefully. In short: the reading condition shouldn't care whether the number of characters you have read is less than the numeric representation of the currently read character (i < line). Consider, for example, a file like
My name
is
not important now
This file has a few characters which you normally do not see, like \r and \n; in reality it looks like
My name\r\n
\r\n
is\r\n
\r\n
not important now
where the numeric representation of \n is 10 (and of \r is 13). The first line, My name\r\n, is 9 characters (the 7 characters of My name plus \r and \n), and the \r\n of the empty line brings the count to 11. So when i reaches 10, the character just read is the \n at index 10, which is also represented by 10, and your condition i<line fails (10<10 is not true).
So instead of checking i<line you should check whether the read value is not EoF (End of File, or End of Stream in our case), which is represented by -1 as specified in the read method documentation, so your condition should look like line != -1. And because you don't need i, just use a while loop here.
Returns:
The character read, or -1 if the end of the stream has been reached
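The difference between the two loop conditions can be checked with a small sketch like the one below. The class and method names are hypothetical, and a StringReader holding the example text stands in for the FileReader. Both methods count how many characters the loop processes before it stops.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class; names are for illustration only.
class LoopConditionDemo {
    // The loop from the question: stops as soon as i >= code of the char just read.
    static int countWithForLoop(Reader in) throws IOException {
        int count = 0;
        int line = in.read();
        for (int i = 0; i < line; i++) {
            count++;
            line = in.read();
        }
        return count;
    }

    // The corrected loop: stops only at end of stream (-1).
    static int countWithWhileLoop(Reader in) throws IOException {
        int count = 0;
        int line = in.read();
        while (line != -1) {
            count++;
            line = in.read();
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String text = "My name\r\n\r\nis\r\n\r\nnot important now";
        // The for loop gives up at index 10, where the char is '\n' (code 10).
        System.out.println(countWithForLoop(new StringReader(text)));
        // The while loop processes every character of the text.
        System.out.println(countWithWhileLoop(new StringReader(text)));
    }
}
```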

You will have to read the content char by char and parse for a new line sequence.
A new line sequence can be any of the following:
a single carriage return '\r'
a single line feed '\n'
a carriage return followed by a line feed "\r\n"
EDIT
You could try the following:
public List<String> readLinesUsingFileReader(String filename) throws IOException {
List<String> lines = null;
try (FileReader fileReader = new FileReader(filename)) {
lines = readLines(fileReader);
}
return lines;
}
private List<String> readLines(FileReader fileReader) throws IOException {
List<String> lines = new ArrayList<>();
boolean newLine = false;
int c, p = 0;
StringBuilder line = new StringBuilder();
while(-1 != (c = fileReader.read())) {
if(c == '\n' && p != '\r') {
newLine = true;
} else if(c == '\r') {
newLine = true;
} else {
if(c != '\n' && c != '\r') {
line.append((char) c);
}
}
if(newLine) {
lines.add(line.toString());
line = new StringBuilder();
newLine = false;
}
p = c;
}
if(line.length() > 0) {
lines.add(line.toString());
}
return lines;
}
Note that the code above reads the whole file into a List, this might not be well suited for large files! You may want in such a case to implement an approach which uses streaming, i.e. read one line at a time, for example String readNextLine(FileReader fileReader) { ... }.
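One way the streaming variant mentioned in the note could be sketched is shown here. The class name LineStreamer is hypothetical, and it keeps a flag between calls instead of the previous character so that a "\r\n" pair split across two calls is still treated as one line break. A StringReader is used in main only so the example runs without a file; a FileReader can be passed in the same way.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical streaming variant; names are illustrative only.
class LineStreamer {
    private final Reader reader;
    private boolean skipLineFeed = false; // true right after a '\r' ended a line

    LineStreamer(Reader reader) {
        this.reader = reader;
    }

    // Returns the next line without its terminator, or null at end of stream.
    String readNextLine() throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            if (skipLineFeed) {
                skipLineFeed = false;
                if (c == '\n') {
                    continue;             // second half of a "\r\n" pair
                }
            }
            if (c == '\r') {
                skipLineFeed = true;      // a following '\n' belongs to this break
                return line.toString();
            }
            if (c == '\n') {
                return line.toString();
            }
            line.append((char) c);
        }
        // End of stream: return trailing text that had no terminator, if any.
        return line.length() > 0 ? line.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        LineStreamer lines = new LineStreamer(new StringReader("Hello\n,\r\nWorld!"));
        String line;
        while ((line = lines.readNextLine()) != null) {
            System.out.println("line:" + line);
        }
    }
}
```

Only one line is held in memory at a time, which is the point of the streaming approach for large files.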
Some basic tests:
Create test files to read
private final static String txt0 = "testnl0.txt";
private final static String txt1 = "testnl1.txt";
private final static String txt2 = "testnl2.txt";
@BeforeClass
public static void genTestFile() throws IOException {
try (OutputStream os = new FileOutputStream(txt0)) {
os.write((
"Hello\n" +
",\r\n" +
"World!" +
"").getBytes());
}
try (OutputStream os = new FileOutputStream(txt1)) {
os.write((
"\n" +
"\r\r" +
"\r\n" +
"").getBytes());
}
try (OutputStream os = new FileOutputStream(txt2)) {
os.write((
"").getBytes());
}
}
Test using the created files
@Test
public void readLinesUsingFileReader0() throws IOException {
List<String> lines = readLinesUsingFileReader(txt0);
Assert.assertEquals(3, lines.size());
Assert.assertEquals("Hello", lines.get(0));
Assert.assertEquals(",", lines.get(1));
Assert.assertEquals("World!", lines.get(2));
}
@Test
public void readLinesUsingFileReader1() throws IOException {
List<String> lines = readLinesUsingFileReader(txt1);
Assert.assertEquals(4, lines.size());
Assert.assertEquals("", lines.get(0));
Assert.assertEquals("", lines.get(1));
Assert.assertEquals("", lines.get(2));
Assert.assertEquals("", lines.get(3));
}
@Test
public void readLinesUsingFileReader2() throws IOException {
List<String> lines = readLinesUsingFileReader(txt2);
Assert.assertTrue(lines.isEmpty());
}

If you have the new line character
public static void main(String[]args) throws IOException{
FileReader in = new FileReader("D:/test.txt");
char [] a = new char[50];
int n = in.read(a); // reads up to 50 chars into the array; returns how many were read, or -1
for (int i = 0; i < n; i++)
System.out.print(a[i]); // prints only the chars that were actually read
in.close();
}
It will print
Hello
Java

I solved the above problem by using this code
public class Reader
{
public static void main(String[]args) throws IOException{
try (FileReader in = new FileReader("D:/test.txt")) {
int line = in.read();
while(line!=-1)
{
System.out.print((char)line);
line = in.read();
} }
}
}
But there is one more question if I write for loop instead of while like this
for(int i=0;i<line;i++)
it prints only the first line. Could anybody tell me why?

Reader.read() returns int code of single char or -1 if end of the file is reached:
http://docs.oracle.com/javase/7/docs/api/java/io/Reader.html#read()
So, read the file char by char and check LF (Line feed, '\n', 0x0A, 10 in decimal), CR (Carriage return, '\r', 0x0D, 13 in decimal)and end-of-line codes.
Note: Windows uses 2 chars to encode the end of line: "\r\n". Most other systems, including Linux and macOS, use only "\n" to encode the end of line.
final StringBuilder line = new StringBuilder(); // line buffer
try (FileReader in = new FileReader("D:/test.txt")) {
int chAr, prevChar = 0x0A; // chAr - just read char, prevChar - previously read char
while (prevChar != -1) { // until the last read char is EOF
chAr = in.read(); // read int code of the next char
switch (chAr) {
case 0x0D: // CR - just
break; // skip
case -1: // EOF
if (prevChar == 0x0A) {
break; // no need a new line if EOF goes right after LF
// or no any chars were read before (prevChar isn't
// changed from its initial 0x0A)
}
case 0x0A: // or LF
System.out.println("line:" + line.toString()); // get string from the line buffer
line.setLength(0); // cleanup the line buffer
break;
default: // if any other char code is read
line.append((char) chAr); // append to the line buffer
}
prevChar = chAr; // remember the current char as previous one for the next iteration
}
}

Related

Reading a File without line breaks using Buffered reader

I am reading a file with comma-separated values which, when split into an array, will have 10 values for each line. I expected the file to have line breaks so that
line = bReader.readLine()
will give me each line. But my file doesn't have line breaks. Instead, after the first set of values there are lots of spaces (465 to be precise) and then the next line begins.
So my above code of readLine() is reading the entire file in one go, as there are no line breaks. Please suggest how best to tackle this scenario efficiently.
One way is to replace String with 465 spaces in your text with new line character "\n" before iterating it for reading.
I second Ninan's answer: replace the 465 spaces with a newline, then run the function you were planning on running earlier.
For aesthetics and readability I would suggest using Regex's Pattern to replace the spaces instead of a long unreadable String.replace(" ").
Your code could look like the following, but replace 6 with 465:
public static void main(String[] args)
{
String content = "DOG,CAT      MOUSE,CHEESE"; // six spaces between the two records
Pattern p = Pattern.compile("[ ]{6}",
Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
String newString = p.matcher(content).replaceAll("\n");
System.out.println(newString);
}
My suggestion is to read file f1.txt and write to another file f2.txt, removing all empty lines and spaces, then read f2.txt. Something like:
FileReader fr = new FileReader("f1.txt");
BufferedReader br = new BufferedReader(fr);
FileWriter fw = new FileWriter("f2.txt");
String line;
while((line = br.readLine()) != null)
{
line = line.trim(); // remove leading and trailing whitespace
if (!line.equals("")) // don't write out blank lines
{
fw.write(line, 0, line.length());
}
}
br.close();
fw.close(); // flushes the buffered output to f2.txt
Then try using your code.
You might create your own subclass of a FilterInputStream or a PushbackInputStream and pass that to an InputStreamReader. One overrides int read().
Such a class unfortunately needs a bit of typing. (A nice exercise, so to say.)
private static final int NO_CHAR = -2;
private boolean fromCache;
private int cachedSpaces;
private int cachedNonSpaceChar = NO_CHAR;
@Override
public int read() throws IOException {
if (fromCache) {
if (cachedSpaces > 0) ...
if (cachedNonSpaceChar != NO_CHAR) ...
...
}
int ch = super.read();
if (ch != -1) {
...
}
return ch;
}
The idea is to cache spaces until a non-space char arrives. read() then either serves from the cache (returning \n in place of the long run of spaces) or, when nothing is cached, calls super.read(), reading on recursively while it keeps getting spaces.
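A character-level variation of this idea is sketched below. It extends FilterReader rather than the FilterInputStream or PushbackInputStream suggested above, which is a substitution made here to avoid dealing with byte decoding. The class name SpaceRunReader and the minRun parameter are hypothetical. Runs of at least minRun spaces are replaced by a single '\n'; shorter runs pass through unchanged.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch: turns long runs of spaces into a line break.
class SpaceRunReader extends FilterReader {
    private static final int NO_CHAR = -2;
    private final int minRun;          // runs of at least this many spaces become '\n'
    private int cachedSpaces = 0;      // spaces of a short run still to be handed out
    private int cachedChar = NO_CHAR;  // char (or -1) that ended the last run of spaces

    SpaceRunReader(Reader in, int minRun) {
        super(in);
        this.minRun = minRun;
    }

    @Override
    public int read() throws IOException {
        if (cachedSpaces > 0) {
            cachedSpaces--;
            return ' ';
        }
        if (cachedChar != NO_CHAR) {
            int c = cachedChar;
            cachedChar = NO_CHAR;
            return c;
        }
        int c = in.read();
        if (c != ' ') {
            return c;
        }
        // Count the whole run of spaces and remember what came after it.
        int run = 1;
        int next;
        while ((next = in.read()) == ' ') {
            run++;
        }
        cachedChar = next;
        if (run >= minRun) {
            return '\n';               // long run: emit one line break instead
        }
        cachedSpaces = run - 1;        // short run: hand the spaces out one by one
        return ' ';
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        // Route bulk reads through read() so the replacement logic always applies.
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return (n == 0 && len > 0) ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        String flat = "1,2,3" + "      " + "4,5,6"; // 6 spaces standing in for the 465
        try (BufferedReader br = new BufferedReader(
                new SpaceRunReader(new StringReader(flat), 6))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Because the wrapper sits below the BufferedReader, the existing readLine() loop can stay as it is. The sketch does not override skip() or ready(), so code that relies on those would need more work.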
My understanding is that you have a flat CSV file without proper line breaks, which is supposed to have 10 values on each line.
Updated:
1. (Recommended) You can use Scanner class with useDelimiter to parse csv effectively, assuming you are trying to store 10 values from a line:
public static void parseCsvWithScanner() throws IOException {
Scanner scanner = new Scanner(new File("test.csv"));
// set your delimiter for scanner, "," for csv
scanner.useDelimiter(",");
// storing 10 values as a "line"
int LINE_LIMIT = 10;
// implement your own data structure to store each value of CSV
int[] tempLineArray = new int[LINE_LIMIT];
int lineBreakCount = 0;
while(scanner.hasNext()) {
// trim start and end spaces if there is any
String temp = scanner.next().trim();
tempLineArray[lineBreakCount++] = Integer.parseInt(temp);
if (lineBreakCount == LINE_LIMIT) {
// replace your own logic for handling the full array
for(int i=0; i<tempLineArray.length; i++) {
System.out.print(tempLineArray[i]);
} // end replace
// resetting array and counter
tempLineArray = new int[LINE_LIMIT];
lineBreakCount = 0;
}
}
scanner.close();
}
Or use the BufferedReader.
If memory is an issue, you might not need the ArrayList that stores all values; replace it with your own logic.
public static void parseCsv() throws IOException {
BufferedReader br = new BufferedReader(new FileReader(file));
// your delimiter
char TOKEN = ',';
// your requirement of storing 10 values for each "line"
int LINE_LIMIT = 10;
// tmp for storing from BufferedReader.read()
int tmp;
// a counter for line break
int lineBreakCount = 0;
// array for storing 10 values, assuming the values of CSV are integers
int[] tempArray = new int[LINE_LIMIT];
// storing tempArray of each line to ArrayList
ArrayList<int[]> lineList = new ArrayList<>();
StringBuilder sb = new StringBuilder();
while((tmp = br.read()) != -1) {
if ((char)tmp == TOKEN) {
if (lineBreakCount == LINE_LIMIT) {
// your logic to handle the current "line" here.
lineList.add(tempArray);
// new "line"
tempArray = new int[LINE_LIMIT];
lineBreakCount = 0;
}
// storing current value from buffer with trim of spaces
tempArray[lineBreakCount] =
Integer.parseInt(sb.toString().trim());
lineBreakCount++;
// clear the buffer
sb.delete(0, sb.length());
}
else {
// add current char from BufferedReader if not delimiter
sb.append((char)tmp);
}
}
br.close();
}

Read a text file until EOL in Java

I am trying to read a text file which contains:
hello James!
How are you today!
I want to read each character in the string until I find the EOL character. I am using Windows, where \r\n represents the EOL. How can I write a condition to go through all the characters of the string and print them one by one until it reaches the EOL (\r\n)?
int readedValue;
do
{
while((readedValue = bufferReader.read()) != 10)
{
//readedValue = bufferReader.read();
char ch = (char) readedValue;
System.out.print(ch);
}
}
while ((readedValue = bufferReader.read()) != -1);
When I read the file now, I get the output hello James!ow are you today!
I am not getting the 'H' in How. What can I alter to get the complete text?
As people have noted, the readLine() method reads to the next line separator, and returns the line with the separator removed. So your tests for '\n' and '\r' in line cannot possibly evaluate to true.
But you can easily add an extra end-of-line when you output the line string [1].
[1] That is, unless you actually need to preserve the exact same end-of-line sequence characters as in the input stream.
You ask:
Instead of using readline(), Is there any way i can use buffer reader to read each character and print them?
Yea, sure. The read() method returns either one character or -1 to indicate EOF. So:
int ch = br.read();
while (ch != -1) {
System.out.print((char) ch);
ch = br.read();
}
You can use something like that:
while((line=input.readLine())!=null) {
// do something
}
If you want to read char by char, you can use this:
int readedValue;
while ((readedValue = reader.read()) != -1) {
char ch = (char) readedValue;
// do something
}
Here is an example (with a string instead of a file) for your new problem:
String line;
int readedValue;
String s = "hello James!\n\rHow are you today!";
StringReader input = new StringReader(s);
BufferedReader lineReader= new BufferedReader (input);
while((line=lineReader.readLine())!=null) {
StringReader input2 = new StringReader(line);
BufferedReader charReader= new BufferedReader (input2);
while((readedValue = charReader.read()) != -1) {
char ch = (char) readedValue;
System.out.print(ch);
}
}
Your problem is about magic numbers.
your while will enter into an infinite loop in the case charAt(21)!='\n' && charAt(22)!='\r'
those two integers shall be increased inside the loop.
charAt(i)!='\n' && charAt(i+1)!='\r'
::inside loop
i++

BufferedReader readLine() issue: detecting end of file and empty return lines

I want my program to do something when it finds the end of a file (EOF) at the end of the last line of text, and something else when the EOF is at the empty line AFTER that last line of text. Unfortunately, BufferedReader seems to consider both cases equal.
For example, this is my code to read the lines to the end of the file:
FileReader fr = new FileReader("file.txt");
BufferedReader br = new BufferedReader(fr);
String line;
while((line = br.readLine()) != null) {
if (line.equals("")) {
System.out.println("Found an empty line at end of file.");
}
}
If file.txt contained this, it wouldn't print:
line of text 1
another line 2//cursor
This wouldn't print either:
line of text 1
another line 2
//cursor
However, this will:
line of text 1
another line 2
//cursor
What reader can I use to differentiate the first two cases?
You can use the BufferedReader.read(char[] cbuf, int off, int len) method. When the end of the file is reached (return value -1), you can check whether the last buffer read ended with a line separator.
Admittedly, the code would be more complicated as it will have to manage the construction of lines from the read char[] buffers.
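If the only thing needed is to tell the two end-of-file cases apart, the check itself can stay small. The sketch below is an assumption about how that could be done, with hypothetical names: it reads in chunks, remembers the last character seen, and reports whether the input ended with a line separator. It does not build the lines, which is the more complicated part referred to above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper; names are for illustration only.
class TrailingNewlineCheck {
    // Returns true if the last character of the input is '\n' or '\r'.
    static boolean endsWithLineBreak(Reader in) throws IOException {
        char[] buf = new char[1024];
        int last = -1;                      // last char seen; stays -1 for empty input
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            if (n > 0) {
                last = buf[n - 1];          // final char of the chunk just read
            }
        }
        return last == '\n' || last == '\r';
    }

    public static void main(String[] args) throws IOException {
        // With a real file: new BufferedReader(new FileReader("file.txt"))
        try (Reader in = new BufferedReader(new StringReader("line of text 1\nanother line 2\n"))) {
            System.out.println(endsWithLineBreak(in));
        }
    }
}
```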
You'll have to use read rather than readLine and handle end-of-line detection yourself. readLine considers \n, \r, and EOF all to be line terminators, and doesn't include the terminator in what it returns, so you can't differentiate on the basis of the returned string.
public ArrayList<String> readFile(String inputFilename) throws IOException {
BufferedReader br = new BufferedReader(new FileReader(inputFilename));
ArrayList<String> lines = new ArrayList<>();
String currentLine = "";
int currentCharacter = br.read();
int lastCharacter = -1;
// Loop through each character read.
while (currentCharacter != -1) {
// Skip carriage returns.
if (currentCharacter != '\r') {
// Add the currentLine at each line feed and then reset currentLine.
if (currentCharacter == '\n') {
lines.add(currentLine);
currentLine = "";
} else {
// Add each non-line-separating character to the currentLine.
currentLine += (char) currentCharacter;
}
}
// Keep track of the last read character and continue reading the next
// character.
lastCharacter = currentCharacter;
currentCharacter = br.read();
}
br.close();
// If the currentLine is not empty, add it to the end of the ArrayList.
if (!currentLine.isEmpty()) {
lines.add(currentLine);
}
// If the last read character was a line feed, add another String to the end
// of the ArrayList.
if (lastCharacter == '\n') {
lines.add("");
}
return lines;
}
I tried reading from a BufferedReader that received its input from a socket input stream.
Everything worked fine until the last line, where the readLine() would just simply hang because the browser wouldn't send a newline terminator on post data.
This is my solution, to be able to read until the end of the input stream.
public String getLine(BufferedReader in)
{
StringBuilder builder = new StringBuilder();
try {
while(in.ready()) {
char input = (char)in.read();
/**
* This method only matches on "\r\n" as a new line indicator.
* change as needed for your own line terminators
*/
if(input == '\r') {
/** If we can read more, read one more character
* If that's a newline, we break and return.
* if not, we add the carriage return and let the
* normal program flow handle the read character
*/
if(in.ready()) {
input = (char)in.read();
if(input == '\n') {
break;
}
else {
builder.append('\r');
}
}
}
builder.append(input);
}
}
catch(IOException ex) {
System.out.println(ex.getMessage());
}
return builder.toString();
}
You can use @hmjd's solution or any other readers that can read byte by byte.
If you want to stick with reading line by line, you can use this.
boolean EOF = (currentLine = bufferedReader.readLine()) == null;
while(!EOF){
// do things that will happen whether or not the next read hits EOF
EOF = (currentLine = bufferedReader.readLine()) == null;
if(!EOF){
// do things that will happen only when it is not EOF
}else{
// do things that will happen only when it is EOF
}
}
Why not use
if (line.length()==0) {
System.out.println("Found an empty line.");
}
Note: this will detect a blank line anywhere in the file, not just at EOF.

How to safely read a text file that might be binary?

We have some Java code that processes a user-provided file by looping through the file using BufferedReader.readLine() to read in each line.
The problem is that when the user uploads a file that has extremely long lines, like an arbitrary binary JPG or such, this can cause out-of-memory issues. Even the first readLine() may not return. We want to reject files with long lines before the OOM happens.
Is there a standard Java idiom to handle this, or do we just change to read() and write our own safe version of readLine()?
You will need to read the file character by character (or chunk by chunk) yourself (via some form of read()), and then form the lines into Strings when you encounter a newline character. This way you can throw an Exception (avoiding the OOM error) if some maximum number of characters is hit before a newline is encountered.
If you use a Reader instance it should not be too difficult to implement this code, just read from the Reader into a buffer (which you allocate to your maximum possible line length), and then convert the buffer to String when you encounter a newline (or throw an exception if you don't).
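A minimal sketch of that approach follows. The names BoundedLines and readLine(Reader, int) are hypothetical, and it throws a plain IOException for an over-long line where a dedicated exception type could be used instead. It assumes lines end with "\n" or "\r\n"; a '\r' directly before the '\n' is stripped but still counts toward the limit.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper; names and the choice of IOException are assumptions.
class BoundedLines {
    // Returns the next line without its terminator, or null at end of stream.
    // Throws IOException if a line grows beyond maxLength chars.
    static String readLine(Reader in, int maxLength) throws IOException {
        char[] buf = new char[maxLength];   // allocated once per call, never grows
        int len = 0;
        boolean sawAny = false;
        int c;
        while ((c = in.read()) != -1) {
            sawAny = true;
            if (c == '\n') {
                if (len > 0 && buf[len - 1] == '\r') {
                    len--;                  // drop the '\r' of a "\r\n" pair
                }
                return new String(buf, 0, len);
            }
            if (len == maxLength) {
                throw new IOException("line longer than " + maxLength + " chars");
            }
            buf[len++] = (char) c;
        }
        return sawAny ? new String(buf, 0, len) : null;
    }

    public static void main(String[] args) throws IOException {
        // Wrapping in a BufferedReader keeps the single-char reads cheap.
        try (Reader in = new BufferedReader(new StringReader("short\r\nalso short\n"))) {
            String line;
            while ((line = readLine(in, 80)) != null) {
                System.out.println(line);
            }
        }
    }
}
```

Memory use is bounded by maxLength regardless of the input, so a binary upload without newlines fails fast instead of exhausting the heap.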
There doesn't appear to be any way to set a line length limit for BufferedReader.readLine(), so it will accumulate the entire line before feeding it to your code, however long that line may be.
Therefore, you'll have to do the line-splitting part yourself, and give up once a line is too long.
You might use the following as a starting point:
class LineTooLongException extends Exception {}
class ShortLineReader implements AutoCloseable {
final Reader reader;
final char[] buf = new char[8192];
int nextIndex = 0;
int maxIndex = 0;
boolean eof;
public ShortLineReader(Reader reader) {
this.reader = reader;
}
public String readLine() throws IOException, LineTooLongException {
if (eof) {
return null;
}
for (;;) {
for (int i = nextIndex; i < maxIndex; i++) {
if (buf[i] == '\n') {
String result = new String(buf, nextIndex, i - nextIndex);
nextIndex = i + 1;
return result;
}
}
if (maxIndex - nextIndex > 6000) {
throw new LineTooLongException();
}
System.arraycopy(buf, nextIndex, buf, 0, maxIndex - nextIndex);
maxIndex -= nextIndex;
nextIndex = 0;
int c = reader.read(buf, maxIndex, buf.length - maxIndex);
if (c == -1) {
eof = true;
return new String(buf, nextIndex, maxIndex - nextIndex);
} else {
maxIndex += c;
}
}
}
@Override
public void close() throws Exception {
reader.close();
}
}
public class Test {
public static void main(String[] args) throws Exception {
File file = new File("D:\\t\\output.log");
// try (OutputStream fos = new BufferedOutputStream(new FileOutputStream(file))) {
// for (int i = 0; i < 10000000; i++) {
// fos.write(65);
// }
// }
try (ShortLineReader r = new ShortLineReader(new FileReader(file))) {
String s;
while ((s = r.readLine()) != null) {
System.out.println(s);
}
}
}
}
Note: This assumes unix-style line termination.
Use BufferedInputStream to read binary data rather than BufferedReader...
for example if it is an image file, using ImageIO and InputStream you can do it like this..
File file = new File("image.gif");
image = ImageIO.read(file);
InputStream is = new BufferedInputStream(new FileInputStream("image.gif"));
image = ImageIO.read(is);
hope it helps...
There doesn't appear to be a definite way but a few things you can do:
Check file headers. jMimeMagic seems to be a pretty good library for this purpose.
Check the type of characters the file contains. Essentially do statistical analysis on the first 'x' bytes of the file and use that to estimate the rest of the content.
Check for newlines '\n' or '\r' in the files; binary files usually won't contain newlines.
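The second suggestion, sampling the first bytes of the file, could be sketched roughly as follows. The class name, the 512-byte sample size and the 30% threshold are all assumptions chosen for illustration, not values from any standard. The heuristic treats any NUL byte as a sign of binary content and otherwise looks at the share of control characters other than tab, LF and CR.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical heuristic; sample size and threshold are illustrative assumptions.
class BinarySniffer {
    // Inspects the first 'length' bytes of 'sample'.
    static boolean looksBinary(byte[] sample, int length) {
        if (length == 0) {
            return false;
        }
        int suspicious = 0;
        for (int i = 0; i < length; i++) {
            int b = sample[i] & 0xFF;
            if (b == 0) {
                return true;                 // NUL bytes are very rare in text files
            }
            if (b < 0x20 && b != '\t' && b != '\n' && b != '\r') {
                suspicious++;                // control char that text rarely contains
            }
        }
        return suspicious * 100 / length > 30;
    }

    // Reads up to 512 bytes from the stream and applies the heuristic.
    static boolean looksBinary(InputStream in) throws IOException {
        byte[] sample = new byte[512];
        int n = in.read(sample);             // may return fewer bytes than requested
        return n > 0 && looksBinary(sample, n);
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "Hello\nJava\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksBinary(new ByteArrayInputStream(text)));
    }
}
```

A check like this only estimates; it can be combined with a line-length limit so that files that slip through are still rejected safely.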
Hope that helps.

How to find out which line separator BufferedReader#readLine() used to split the line?

I am reading a file via the BufferedReader
String filename = ...
br = new BufferedReader( new FileInputStream(filename));
while (true) {
String s = br.readLine();
if (s == null) break;
...
}
I need to know if the lines are separated by '\n' or '\r\n'.
Is there a way I can find out?
I don't want to open the FileInputStream so to scan it initially.
Ideally I would like to ask the BufferedReader since it must know.
I am happy to override the BufferedReader to hack it but I really don't want to open the filestream twice.
Thanks,
Note: the current line separator (returned by System.getProperty("line.separator") ) can not be used as the file could have been written by another app on another operating system.
To be in phase with the BufferedReader class, you may use the following method that handles \n, \r, \n\r and \r\n end line separators:
public static String retrieveLineSeparator(File file) throws IOException {
char current;
String lineSeparator = "";
FileInputStream fis = new FileInputStream(file);
try {
while (fis.available() > 0) {
current = (char) fis.read();
if ((current == '\n') || (current == '\r')) {
lineSeparator += current;
if (fis.available() > 0) {
char next = (char) fis.read();
if ((next != current)
&& ((next == '\r') || (next == '\n'))) {
lineSeparator += next;
}
}
return lineSeparator;
}
}
} finally {
if (fis!=null) {
fis.close();
}
}
return null;
}
After reading the java docs (I confess to being a pythonista), it seems that there isn't a clean way to determine the line-end encoding used in a specific file.
The best thing I can recommended is that you use BufferedReader.read() and iterate over every character in the file. Something like this:
String filename = ...
BufferedReader br = new BufferedReader(new FileReader(filename));
StringBuilder l = new StringBuilder();
int c;
while ((c = br.read()) != -1) {
if (c == '\n') {
// the line in l ended with \n: do stuff, then start a new line
l.setLength(0);
} else if (c == '\r') {
br.mark(1); // remember the position so one char can be un-read
if (br.read() == '\n') {
// do extra stuff since you know that you've got a \r\n
} else {
br.reset(); // lone \r: put the following char back
}
l.setLength(0);
} else {
l.append((char) c);
}
}
br.close();
BufferedReader.readLine() does not provide any means of determining what the line break was. If you need to know, you'll need to read characters in yourself and find line breaks yourself.
You may be interested in the internal LineBuffer class from Guava (as well as the public LineReader class it's used in). LineBuffer provides a callback method void handleLine(String line, String end) where end is the line break characters. You could probably base something to do what you want on that. An API might look something like public Line readLine() where Line is an object that contains both the line text and the line end.
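An API of the shape described above could be sketched like this. It is not Guava's LineBuffer or LineReader; the class TerminatedLines and its nested Line type are hypothetical names, and a PushbackReader is used so that the character after a lone '\r' can be put back.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical sketch of a readLine() that also reports the line ending.
class TerminatedLines {
    static final class Line {
        final String text;   // line content without the terminator
        final String end;    // "\n", "\r", "\r\n", or "" if the input ended first

        Line(String text, String end) {
            this.text = text;
            this.end = end;
        }
    }

    // Returns the next line with its terminator, or null at end of stream.
    static Line readLine(PushbackReader in) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\n') {
                return new Line(text.toString(), "\n");
            }
            if (c == '\r') {
                int next = in.read();
                if (next == '\n') {
                    return new Line(text.toString(), "\r\n");
                }
                if (next != -1) {
                    in.unread(next);      // not part of the terminator: put it back
                }
                return new Line(text.toString(), "\r");
            }
            text.append((char) c);
        }
        return text.length() > 0 ? new Line(text.toString(), "") : null;
    }

    public static void main(String[] args) throws IOException {
        // With a real file: new PushbackReader(new BufferedReader(new FileReader(filename)))
        try (PushbackReader in = new PushbackReader(new StringReader("a\r\nb\nc"))) {
            Line line;
            while ((line = readLine(in)) != null) {
                System.out.println(line.text + " ends with " + line.end.length() + " char(s)");
            }
        }
    }
}
```

The file is opened only once, which was the constraint in the question.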
BufferedReader does not accept FileInputStreams
No, you cannot find out the line terminator character that was used in the file being read by BufferedReader. That information is lost while reading the file.
Unfortunately all answers below are incorrect.
Edit: And yes you can always extend BufferedReader to include the additional functionality you desire.
The answer would be You can't find out what was the line ending.
I am looking for what can cause line endings in the same function. After looking at the BufferedReader source code, I can say that BufferedReader.readLine ends a line on '\r' or '\n' and skips a leftover '\n' that directly follows a '\r'. This is hardcoded and does not depend on any settings.
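This behaviour is easy to observe with a small sketch such as the following (the class and method names are made up for the example). All three terminators split the input, and none of them appears in the returned strings, which is why the information cannot be recovered afterwards.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are for illustration only.
class ReadLineTerminators {
    // Splits a string into lines exactly the way BufferedReader.readLine() does.
    static List<String> split(String s) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(s))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // \r, \n and \r\n each end a line; prints [a, b, c, d]
        System.out.println(split("a\rb\nc\r\nd"));
    }
}
```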
If you happen to be reading this file into a Swing text component then you can just use the JTextComponent.read(...) method to load the file into the Document. Then you can use:
textComponent.getDocument().getProperty( DefaultEditorKit.EndOfLineStringProperty );
to get actual EOL string that was used in the file.
Maybe you could use Scanner instead.
You can pass regular expressions to Scanner#useDelimiter() to set custom delimiter.
String regex="(\r)?\n";
String filename=....;
Scanner scan = new Scanner(new FileInputStream(filename));
scan.useDelimiter(Pattern.compile(regex));
while (scan.hasNext()) {
String str= scan.next();
// todo
}
You could use this code below to convert BufferedReader to Scanner
new Scanner(bufferedReader);
Not sure if useful, but sometimes I need to find out the line delimiter after I've read the file already far-down the road.
In this case I use this code:
/**
* <h1> Identify which line delimiter is used in a string </h1>
*
* This is useful when processing files that were created on different operating systems.
*
* @param str - the string with the mystery line delimiter.
* @return the line delimiter for windows, {@code \r\n}, <br>
* unix/linux {@code \n} or legacy mac {@code \r} <br>
* if none can be identified, it falls back to unix {@code \n}
*/
public static String identifyLineDelimiter(String str) {
if (str.matches("(?s).*(\\r\\n).*")) { //Windows //$NON-NLS-1$
return "\r\n"; //$NON-NLS-1$
} else if (str.matches("(?s).*(\\n).*")) { //Unix/Linux //$NON-NLS-1$
return "\n"; //$NON-NLS-1$
} else if (str.matches("(?s).*(\\r).*")) { //Legacy mac os 9. Newer OS X use \n //$NON-NLS-1$
return "\r"; //$NON-NLS-1$
} else {
return "\n"; //fallback onto '\n' if nothing matches. //$NON-NLS-1$
}
}
If you are using groovy, you can simply do:
def lineSeparator = new File('path/to/file').text.contains('\r\n') ? '\r\n' : '\n'
