BufferedReader messed up by different line separators - java

I have a BufferedReader streaming a file. There are two cases right now:
It is streaming a file generated on one PC; let's call it File1.
It is streaming a file generated on another computer; let's call it File2.
I'm assuming my problem is caused by the EOLs.
BufferedReader reads both files, but for File2 it reads an extra empty line after every line.
Also, when I compare a line using line.equalsIgnoreCase("abc"), it does not return true even though the line is "abc".
Use this code together with the two files provided in the two links to replicate the problem:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class JavaApplication {
    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) throws IOException {
        File file = new File("C:/Users/User/Downloads/html (2).htm");
        BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"));
        String line = "";
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    }
}
File1,
File2
Note how the second file prints an empty line after each line...
I've been searching and trying and searching and trying, and couldn't come up with a solution.
Any ideas how to fix that? (Especially the compare thing?)

Works for me.
public class CRTest
{
static StringReader test = new StringReader( "Line 1\rLine 2\rLine 3\r" );
public static void main(String[] args) throws IOException {
BufferedReader buf = new BufferedReader( test );
for( String line = null; (line = buf.readLine()) != null; )
System.out.println( line );
}
}
Prints:
run:
Line 1
Line 2
Line 3
BUILD SUCCESSFUL (total time: 1 second)
As Joop said, I think you've mixed up which file isn't working. Please use the above skeleton to create an MCVE and show us exactly what file input isn't working for you.
Since you appear to have a file with reversed \r\n lines, here's my first attempt at a fix. Please test it, I haven't tried it yet. You need to wrap your InputStreamReader with this class, then wrap the BufferedReader on the outside like normal.
class CRFix extends Reader
{
private final Reader reader;
private boolean readNL = false;
public CRFix( Reader reader ) {
this.reader = reader;
}
@Override
public int read( char[] cbuf, int off, int len )
throws IOException
{
for( int i = off; i < off+len; i++ ) {
int c = reader.read();
// Drop a '\r' that directly follows a '\n' (a reversed "\n\r" terminator).
if( c == '\r' && readNL )
c = reader.read();
if( c == -1 )
// End of stream: report the number of chars stored, or -1 if none.
return ( i == off ) ? -1 : i-off;
readNL = ( c == '\n' );
cbuf[i] = (char)c;
}
return len;
}
@Override
public void close()
throws IOException
{
reader.close();
}
}

Joop was right. After some more research, it seems that even though both files specify a UTF-16 encoding in their header, one was encoded in UTF-16 and the other (File1) in UTF-8. This led to the "double line effect".
Thanks for the effort that was put in answering this question.
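The diagnosis above can be reproduced without the original files. This is a minimal sketch, assuming a UTF-16LE file with Windows line endings; the class name EncodingMismatchDemo and the sample text are made up for illustration. Decoding UTF-16 bytes as UTF-8 turns every second byte into a NUL character, so '\r' and '\n' are no longer adjacent and readLine() treats them as two separate line terminators.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class EncodingMismatchDemo {
    // Decodes the bytes with the given charset and splits them into lines
    // the same way BufferedReader.readLine() does.
    static List<String> readLines(byte[] data, Charset charset) {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset));
        return reader.lines().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical file content: two lines, stored as UTF-16LE.
        byte[] data = "abc\r\ndef\r\n".getBytes(StandardCharsets.UTF_16LE);

        List<String> right = readLines(data, StandardCharsets.UTF_16LE);
        List<String> wrong = readLines(data, StandardCharsets.UTF_8);

        System.out.println(right.size() + " lines with the matching charset");
        System.out.println(wrong.size() + " lines with the wrong charset");
        // With the wrong charset the comparison fails, because NUL
        // characters are mixed into the decoded text.
        System.out.println(right.get(0).equalsIgnoreCase("abc"));
        System.out.println(wrong.get(0).equalsIgnoreCase("abc"));
    }
}
```

The extra lines look empty when printed because they contain only NUL characters.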

Related

Saving huge file in a string JAVA

I'm trying to read a FASTA file into a string in Java.
My code works fine with small files, but when I choose a real FASTA file
containing 5 million characters, which I need as a single string, the program gets stuck: I see no output and the program shows only a black screen.
public static String ReadFastaFile(File file) throws IOException{
String seq="";
try(Scanner scanner = new Scanner(new File(file.getPath()))) {
while ( scanner.hasNextLine() ) {
String line = scanner.nextLine();
seq+=line;
// process line here.
}
}
return seq;
}
Try to use a StringBuilder to process big loads of text data:
public static String ReadFastaFile( File file ) throws IOException {
StringBuilder seq = new StringBuilder();
try( Scanner scanner = new Scanner( file ) ) {
while ( scanner.hasNextLine() ) {
String line = scanner.nextLine();
seq.append( line );
// process line here.
}
}
return seq.toString();
}
I would try to use BufferedReader to read the file, something like this:
public static String readFastaFile(File file) throws IOException {
    StringBuilder seq = new StringBuilder();
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = br.readLine()) != null) {
            seq.append(line); // process line here.
        }
    }
    return seq.toString();
}
This also concatenates with a StringBuilder, as davidbuzatto suggested.

How can I read lines from an input file and then store the most recently read lines in an array?

I am trying to create a program that takes an inputted text file and reads the lines one by one. It then needs to store the most recently read lines (the number of lines depends on the parameter lines) in an array and then I need to print the lines using PrintWriter.
I started the first part but I'm not sure if I have the right idea. If anyone can help me on the second part as well that would be very appreciated!
public void RecentLines(Reader in, Writer out, int lines) throws IOException {
BufferedReader r3ader = new BufferedReader(in);
String str;
while((str = r3ader.readLine()) != null){
String[] arr = str.split(" ");
for( int i =0; i < lines; i++){
arr[i] = r3ader.readLine();
}
}
EDIT
the full question is this:
Create a program which reads lines from IN, one line at the time until the end. Your method must maintain an internal buffer that stores the most recently read lines (this might be best done using an array). Once the method reaches the end of the file, it should print the lines stored in the internal buffer into out, probably best done by creating a PrintWriter to decorate this Writer. (Except for your debugging purposes during the development stage, this method should not print anything to System.out.)
Try this one:
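public void RecentLines(Reader in, Writer out, int lines) throws IOException {
    BufferedReader r3ader = new BufferedReader(in);
    PrintWriter writer = new PrintWriter(out);
    // Circular buffer holding the last `lines` lines (assumes lines > 0).
    String[] lineArray = new String[lines];
    String str;
    int i = 0; // total number of lines read so far
    while ((str = r3ader.readLine()) != null) {
        lineArray[i % lines] = str; // overwrite the oldest slot
        i++;
    }
    // Print the buffered lines, oldest first.
    for (int j = Math.max(0, i - lines); j < i; j++) {
        writer.println(lineArray[j % lines]);
    }
    writer.flush();
}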
Sounds like a task for a data structure. A queue seems to be the best fit for this task.
public void RecentLines(Reader in, Writer out, int lines) throws IOException {
    BufferedReader r3ader = new BufferedReader(in);
    PrintWriter wout = new PrintWriter(out);
    String str;
    Queue<String> content = new LinkedList<String>();
    while ((str = r3ader.readLine()) != null) {
        content.add(str);
        if (content.size() > lines) {
            content.remove(); // drop the oldest line
        }
    }
    // Write the retained lines one per line, oldest first.
    for (String line : content) {
        wout.println(line);
    }
    wout.flush();
}

Reading a specific set of lines in a file [duplicate]

In Java, is there any method to read a particular line from a file? For example, read line 32 or any other line number.
For small files:
String line32 = Files.readAllLines(Paths.get("file.txt")).get(31); // index 31 is line 32
For large files:
try (Stream<String> lines = Files.lines(Paths.get("file.txt"))) {
line32 = lines.skip(31).findFirst().get();
}
Unless you have previous knowledge about the lines in the file, there's no way to directly access the 32nd line without reading the 31 previous lines.
That's true for all languages and all modern file systems.
So effectively you'll simply read lines until you've found the 32nd one.
Not that I know of, but you could loop through the first 31 lines, discarding them, using the readLine() method of BufferedReader:
FileInputStream fs= new FileInputStream("someFile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fs));
for(int i = 0; i < 31; ++i)
br.readLine();
String lineIWant = br.readLine();
Joachim is right, of course. An alternative to Chris's implementation (for small files only, because it loads the entire file) is to use commons-io from Apache. You might not want to introduce a new dependency just for this, but if you find the library useful for other things too, it could make sense.
For example:
String line32 = (String) FileUtils.readLines(file).get(31);
http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html#readLines(java.io.File, java.lang.String)
You may try indexed-file-reader (Apache License 2.0). The class IndexedFileReader has a method called readLines(int from, int to) which returns a SortedMap whose key is the line number and the value is the line that was read.
Example:
File file = new File("src/test/resources/file.txt");
reader = new IndexedFileReader(file);
lines = reader.readLines(6, 10);
assertNotNull("Null result.", lines);
assertEquals("Incorrect length.", 5, lines.size());
assertTrue("Incorrect value.", lines.get(6).startsWith("[6]"));
assertTrue("Incorrect value.", lines.get(7).startsWith("[7]"));
assertTrue("Incorrect value.", lines.get(8).startsWith("[8]"));
assertTrue("Incorrect value.", lines.get(9).startsWith("[9]"));
assertTrue("Incorrect value.", lines.get(10).startsWith("[10]"));
The above example reads a text file composed of 50 lines in the following format:
[1] The quick brown fox jumped over the lazy dog ODD
[2] The quick brown fox jumped over the lazy dog EVEN
Disclaimer: I wrote this library.
As other answers have said, it is not possible to jump to an exact line without knowing its offset (pointer) beforehand. I've achieved this by creating a temporary index file that stores the offset of every line. If the file is small enough, you can just keep the offsets in memory without needing a separate file.
The offsets can be calculated using RandomAccessFile:
RandomAccessFile raf = new RandomAccessFile("myFile.txt", "r");
// "r" opens the file in read-only mode
ArrayList<Long> arrayList = new ArrayList<Long>();
arrayList.add(0L); // line 1 starts at offset 0
while (raf.readLine() != null) {
    // after reading a line, the pointer sits at the start of the next line
    arrayList.add(raf.getFilePointer());
}
// Print line 32: seek to the offset where line 32 starts
raf.seek(arrayList.get(31));
System.out.println(raf.readLine());
raf.close();
Also see the Java docs on RandomAccessFile for more information.
Complexity: This is O(n), as it reads the entire file once. Be aware of the memory requirements: if the list of offsets is too big to keep in memory, store the offsets in a temporary file instead of the ArrayList shown above.
Note: If all you want is line 32, you just have to call readLine() (also available through other classes) 32 times. The above approach is useful if you want to get a specific line (by line number, of course) multiple times.
Another way.
try (BufferedReader reader = Files.newBufferedReader(
Paths.get("file.txt"), StandardCharsets.UTF_8)) {
List<String> line = reader.lines()
.skip(31)
.limit(1)
.collect(Collectors.toList());
line.stream().forEach(System.out::println);
}
No, unless in that file format the line lengths are pre-determined (e.g. all lines with a fixed length), you'll have to iterate line by line to count them.
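The fixed-length exception mentioned above can be sketched as follows. This assumes every line occupies exactly the same number of bytes, terminator included; the class name FixedWidthLines and the parameter recordLength are hypothetical names chosen for this illustration. Under that assumption, the offset of a line is simple arithmetic and no earlier line has to be read.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FixedWidthLines {
    // Reads the 1-based line `lineNumber` from a file whose lines all occupy
    // exactly `recordLength` bytes, terminator included (an assumption).
    static String readLine(Path file, int recordLength, int lineNumber) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long offset = (long) (lineNumber - 1) * recordLength;
            if (offset >= raf.length()) {
                return null; // the file has fewer lines than requested
            }
            raf.seek(offset); // jump straight to the start of the line
            return raf.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fixed", ".txt");
        // Three records of 4 bytes each: 3 ASCII characters plus '\n'.
        Files.write(file, "aaa\nbbb\nccc\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readLine(file, 4, 3));
        Files.delete(file);
    }
}
```

If the lines vary in length, this arithmetic no longer holds and the file has to be scanned line by line, as the other answers describe.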
In Java 8,
For small files:
String line = Files.readAllLines(Paths.get("file.txt")).get(n);
For large files:
String line;
try (Stream<String> lines = Files.lines(Paths.get("file.txt"))) {
line = lines.skip(n).findFirst().get();
}
In Java 7
String line;
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"))) {
for (int i = 0; i < n; i++)
br.readLine();
line = br.readLine();
}
Source: Reading nth line from file
If you are talking about a text file, then there is really no way to do this without reading all the lines that precede it - After all, lines are determined by the presence of a newline, so it has to be read.
Use a stream that supports readline, and just read the first X-1 lines and dump the results, then process the next one.
It works for me:
I have adapted the answer to
Reading a simple text file
but instead of returning a String, I return a LinkedList of Strings. Then I can select the line that I want.
public static LinkedList<String> readFromAssets(Context context, String filename) throws IOException {
BufferedReader reader = new BufferedReader(new InputStreamReader(context.getAssets().open(filename)));
LinkedList<String>linkedList = new LinkedList<>();
// do reading, usually loop until end of file reading
StringBuilder sb = new StringBuilder();
String mLine = reader.readLine();
while (mLine != null) {
linkedList.add(mLine);
sb.append(mLine); // process line
mLine = reader.readLine();
}
reader.close();
return linkedList;
}
Use this code:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class FileWork
{
public static void main(String[] args) throws IOException {
String line = Files.readAllLines(Paths.get("D:/abc.txt")).get(1);
System.out.println(line);
}
}
You can use LineNumberReader instead of BufferedReader. Look through its API: it has setLineNumber and getLineNumber methods.
You can also take a look at LineNumberReader, a subclass of BufferedReader. Along with the readLine method, it has setter/getter methods to access the line number. This is very useful for keeping track of the number of lines read while reading data from a file.
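A minimal sketch of the LineNumberReader approach, with a hypothetical class name LineNumberDemo and helper lineAt that are not part of the answers above. It still reads every preceding line; the reader only does the counting for you.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;

public class LineNumberDemo {
    // Returns the line with the given 1-based number, or null if the input is shorter.
    static String lineAt(Reader source, int wanted) throws IOException {
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // getLineNumber() reports how many lines have been read so far.
                if (reader.getLineNumber() == wanted) {
                    return line;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(lineAt(new StringReader("a\nb\nc\n"), 2)); // prints "b"
    }
}
```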
public String readLine(int line) {
    String returnStr = "ERROR";
    // textFile is a field holding the path of the file to read
    try (BufferedReader reader = new BufferedReader(new FileReader(textFile))) {
        // skip the lines before the requested one
        for (int i = 0; i < line - 1; i++) {
            reader.readLine();
        }
        returnStr = reader.readLine();
    } catch (IOException e) {
        // on failure, "ERROR" is returned
    }
    return returnStr;
}
You can use the skip() method to skip lines from the beginning. The example below counts the lines first and then skips all but the last lineNum of them.
public static void readFile(String filePath, long lineNum) {
List<String> list = new ArrayList<>();
long totalLines, startLine = 0;
try (Stream<String> lines = Files.lines(Paths.get(filePath))) {
totalLines = Files.lines(Paths.get(filePath)).count();
startLine = totalLines - lineNum;
// Stream<String> line32 = lines.skip(((startLine)+1));
list = lines.skip(startLine).collect(Collectors.toList());
// lines.forEach(list::add);
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
list.forEach(System.out::println);
}
Easy way: reading a line by its line number.
Let's say line numbers start at 1 and continue until readLine() returns null.
public class TextFileAssignmentOct {
private void readData(int rowNum, BufferedReader br) throws IOException {
int n=1; //Line number starts from 1
String row;
while((row=br.readLine()) != null) { // Reads every line
if (n == rowNum) { // When Line number matches with which you want to read
System.out.println(row);
}
n++; //This increments Line number
}
}
public static void main(String[] args) throws IOException {
File f = new File("../JavaPractice/FileRead.txt");
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
TextFileAssignmentOct txf = new TextFileAssignmentOct();
txf.readData(4, br); //Read a Specific Line using Line number and Passing buffered reader
}
}
For a text file, you can use an integer counter in a loop to keep track of the line number. Don't forget to import the classes used in this example.
File myObj = new File("C:\\Users\\LENOVO\\Desktop\\test.txt");//path of the file
FileReader fr = new FileReader(myObj);
BufferedReader bf = new BufferedReader(fr); //BufferedReader of the FileReader fr
String line = bf.readLine();
int lineNumber = 0;
while (line != null) {
lineNumber = lineNumber + 1;
if(lineNumber == 7)
{
//show line
System.out.println("line: " + lineNumber + " has :" + line);
break;
}
// read the next line
line = bf.readLine();
}
They are all wrong; I just wrote this in about 10 seconds.
With this, I can just call object.getQuestion(lineNumber) in the main method to return whatever line I want.
public class Questions {
File file = new File("Question2Files/triviagame1.txt");
public Questions() {
}
public String getQuestion(int numLine) throws IOException {
// try-with-resources closes the reader when the method returns
try (BufferedReader br = new BufferedReader(new FileReader(file))) {
String line = "";
for(int i = 0; i < numLine; i++) {
line = br.readLine();
}
return line;
}
}
}

Java - reading file as binary with readLine

I have Ruby code that reads a file line by line. For each line, it checks whether to append the line to the current block, or to handle the finished block and continue reading and parsing the file.
Here it is:
File.open(ARGV[0], 'rb') do |f|
fl = false
text = ''
f.readlines.each do |line|
if (line =~ /^end_block/)
fl = false
# parse text variable
end
text += line if fl == true
if (line =~ /^start_block/)
fl = true
end
end
end
I.e., I need the file to be opened for reading as binary, and I still need a readLine method.
So, the question is: how can I do exactly the same with Groovy/Java?
You can use java.io.DataInputStream which provides both a readLine() method and readFully(byte[]) and read(byte[]) methods.
Warning: The JavaDoc for readLine says, it is deprecated and that the encoding might be inappropriate (read details in JavaDoc).
So think twice about your real requirements and if this is a suitable trade-off in your case.
If you have line formatted text, that's not binary IMHO. That's because true binary can have any byte, even new line and carriage return which would create false breaks in the code.
What you could mean is you have text where you want to read each byte without encoding or possibly mangling them. This is the same as using ISO-8859-1.
You can try
BufferedReader br = new BufferedReader(new InputStreamReader(
new FileInputStream(filename), "ISO-8859-1"));
StringBuilder sb = new StringBuilder();
String line;
boolean include = false;
while((line = br.readLine()) != null) {
if (line.startsWith("end_block"))
include = false;
else if (line.startsWith("start_block"))
include = true;
else if (include)
sb.append(line).append('\n'); // new lines back in.
}
br.close();
String text = sb.toString();
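The claim that ISO-8859-1 passes every byte through unchanged can be checked with a round trip. This is a small sketch; the class name Latin1RoundTrip and its helper are made up for this illustration. It decodes all 256 possible byte values and encodes them back.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Decodes the bytes as ISO-8859-1, encodes them back,
    // and reports whether the result equals the input.
    static boolean survivesRoundTrip(byte[] data) {
        String decoded = new String(data, StandardCharsets.ISO_8859_1);
        byte[] encoded = decoded.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(data, encoded);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i; // every possible byte value
        }
        System.out.println(survivesRoundTrip(all)); // prints "true"
    }
}
```

Note that readLine() still strips the line terminators, so the exact terminator bytes ('\n', '\r' or "\r\n") are not recoverable from the returned strings.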
Maybe something like this:
public final class Read
{
private static final Pattern START_BLOCK = Pattern.compile("whatever");
private static final Pattern END_BLOCK = Pattern.compile("whatever");
public static void main(final String... args)
throws IOException
{
if (args.length < 1) {
System.err.println("Not enough arguments");
System.exit(1);
}
final FileReader r = new FileReader(args[0]);
final BufferedReader reader = new BufferedReader(r);
final StringBuilder sb = new StringBuilder();
boolean inBlock = false;
String line;
while ((line = reader.readLine()) != null) {
if (END_BLOCK.matcher(line).matches()) {
inBlock = false;
continue;
}
if (inBlock)
sb.append(line);
if (START_BLOCK.matcher(line).matches())
inBlock = true;
}
System.out.println(sb.toString());
System.exit(0);
}
}

How to see if a Reader is at EOF?

My code needs to read in all of a file. Currently I'm using the following code:
BufferedReader r = new BufferedReader(new FileReader(myFile));
while (r.ready()) {
String s = r.readLine();
// do something with s
}
r.close();
If the file is currently empty, though, then s is null, which is no good. Is there any Reader that has an atEOF() method or equivalent?
The docs say:
public int read() throws IOException
Returns:
The character read, as an integer in the range 0 to 65535 (0x00-0xffff), or -1 if the end of the stream has been reached.
So in the case of a Reader one should check against EOF like
// Reader r = ...;
int c;
while (-1 != (c = r.read())) {
// use c
}
In the case of a BufferedReader and readLine(), it may be
String s;
while (null != (s=br.readLine())) {
// use s
}
because readLine() returns null on EOF.
Use this function:
public static boolean eof(Reader r) throws IOException {
r.mark(1);
int i = r.read();
r.reset();
return i < 0;
}
A standard pattern for what you are trying to do is:
BufferedReader r = new BufferedReader(new FileReader(myFile));
String s = r.readLine();
while (s != null) {
// do something with s
s = r.readLine();
}
r.close();
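The ready() method will not work for this: it only reports whether a read is guaranteed not to block, not whether the end of the stream has been reached. You must read from the stream and check the return value to see if you are at EOF.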
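The read-and-check pattern handles an empty input without any special case. A minimal sketch, with a hypothetical class name ReadToEof and helper countLines, using try-with-resources so the reader is closed even if an exception is thrown:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadToEof {
    // Counts lines by reading until readLine() returns null, which signals EOF.
    static int countLines(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            int count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Empty input: the loop body never runs, so no null line is processed.
        System.out.println(countLines(new StringReader("")));       // prints 0
        System.out.println(countLines(new StringReader("a\nb\n"))); // prints 2
    }
}
```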
