Buffering from a file - java

I need to read from a file that contains 9000 words. What is the best way to read from this file, and what is the difference between BufferedReader and a regular Scanner? Or is there another good class to use?
Thanks

If you are doing "efficient" reading, there is no benefit to buffering. If, on the other hand, you are doing "inefficient" reading, then a buffer will improve performance.
What do I mean by "efficient" reading? Efficient reading means pulling data off the InputStream / Reader in large chunks, as fast as it becomes available; for example, loading a whole text file to display in an IDE or other editor. "Inefficient" reading is when you read information off the stream piecemeal: Scanner.nextDouble(), for instance, reads only a few characters (until the double's digits end) and then converts the number from text to binary. In this case a buffer improves performance, because the next call to nextDouble() reads from the buffer (memory) instead of from disk.
If you have any questions on this, please ask
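For a file of about 9000 words either style is fast enough, but here is a minimal sketch of the buffered route. It assumes the file is UTF-8 text; WordReader and readWords are hypothetical names made up for this illustration, not library classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class WordReader {
    // Reads all whitespace-separated words from a text file.
    // Assumption: the file is UTF-8 encoded.
    static List<String> readWords(Path file) throws IOException {
        List<String> words = new ArrayList<>();
        // Files.newBufferedReader returns a BufferedReader, so readLine()
        // is served from an in-memory buffer instead of many small disk reads.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) { // a blank line splits into one empty token
                        words.add(word);
                    }
                }
            }
        }
        return words;
    }
}
```

You would call it as WordReader.readWords(Paths.get("input.txt")); the try-with-resources block closes the reader even if an exception is thrown.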

Open the file using an input stream. Then read its content to a string using this code:
public static void main(String args[]) throws IOException {
FileInputStream in = new FileInputStream("input.txt");
String text = inputStreamToString(in);
in.close();
}
// Reads an InputStream and converts it to a String.
public static String inputStreamToString(InputStream stream) throws IOException {
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
int length;
while((length = stream.read(buffer)) != -1)
byteArrayOutputStream.write(buffer,0,length);
return byteArrayOutputStream.toString("UTF-8");
}
Check this answer for comparisons between buffered readers:
Read/convert an InputStream to a String
I normally use Scanners when I want to read a file line by line, or based on a delimiter. For example:
try {
File file = new File("file.txt");
Scanner fileScanner = new Scanner(file);
while (fileScanner.hasNextLine()) {
String line = fileScanner.nextLine();
System.out.println(line);
}
fileScanner.close();
} catch (Exception ex) {
ex.printStackTrace();
}
To scan based on a delimiter, you can use something similar to this:
fileScanner.useDelimiter("\\s"); // \s matches any whitespace
while(fileScanner.hasNext()){
//do something with scanned data
String word = fileScanner.next();
//Double num = fileScanner.nextDouble();
}

Related

BufferedReader vs Scanner, and FileInputStream vs FileReader?

Can someone explain to me why can I use FileInputStream or FileReader for a BufferedReader? What's the difference? And at the same time what is the advantage of a Scanner over a BufferedReader? I was reading that it helps by tokenizing, but what does that mean?
try {
//Simple reading of bytes
FileInputStream fileInputStream = new FileInputStream("path to file");
byte[] arr = new byte[1024];
int actualBytesRead = fileInputStream.read(arr, 0, arr.length);
//Can read characters and lines now
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(fileInputStream));
String lineRead = bufferedReader.readLine();
char [] charArrr = new char[1024];
int actulCharsRead = bufferedReader.read(charArrr, 0, charArrr.length);
//File reader allows reading of characters from a file
FileReader fileReader = new FileReader("path to file");
actulCharsRead = fileReader.read(charArrr, 0, charArrr.length);
//It is a good idea to wrap a BufferedReader around a FileReader
BufferedReader betterFileReader = new BufferedReader(new FileReader("path to file"));
lineRead = betterFileReader.readLine();
actulCharsRead = betterFileReader.read(charArrr, 0, charArrr.length);
//allows reading int, long, short, byte, line etc. Scanner tends to be very slow
//note: new Scanner("path to file") would scan the String itself, so pass a File
Scanner scanner = new Scanner(new File("path to file"));
//can also give inputStream as source
scanner = new Scanner(System.in);
long valueRead = scanner.nextLong();
//might wanna check out javadoc for more info
} catch (IOException e) {
e.printStackTrace();
}
Dexter's answer is already useful, but some extra explanation might still help:
In general:
An InputStream only provides access to byte data from a source.
A Reader can be wrapped around a stream and adds proper text encoding, so you can now read chars.
A BufferedReader can be wrapped around a Reader to buffer operations, so instead of fetching one character per call from the source, it reads a bunch at once, thereby reducing system calls and improving performance in most cases.
For files:
A FileInputStream is the most basic way to read data from files.
If you do not want to handle text encoding on your own, you can wrap it in an InputStreamReader, which can in turn be wrapped in a BufferedReader.
Alternatively, you can use a FileReader, which basically does the same thing as FileInputStream + InputStreamReader (using the default charset).
Now if you do not want to just read arbitrary text, but specific data types (int, long, double,...) or regular expressions, Scanner is quite useful. But as mentioned, it will add some overhead for building those expressions, so only use it when needed.
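The layering described above can be sketched as follows. This is only an illustration: LayeredRead and firstLine are hypothetical names, and UTF-8 is an assumed encoding that you would swap for whatever your file actually uses.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class LayeredRead {
    // Returns the first line of the file, or null if the file is empty.
    static String firstLine(String path) throws IOException {
        try (InputStream bytes = new FileInputStream(path);                       // raw bytes
             Reader chars = new InputStreamReader(bytes, StandardCharsets.UTF_8); // bytes -> chars
             BufferedReader buffered = new BufferedReader(chars)) {               // buffering + readLine()
            return buffered.readLine();
        }
    }
}
```

Each wrapper adds one capability, and closing the outermost one (here done by try-with-resources) also closes the streams underneath it.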
Introduced in Java 8 is Files.lines. This supports sufficient simple file manipulation to relieve at least some Perl envy :-)
Files.lines(Paths.get("input.txt"))
.filter(line -> line.startsWith("ERROR:"))
.map(String::toUpperCase).forEach(System.out::println);

How to remove first line of a text file in java [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Replace first line of a text file in Java
Java - Find a line in a file and remove
I am trying to find a way to remove the first line of text in a text file using Java. I would like to use a Scanner to do it. Is there a good way to do it without the need for a temp file?
Thanks.
If your file is huge, you can use the following method, which removes the first line in place, without using a temp file or loading all the content into memory.
public static void removeFirstLine(String fileName) throws IOException {
RandomAccessFile raf = new RandomAccessFile(fileName, "rw");
//Initial write position
long writePosition = raf.getFilePointer();
raf.readLine();
// Shift the next lines upwards.
long readPosition = raf.getFilePointer();
byte[] buff = new byte[1024];
int n;
while (-1 != (n = raf.read(buff))) {
raf.seek(writePosition);
raf.write(buff, 0, n);
readPosition += n;
writePosition += n;
raf.seek(readPosition);
}
raf.setLength(writePosition);
raf.close();
}
Note that if your program is terminated in the middle of the above loop, you can end up with duplicated lines or a corrupted file.
Scanner fileScanner = new Scanner(myFile);
fileScanner.nextLine();
This will return the first line of text from the file and discard it because you don't store it anywhere.
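To overwrite your existing file, write the remaining lines out again. Note that opening a FileWriter on a file truncates it immediately, so if the target is the same file the Scanner is still reading, first write to a different path (and then replace the original) or collect the remaining lines in memory:
FileWriter fileStream = new FileWriter("my/path/for/file.txt");
BufferedWriter out = new BufferedWriter(fileStream);
while(fileScanner.hasNextLine()) {
// nextLine() returns the line without its terminator, so add one back
out.write(fileScanner.nextLine());
out.newLine();
}
out.close();
fileScanner.close();
Note that you will have to catch and handle some IOExceptions this way. Because nextLine() strips the line terminator (an empty line comes back as ""), the newLine() call after every write is what keeps the line breaks present in your text file.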
Without a temp file you must keep everything in main memory. The rest is straightforward: loop over the lines (ignoring the first) and store them in a collection. Then write the lines back to disk:
File path = new File("/path/to/file.txt");
Scanner scanner = new Scanner(path);
ArrayList<String> coll = new ArrayList<String>();
scanner.nextLine();
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
coll.add(line);
}
scanner.close();
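FileWriter writer = new FileWriter(path);
for (String line : coll) {
// nextLine() stripped the line terminators, so add them back
writer.write(line);
writer.write(System.lineSeparator());
}
writer.close();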
If the file is not too big, you can read it into a byte array, find the first newline character and write the rest of the array back to the file starting from position zero. Alternatively, you may use a memory-mapped file to do so.
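A minimal sketch of the byte-array idea, assuming the file fits comfortably in memory and that lines end in \n or \r\n (a file using lone \r terminators would need extra handling). FirstLineRemover is a hypothetical name used only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class FirstLineRemover {
    // Drops everything up to and including the first '\n'.
    // If there is no '\n', the file holds a single line and ends up empty.
    static void removeFirstLine(Path file) throws IOException {
        byte[] all = Files.readAllBytes(file);
        int i = 0;
        while (i < all.length && all[i] != '\n') {
            i++;
        }
        int start = Math.min(i + 1, all.length);
        // Files.write truncates the existing file before writing by default.
        Files.write(file, Arrays.copyOfRange(all, start, all.length));
    }
}
```

Like the other in-place approaches, this is not crash-safe: if the program dies during the final write, the file may be left truncated.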

How can I read a .txt file into a single Java string while maintaining line breaks?

Virtually every code example out there reads a TXT file line-by-line and stores it in a String array. I do not want line-by-line processing because I think it's an unnecessary waste of resources for my requirements: All I want to do is quickly and efficiently dump the .txt contents into a single String. The method below does the job, however with one drawback:
private static String readFileAsString(String filePath) throws java.io.IOException{
byte[] buffer = new byte[(int) new File(filePath).length()];
BufferedInputStream f = null;
try {
f = new BufferedInputStream(new FileInputStream(filePath));
f.read(buffer);
if (f != null) try { f.close(); } catch (IOException ignored) { }
} catch (IOException ignored) { System.out.println("File not found or invalid path.");}
return new String(buffer);
}
... the drawback is that the line breaks are converted into long spaces e.g. " ".
I want the line breaks to be converted from \n or \r to <br> (HTML tag) instead.
Thank you in advance.
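What about using a Scanner and adding the line breaks yourself:
StringBuilder buf = new StringBuilder();
// pass a File: new Scanner("sample.txt") would scan the literal string
java.util.Scanner sc = new java.util.Scanner(new java.io.File("sample.txt"));
while (sc.hasNextLine()) {
buf.append(sc.nextLine());
buf.append("<br />");
}
sc.close();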
I don't see where you get your long spaces from.
You can read directly into the buffer and then create a String from the buffer:
File f = new File(filePath);
FileInputStream fin = new FileInputStream(f);
byte[] buffer = new byte[(int) f.length()];
new DataInputStream(fin).readFully(buffer);
fin.close();
String s = new String(buffer, "UTF-8");
You could add this code:
return new String(buffer).replaceAll("(\r\n|\r|\n|\n\r)", "<br>");
Is this what you are looking for?
The code will read the file contents as they appear in the file - including line breaks.
If you want to change the breaks into something else, e.g. for display in HTML, you will need to either post-process the string or read the file line by line. Since you do not want the latter, you can replace your return statement with the following, which does the conversion (covering \r\n, lone \r and lone \n):
return (new String(buffer)).replaceAll("\r\n?|\n", "<br>");
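Putting the pieces together, a minimal sketch of "read the whole file, then convert the line breaks" could look like this. It assumes Java 7+ and a UTF-8 file; HtmlBreaks and its method names are hypothetical, chosen for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class HtmlBreaks {
    // Replaces Windows (\r\n), old Mac (\r) and Unix (\n) line breaks with <br>.
    // \r\n is listed first so it becomes one <br>, not two.
    static String withHtmlBreaks(String text) {
        return text.replaceAll("\r\n|\r|\n", "<br>");
    }

    // Reads the whole file into one String (assumed UTF-8) and converts the breaks.
    static String readWithHtmlBreaks(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        return withHtmlBreaks(text);
    }
}
```

Passing the charset explicitly avoids depending on the platform default, which is what new String(buffer) uses.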
StringBuilder sb = new StringBuilder();
try {
InputStream is = getAssets().open("myfile.txt");
byte[] bytes = new byte[1024];
int numRead = 0;
try {
while((numRead = is.read(bytes)) != -1)
sb.append(new String(bytes, 0, numRead));
}
catch(IOException e) {
}
is.close();
}
catch(IOException e) {
}
your resulting String: String result = sb.toString();
then replace whatever you want in this result.
I agree with the general approach by @Sanket Patel, but using Commons IO you would likely want FileUtils.
So your code would look like:
String myString = FileUtils.readFileToString(new File(filePath));
There is also another version to specify an alternate character encoding.
You should try org.apache.commons.io.IOUtils.toString(InputStream is) to get the file content as a String. You can pass it the InputStream object that you get from
getAssets().open("xml2json.txt") // Android API; returns an InputStream
in your Activity. To get the String, use this:
String xml = IOUtils.toString((getAssets().open("xml2json.txt")));
So, in general:
String xml = IOUtils.toString(inputStream); // pass your InputStream object here

Capture data read from file into string stream Java

I'm coming from a C++ background, so be kind on my n00bish queries...
I'd like to read data from an input file and store it in a stringstream. I can accomplish this in an easy way in C++ using stringstreams. I'm a bit lost trying to do the same in Java.
Following is a crude code/way I've developed where I'm storing the data read line-by-line in a string array. I need to use a string stream to capture my data into (rather than use a string array).. Any help?
char dataCharArray[] = new char[2];
int marker=0;
String inputLine;
String temp_to_write_data[] = new String[100];
// Now, read from output_x into stringstream
FileInputStream fstream = new FileInputStream("output_" + dataCharArray[0]);
// Convert our input stream to a BufferedReader
BufferedReader in = new BufferedReader (new InputStreamReader(fstream));
// Continue to read lines while there are still some left to read
while ((inputLine = in.readLine()) != null )
{
// Print file line to screen
// System.out.println (inputLine);
temp_to_write_data[marker] = inputLine;
marker++;
}
EDIT:
I think what I really wanted was a StringBuffer.
I need to read data from a file (into a StringBuffer, probably) and write/transfer all the data back to another file.
In Java, first preference should always be given to buying code from the library houses:
http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html
http://commons.apache.org/io/api-1.4/org/apache/commons/io/FileUtils.html
In short, what you need is this:
FileUtils.readFileToString(File file)
StringBuffer is one answer, but if you're just writing it to another file, then you can just open an OutputStream and write it directly out to the other file. Holding a whole file in memory is probably not a good idea.
If you simply want to read a file and write another one:
BufferedInputStream in = new BufferedInputStream( new FileInputStream( "in.txt" ) );
BufferedOutputStream out = new BufferedOutputStream( new FileOutputStream( "out.txt" ) );
int b;
while ( (b = in.read()) != -1 ) {
out.write( b );
}
If you want to read a file into a string:
StringWriter out = new StringWriter();
BufferedReader in = new BufferedReader( new FileReader( "in.txt" ) );
int c;
while ( (c = in.read()) != -1 ) {
out.write( c );
}
StringBuffer buf = out.getBuffer();
This can be made more efficient if you read using byte arrays. But I recommend that you use the excellent apache common-io. IOUtils (http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html) will do the loop for you.
Also, you should remember to close the streams.
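As a sketch of both points (reading with a byte array and closing the streams), something like the following should work on Java 7+. FileCopy and the 8192-byte buffer size are choices made for this example, not requirements.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class FileCopy {
    // Copies 'from' to 'to' in chunks and returns the number of bytes copied.
    static long copy(String from, String to) throws IOException {
        long total = 0;
        // try-with-resources closes (and thereby flushes) both streams,
        // even if an exception is thrown mid-copy.
        try (InputStream in = new FileInputStream(from);
             OutputStream out = new FileOutputStream(to)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n); // write only the bytes actually read
                total += n;
            }
        }
        return total;
    }
}
```

Because each read and write already moves a whole chunk, the Buffered* wrappers are not needed here.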
I also come from C++, and I was looking for a class similar to the C++ 'StringStreamReader', but I couldn't find it. In my case (which I think was very simple), I was trying to read a file line by line and then read a String and an Integer from each of these lines. My final solution was to use two objects of the class java.util.Scanner, so that I could use one of them to read the lines of the file directly to a String and use the second one to re-read the content of each line (now in the String) to the variables (a new String and a positive 'int'). Here's my code:
try {
//"path" is a String containing the path of the file we want to read
Scanner sc = new Scanner(new BufferedReader(new FileReader(new File(path))));
while (sc.hasNextLine()) { //while the file isn't over
Scanner scLine = new Scanner(sc.nextLine());
//sc.nextLine() returns the next line of the file into a String
//scLine will now proceed to scan (i.e. analyze) the content of the string
//and identify the string and the positive 'int' (what in C++ would be an 'unsigned int')
String s = scLine.next(); //this returns the string wanted
int x;
if (!scLine.hasNextInt() || (x = scLine.nextInt()) < 0) return false;
//scLine.hasNextInt() analyzes if the following pattern can be interpreted as an int
//scLine.nextInt() reads the int, and then we check if it is positive or not
//AT THIS POINT, WE ALREADY HAVE THE VARIABLES WANTED AND WE CAN DO
//WHATEVER WE WANT WITH THEM
//in my case, I put them into a HashMap called 'hm'
hm.put(s, x);
}
sc.close();
//we finally close the scanner to point out that we won't need it again 'till the next time
} catch (Exception e) {
return false;
}
return true;
Hope that helped.

Prepend lines to file in Java

Is there a way to prepend a line to the File in Java, without creating a temporary file, and writing the needed content to it?
No, there is no way to do that SAFELY in Java. (Or AFAIK, any other programming language.)
No filesystem implementation in any mainstream operating system supports this kind of thing, and you won't find this feature supported in any mainstream programming languages.
Real world file systems are implemented on devices that store data as fixed sized "blocks". It is not possible to implement a file system model where you can insert bytes into the middle of a file without significantly slowing down file I/O, wasting disk space or both.
The solutions that involve an in-place rewrite of the file are inherently unsafe. If your application is killed or the power dies in the middle of the prepend / rewrite process, you are likely to lose data. I would NOT recommend using that approach in practice.
Use a temporary file and renaming. It is safer.
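A minimal sketch of the temporary-file-and-rename approach, assuming Java 7+ (java.nio.file) and a UTF-8 file. SafePrepend and prependLine are hypothetical names for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration only.
final class SafePrepend {
    static void prependLine(Path file, String line) throws IOException {
        // Create the temp file next to the original so the final move
        // stays on the same file system.
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "prepend", ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                out.write((line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8));
                Files.copy(file, out); // stream the original content after the new line
            }
            // The original stays untouched until this single replace step.
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }
}
```

If the process dies before the move, the original file is still intact and only a stray temp file is left behind. Two caveats: the move is not guaranteed to be atomic on every platform, and the replaced file takes on the attributes of the temp file (for example its permissions), which may differ from the original's.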
There is a way, it involves rewriting the whole file though (but no temporary file). As others mentioned, no file system supports prepending content to a file. Here is some sample code that uses a RandomAccessFile to write and read content while keeping some content buffered in memory:
public static void main(final String args[]) throws Exception {
File f = File.createTempFile(Main.class.getName(), "tmp");
f.deleteOnExit();
System.out.println(f.getPath());
// put some dummy content into our file
BufferedWriter w = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(f)));
for (int i = 0; i < 1000; i++) {
w.write(UUID.randomUUID().toString());
w.write('\n');
}
w.flush();
w.close();
// prepend "some uuids" to our file
int bufLength = 4096;
byte[] appendBuf = "some uuids\n".getBytes();
byte[] writeBuf = appendBuf;
byte[] readBuf = new byte[bufLength];
int writeBytes = writeBuf.length;
RandomAccessFile rw = new RandomAccessFile(f, "rw");
int read = 0;
int write = 0;
while (true) {
// seek to read position and read content into read buffer
rw.seek(read);
int bytesRead = rw.read(readBuf, 0, readBuf.length);
// seek to write position and write content from write buffer
rw.seek(write);
rw.write(writeBuf, 0, writeBytes);
// no bytes read - end of file reached
if (bytesRead < 0) {
// end of
break;
}
// update seek positions for write and read
read += bytesRead;
write += writeBytes;
writeBytes = bytesRead;
// reuse buffer, create new one to replace (short) append buf
byte[] nextWrite = writeBuf == appendBuf ? new byte[bufLength] : writeBuf;
writeBuf = readBuf;
readBuf = nextWrite;
};
rw.close();
// now show the content of our file
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(f)));
String line;
while ((line = reader.readLine()) != null) {
System.out.println(line);
}
}
You could store the file content in a String and prepend the desired line by using a StringBuilder-Object. You just have to put the desired line first and then append the file-content-String.
No extra temporary file needed.
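A minimal sketch of that in-memory approach, assuming the file is small enough to hold in a String and is UTF-8 encoded. InMemoryPrepend is a hypothetical name used only here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class InMemoryPrepend {
    static void prependLine(Path file, String line) throws IOException {
        // Load the whole file into memory (assumed UTF-8).
        String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        // Put the new line first, then append the old content.
        StringBuilder sb = new StringBuilder(line.length() + 1 + content.length());
        sb.append(line).append('\n').append(content);
        // Overwrites the original file in place.
        Files.write(file, sb.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

This avoids the temp file, but it overwrites the original directly, so a crash during the final write can still lose data.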
No. There are no "intra-file shift" operations, only read and write of discrete sizes.
It would be possible to do so by reading a chunk of the file of equal length to what you want to prepend, writing the new content in place of it, reading the next chunk and replacing it with what you read before, and so on, rippling down to the end of the file.
However, don't do that, because if anything stops (out-of-memory, power outage, rogue thread calling System.exit) in the middle of that process, data will be lost. Use the temporary file instead.
private static void addPreAppnedText(File fileName) {
FileOutputStream fileOutputStream =null;
BufferedReader br = null;
FileReader fr = null;
String newFileName = fileName.getAbsolutePath() + "#";
try {
fileOutputStream = new FileOutputStream(newFileName);
fileOutputStream.write("preappendTextDataHere".getBytes());
fr = new FileReader(fileName);
br = new BufferedReader(fr);
String sCurrentLine;
while ((sCurrentLine = br.readLine()) != null) {
fileOutputStream.write(("\n"+sCurrentLine).getBytes());
}
fileOutputStream.flush();
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (fileOutputStream != null)
fileOutputStream.close();
if (br != null)
br.close();
if (fr != null)
fr.close();
// replace the original with the new file; renameTo returns false on failure
new File(newFileName).renameTo(fileName);
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
