I would like to know the specific difference between BufferedReader and FileReader.
I do know that BufferedReader is much more efficient as opposed to FileReader, but can someone please explain why (specifically and in detail)? Thanks.
First, You should understand "streaming" in Java because all "Readers" in Java are built upon this concept.
File Streaming
File streaming is carried out by the FileInputStream object in Java.
// it reads a byte at a time and stores into the 'byt' variable
int byt;
while((byt = fileInputStream.read()) != -1) {
fileOutputStream.write(byt);
}
This object reads a byte(8-bits) at a time and writes it to the given file.
A practical useful application of it would be to work with raw binary/data files, such as images or audio files (use AudioInputStream instead of FileInputStream for audio files).
On the other hand, it is very inconvenient and slower for text files, because of looping through a byte at a time, then do some processing and store the processed byte back is tedious and time-consuming.
You also need to provide the character set of the text file, i.e if the characters are in Latin or Chinese, etc. Otherwise, the program would decode and encode 8-bits at a time and you'd see weird chars printed on the screen or written in the output file (if a char is more than 1 byte long, i.e. non-ASCII characters).
File Reading
This is just a fancy way of saying "File streaming" with inclusive charset support (i.e no need to define the charset, like earlier).
The FileReader class is specifically designed to deal with the text files.
As you've seen earlier, the file streaming is best to deal with raw binary data, but for the sake of text, it is not so efficient.
So the Java-dudes added the FileReader class, to deal specifically with the text files. It reads 2 bytes (or 4 bytes, depends on the charset) at a time. A remarkably huge improvement over the preceding FileInputStream!!
so the streaming operation is like this,
int c;
while ( (c = fileReader.read()) != -1) { // some logic }
Please note, Both classes use an integer variable to store the value retrieved from the input file (so every char is converted into an integer while fetching and back to the char while storing).
The only advantage here is that this class deals only with text files, so you don't have to specify the charset and a few other properties. It provides an out-of-the-box solution, for most of the text files processing cases. It also supports internationalization and localization.
But again it's still very slow (Imaging reading 2 bytes at a time and looping through it!).
Buffering streams
To tackle the problem of continuous looping over a byte or 2. The Java-dudes added another spectacular functionality. "To create a buffer of data, before processing."
The concept is pretty much alike when a user streams a video on YouTube. A video is buffered before playing, to provide flawless video watching experience. (Tho, the browser keeps buffering until the whole video is buffered ahead of time.) The same technique is used by the BufferedReader class.
A BufferedReader object takes a FileReader object as an input which contains all the necessary information about the text file that needs to be read. (such as the file path and charset.)
BufferedReader br = new BufferedReader( new FileReader("example.txt") );
When the "read" instruction is given to the BufferedReader object, it uses the FileReader object to read the data from the file. When an instruction is given, the FileReader object reads 2 (or 4) bytes at a time and returns the data to the BufferedReader and the reader keeps doing that until it hits '\n' or '\r\n' (The end of the line symbol).
Once a line is buffered, the reader waits patiently, until the instruction to buffer the next line is given.
Meanwhile, The BufferReader object creates a special memory place (On the RAM), called "Buffer", and stores all the fetched data from the FileReader object.
// this variable points to the buffered line
String line;
// Keep buffering the lines and print it.
while ((line = br.readLine()) != null) {
printWriter.println(line);
}
Now here, instead of reading 2 bytes at a time, a whole line is fetched and stored in the RAM somewhere, and when you are done with processing the data, you can store the whole line back to the hard disk. So it makes the process run way faster than doing 2 bytes a time.
But again, why do we need to pass FileReader object to the BufferReader? Can't we just say "buffer this file" and the BufferReader would take care of the rest? wouldn't that be sweet?
Well, the BufferReader class is created in a way that it only knows how to create a buffer and to store incoming data. It is irrelevant to the object from where the data is coming. So the same object can be used for many other input streams than just text files.
So being said that, When you provide the FileReader object as an input, it buffers the file, the same way if you provide the InputStreamReader as an object, it buffers the Terminal/Console input data until it hits a newline symbol. such as,
// Object that reads console inputs
InputStreamReader console = new InputStreamReader(System.in);
BufferedReader br = new BufferedReader(console);
System.out.println(br.readLine());
This way, you can read (or buffer) multiple streams with the same BufferReader class, such as text files, consoles, printers, networking data etc, and all you have to remember is,
bufferedReader.readLine();
to print whatever you've buffered.
In simple manner:
A FileReader class is a general tool to read in characters from a File. The BufferedReader class can wrap around Readers, like FileReader, to buffer the input and improve efficiency. So you wouldn't use one over the other, but both at the same time by passing the FileReader object to the BufferedReader constructor.
Very Detail
FileReader is used for input of character data from a disk file. The input file can be an ordinary ASCII, one byte per character text file. A Reader stream automatically translates the characters from the disk file format into the internal char format. The characters in the input file might be from other alphabets supported by the UTF format, in which case there will be up to three bytes per character. In this case, also, characters from the file are translated into char format.
As with output, it is good practice to use a buffer to improve efficiency. Use BufferedReader for this. This is the same class we've been using for keyboard input. These lines should look familiar:
BufferedReader stdin =
new BufferedReader(new InputStreamReader( System.in ));
These lines create a BufferedReader, but connect it to an input stream from the keyboard, not to a file.
Source: http://www.oopweb.com/Java/Documents/JavaNotes/Volume/chap84/ch84_3.html
BufferedReader requires a Reader, of which FileReader is one - it descends from InputStreamReader, which descends from Reader.
FileReader - read character files
BufferedReader - "Read text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines."
http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html
http://docs.oracle.com/javase/7/docs/api/java/io/FileReader.html
Actually BufferedReader makes use of Readers like FileReader.
FileReader class helps in writing on file but its efficency is low since it has yo retrive one character at a time from file but BufferedReader takes chunks of data and store it in buffer so instead of retriving one character at atime from file retrival becomes easy using buffer.
Bufferedreader - method that you can use actually as a substitute for Scanner method, gets file, gets input.
FileReader - as the name suggests.
Related
I would like to know the specific difference between BufferedReader and FileReader.
I do know that BufferedReader is much more efficient as opposed to FileReader, but can someone please explain why (specifically and in detail)? Thanks.
First, You should understand "streaming" in Java because all "Readers" in Java are built upon this concept.
File Streaming
File streaming is carried out by the FileInputStream object in Java.
// it reads a byte at a time and stores into the 'byt' variable
int byt;
while((byt = fileInputStream.read()) != -1) {
fileOutputStream.write(byt);
}
This object reads a byte(8-bits) at a time and writes it to the given file.
A practical useful application of it would be to work with raw binary/data files, such as images or audio files (use AudioInputStream instead of FileInputStream for audio files).
On the other hand, it is very inconvenient and slower for text files, because of looping through a byte at a time, then do some processing and store the processed byte back is tedious and time-consuming.
You also need to provide the character set of the text file, i.e if the characters are in Latin or Chinese, etc. Otherwise, the program would decode and encode 8-bits at a time and you'd see weird chars printed on the screen or written in the output file (if a char is more than 1 byte long, i.e. non-ASCII characters).
File Reading
This is just a fancy way of saying "File streaming" with inclusive charset support (i.e no need to define the charset, like earlier).
The FileReader class is specifically designed to deal with the text files.
As you've seen earlier, the file streaming is best to deal with raw binary data, but for the sake of text, it is not so efficient.
So the Java-dudes added the FileReader class, to deal specifically with the text files. It reads 2 bytes (or 4 bytes, depends on the charset) at a time. A remarkably huge improvement over the preceding FileInputStream!!
so the streaming operation is like this,
int c;
while ( (c = fileReader.read()) != -1) { // some logic }
Please note, Both classes use an integer variable to store the value retrieved from the input file (so every char is converted into an integer while fetching and back to the char while storing).
The only advantage here is that this class deals only with text files, so you don't have to specify the charset and a few other properties. It provides an out-of-the-box solution, for most of the text files processing cases. It also supports internationalization and localization.
But again it's still very slow (Imaging reading 2 bytes at a time and looping through it!).
Buffering streams
To tackle the problem of continuous looping over a byte or 2. The Java-dudes added another spectacular functionality. "To create a buffer of data, before processing."
The concept is pretty much alike when a user streams a video on YouTube. A video is buffered before playing, to provide flawless video watching experience. (Tho, the browser keeps buffering until the whole video is buffered ahead of time.) The same technique is used by the BufferedReader class.
A BufferedReader object takes a FileReader object as an input which contains all the necessary information about the text file that needs to be read. (such as the file path and charset.)
BufferedReader br = new BufferedReader( new FileReader("example.txt") );
When the "read" instruction is given to the BufferedReader object, it uses the FileReader object to read the data from the file. When an instruction is given, the FileReader object reads 2 (or 4) bytes at a time and returns the data to the BufferedReader and the reader keeps doing that until it hits '\n' or '\r\n' (The end of the line symbol).
Once a line is buffered, the reader waits patiently, until the instruction to buffer the next line is given.
Meanwhile, The BufferReader object creates a special memory place (On the RAM), called "Buffer", and stores all the fetched data from the FileReader object.
// this variable points to the buffered line
String line;
// Keep buffering the lines and print it.
while ((line = br.readLine()) != null) {
printWriter.println(line);
}
Now here, instead of reading 2 bytes at a time, a whole line is fetched and stored in the RAM somewhere, and when you are done with processing the data, you can store the whole line back to the hard disk. So it makes the process run way faster than doing 2 bytes a time.
But again, why do we need to pass FileReader object to the BufferReader? Can't we just say "buffer this file" and the BufferReader would take care of the rest? wouldn't that be sweet?
Well, the BufferReader class is created in a way that it only knows how to create a buffer and to store incoming data. It is irrelevant to the object from where the data is coming. So the same object can be used for many other input streams than just text files.
So being said that, When you provide the FileReader object as an input, it buffers the file, the same way if you provide the InputStreamReader as an object, it buffers the Terminal/Console input data until it hits a newline symbol. such as,
// Object that reads console inputs
InputStreamReader console = new InputStreamReader(System.in);
BufferedReader br = new BufferedReader(console);
System.out.println(br.readLine());
This way, you can read (or buffer) multiple streams with the same BufferReader class, such as text files, consoles, printers, networking data etc, and all you have to remember is,
bufferedReader.readLine();
to print whatever you've buffered.
In simple manner:
A FileReader class is a general tool to read in characters from a File. The BufferedReader class can wrap around Readers, like FileReader, to buffer the input and improve efficiency. So you wouldn't use one over the other, but both at the same time by passing the FileReader object to the BufferedReader constructor.
Very Detail
FileReader is used for input of character data from a disk file. The input file can be an ordinary ASCII, one byte per character text file. A Reader stream automatically translates the characters from the disk file format into the internal char format. The characters in the input file might be from other alphabets supported by the UTF format, in which case there will be up to three bytes per character. In this case, also, characters from the file are translated into char format.
As with output, it is good practice to use a buffer to improve efficiency. Use BufferedReader for this. This is the same class we've been using for keyboard input. These lines should look familiar:
BufferedReader stdin =
new BufferedReader(new InputStreamReader( System.in ));
These lines create a BufferedReader, but connect it to an input stream from the keyboard, not to a file.
Source: http://www.oopweb.com/Java/Documents/JavaNotes/Volume/chap84/ch84_3.html
BufferedReader requires a Reader, of which FileReader is one - it descends from InputStreamReader, which descends from Reader.
FileReader - read character files
BufferedReader - "Read text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines."
http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html
http://docs.oracle.com/javase/7/docs/api/java/io/FileReader.html
Actually BufferedReader makes use of Readers like FileReader.
FileReader class helps in writing on file but its efficency is low since it has yo retrive one character at a time from file but BufferedReader takes chunks of data and store it in buffer so instead of retriving one character at atime from file retrival becomes easy using buffer.
Bufferedreader - method that you can use actually as a substitute for Scanner method, gets file, gets input.
FileReader - as the name suggests.
I cannot really grasp what the purpose is for the FileReader and BufferedReader classes in Java.
At docs.oracle one is recommended to wrap a buffered reader around a FileReader object because it's not efficient to use the FileReader directly. Where does the cost or overhead come from?
Let say I have a text file that I want to read into my java program using these classes:
I use FileReader and BufferedReader
FileReader fileReader = new FileReader(new File("text.txt)"); // probably correct???
BufferedReader bufferedReader = new BufferedReader(fileReader);
1) What is the task of the FileReader object here? Is it responsible to make an I/O-request via the OS to the file, and thereafter read bytes? What is the cost with this?
Is it true that the FileReader makes several I/O-requests? Or is the cost when the FileReader object has to converting bytes to characters, character by character?
2) The task of the BufferedReader-object - to refer to the last sentence above. - Is the role for the BufferedReader-object to simply buffer arrays of incoming bytes and THEN convert those to character?
Very grateful for answers
Edit: first of all thanks for incoming answers. But I should have mentioned that its exactly this documentation I have studied. Call me stupid or something - but what is meant by "each read request". WHEN is each read request made? how often?
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders. For example,
This is mostly why a launched this question - It sounds that the FileReader causes a lot of I/O-request which slows erverthing down.
From oracle docs :
In general, each read request made of a Reader causes a corresponding
read request to be made of the underlying character or byte stream. It
is therefore advisable to wrap a BufferedReader around any Reader
whose read() operations may be costly, such as FileReaders and
InputStreamReaders. For example,
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
will buffer the input from the specified file. Without buffering, each
invocation of read() or readLine() could cause bytes to be read from
the file, converted into characters, and then returned, which can be
very inefficient.
So, as the document clearly suggests, Wrapping a BufferedReader around a FileReader prevents reading of data from the file over and over again. BufferedReader buffers the input.
Java has many I/O classes which may be combined. So you might see something like:
BufferedReader in = new BufferedReader(new InputStreamReader(
new FileInputStream(new File("..."), "UTF-8"));
...
in.close(); // Closes all.
This allows flexible combination. So an XML parser could use Reader, and not care where the text comes from: file, URL, memory. This as opposed to "simpler" languages, where there is no choice of implementation (Map with implementations HashMap, TreeMap).
Now, FileReader and FileWriter are old utility classes to read from and write to a file in the default operating system encoding. So for local files. Not portable (!) to other operating systems, or requiring a fixed encoding.
For this FileReader extends InputStreamReader which reads binary data (an InputStream) using a FileInputStream. It does just that.
However it makes sense to use a larger memory buffer to read, hence the advice; which also is long-standing. Remember performance was an issue in the early times of java.
The only advantage of FileReader-standalone would be for tight memory situations, maybe on a smart phone.
Sorry if this question is a dulplicate but I didn't get an answer I was looking for.
Java docs says this
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders >and InputStreamReaders. For example,
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
will buffer the input from the specified file. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.
My first question is If bufferedReader can read bytes then why can't we work on images which are in bytes using bufferedreader.
My second question is Does Bufferedreader store characters in BUFFER and what is the meaning of this line
will buffer the input from the specified file.
My third question is what is the meaning of this line
In general, each read request made of a Reader causes a corresponding read request to be >made of the underlying character or byte stream.
There are two questions here.
1. Buffering
Imagine you lived a mile from your nearest water source, and you drink a cup of water every hour. Well, you wouldn't walk all the way to the water for every cup. Go once a day, and come home with a bucket full of enough water to fill the cup 24 times.
The bucket is a buffer.
Imagine your village is supplied water by a river. But sometimes the river runs dry for a month; other times the river brings so much water that the village floods. So you build a dam, and behind the dam there is a reservoir. The reservoir fills up in the rainy season and gradually empties in the dry season. The village gets a steady flow of water all year round.
The reservoir is a buffer.
Data streams in computing are similar to both those scenarios. For example, you can get several kilobytes of data from a filesystem in a single OS system call, but if you want to process one character at a time, you need something similar to a reservoir.
A BufferedReader contains within it another Reader (for example a FileReader), which is the river -- and an array of bytes, which is the reservoir. Every time you read from it, it does something like:
if there are not enough bytes in the "reservoir" to fulfil this request
top up the "reservoir" by reading from the underlying Reader
endif
return some bytes from the "reservoir".
However when you use a BufferedReader, you don't need to know how it works, only that it works.
2. Suitability for images
It's important to understand that BufferedReader and FileReader are examples of Readers. You might not have covered polymorphism in your programming education yet, so when you do, remember this. It means that if you have code which uses FileReader -- but only the aspects of it that conform to Reader -- then you can substitute a BufferedReader and it will work just the same.
It's a good habit to declare variables as the most general class that works:
Reader reader = new FileReader(file);
... because then this would be the only change you need to add buffering:
Reader reader = new BufferedReader(new FileReader(file));
I took that detour because it's all Readers that are less suitable for images.
Reader has two read methods:
int read(); // returns one character, cast to an int
int read(char[] block); // reads into block, returns how many chars it read
The second form is unsuitable for images because it definitely reads chars, not ints.
The first form looks as if it might be OK -- after all, it reads ints. And indeed, if you just use a FileReader, it might well work.
However, think about how a BufferedReader wrapped around a FileReader will work. The first time you call BufferedReader.read(), it will call FileReader.read(buffer) to fill its buffer. Then it will cast the first char of the buffer back to an int, and return that.
Especially when you bring multi-byte charsets into the picture, that can cause problems.
So if you want to read integers, use InputStream not Reader. InputStream has int read(byte[] buf, int offset, int length) -- bytes are much more reliably cast back and forth from int than chars.
Readers (and Writers) in java are specialized classes for dealing with text (character) streams - the concept of a line is meaningless in any other type of stream.
for the general IO equivalent, have a look at BufferedInputStream
so, to answer your questions:
while the reader does eventually read bytes, it converts them to characters. it is not intended to read anything else (like images) - use the InputStream family of classes for that
a buffered reader will read large blocks of data from the underlying stream (which may be a file, socket, or anything else) into a buffer in memory and will then serve read requests from this buffer until the buffer is emptied. this behaviour of reading large chunks instead of smaller chucks every time improves performance.
it means that if you dont wrap a reader in a buffered reader then every time you want to read a single character, it will access the disk.network to get just the single character you want. doing I/O in such small chunks is usually terrible for performance.
Default behaviour is it will convert to character, but when you have an image you cannot have a character data, instead you need pixel of bytes data. So you cannot use it.
It is buffereing, means , it is reading a certain chunk of data in an char array. You can see this behaviour in the code:
public BufferedReader(Reader in) {
this(in, defaultCharBufferSize);
}
and the defaultCharBufferSize is as mentioned below:
private static int defaultCharBufferSize = 8192;
3 Every time you do read operation, it will be reading only one character.
So in a nutshell, buffred means, it will read few chunk of character data first that will keep in a char array and that will be processed and again it will read same chunk of data until it reaches end of stream
You can refer the following to get to know more
BufferedReader
I am getting a FilterInputStream object as a return type from a function. Now the file which I will be getting as an stream is a log file. So I think it can be big file. So I do not want to read the data all at once. But reading data in a loop is kind of tedious job.
I need splitting at every newline, meaning data in file is in line separated format. With a constant size byte array to be used in public int read(byte[], int off, int len) as it will give rise to many cases. I do not want to read it at once because it can be of large size.
Is there an elegant way to do this.
P.S.: I am in particularly referring to S3ObjectInputStream extended from FilterInputStream which has read() function.
Wrap a BufferedReader around an InputStreamReader around the FilterInputStream and call readLine().
OK, I am sorry, the next floor is right, you can use BufferedReader class, it has a method called readLine,and it returns a String object instead of a byte array. Just like this
BufferedReader reader = new BufferedReader(new FileReader(new File("the file path")));
String date = reader.readLine();
if(!StringUtil.isBlank(date)){
//reade the file line by line
date = reader.readLine();
}
reader.close();
I want to read each line from a text file and store them in an ArrayList (each line being one entry in the ArrayList).
So far I understand that a BufferedInputStream writes to the buffer and only does another read once the buffer is empty which minimises or at least reduces the amount of operating system operations.
Am I correct - do I make sense?
If the above is the case in what situations would anyone want to use DataInputStream. And finally which of the two should I be using and why - or does it not matter.
Use a normal InputStream (e.g. FileInputStream) wrapped in an InputStreamReader and then wrapped in a BufferedReader - then call readLine on the BufferedReader.
DataInputStream is good for reading primitives, length-prefixed strings etc.
The two classes are not mutually exclusive - you can use both of them if your needs suit.
As you picked up, BufferedInputStream is about reading in blocks of data rather than a single byte at a time. It also provides the convenience method of readLine(). However, it's also used for peeking at data further in the stream then rolling back to a previous part of the stream if required (see the mark() and reset() methods).
DataInputStream/DataOutputStream provides convenience methods for reading/writing certain data types. For example, it has a method to write/read a UTF String. If you were to do this yourself, you'd have to decide on how to determine the end of the String (i.e. with a terminator byte or by specifying the length of the string).
This is different from BufferedInputStream's readLine() which, as the method sounds like, only returns a single line. writeUTF()/readUTF() deal with Strings - that string can have as many lines it it as it wants.
BufferedInputStream is suitable for most text processing purposes. If you're doing something special like trying to serialize the fields of a class to a file, you'd want to use DataInput/OutputStream as it offers greater control of the data at a binary level.
Hope that helps.
You can always use both:
final InputStream inputStream = ...;
final BufferedInputStream bufferedInputStream =
new BufferedInputStream(inputStream);
final DataInputStream dataInputStream =
new DataInputStream(bufferedInputStream);
InputStream: Base class to read byte from stream (network or file ), provide ability to read byte from the stream and delete the end of the stream.
DataInputStream: To read data directly as a primitive datatype.
BufferInputStream: Read data from the input stream and use buffer to optimize the speed to access the data.
You shoud use DataInputStream in cases when you need to interpret the primitive types in a file written by a language other Java in platform-independent manner.
I would advocate using Jakarta Commons IO and the readlines() method (of whatever variety).
It'll look after buffering/closing etc. and give you back a list of text lines. I'll happily roll my own input stream wrapping with buffering etc., but nine times out of ten the Commons IO stuff works fine and is sufficient/more concise/less error prone etc.
The differences are:
The DataInputStream works with the binary data, while the BufferedReader work with character data.
All primitive data types can be handled by using the corresponding methods in DataInputStream class, while only string data can be read from BufferedReader class and they need to be parsed into the respective primitives.
DataInputStream is a part of filtered streams, while BufferedReader is not.
DataInputStream consumes less amount of memory space being it is a binary stream, whereas BufferedReader consumes more memory space being it is character stream.
The data to be handled is limited in DataInputStream, whereas the number of characters to be handled has wide scope in BufferedReader.